When Bad Architectures Happen to Good People: Migrating Legacy Systems to Serverless (Trek10) | Datadog


Published: July 17, 2019

Well, good afternoon. It is great to be here in New York.

I spend a lot of time helping companies migrate applications to the cloud. It’s what I do, particularly with what we sometimes refer to as cloud-native or serverless architectures.

What people say (versus what they mean) when it comes to serverless migrations

I’ve done this so frequently that I’ve started to hear some of the same concerns over and over again. Because of that, and because I’m an engineer, I’ve developed what I call the Serverless Migration Translator.

The way it works is you put in at one end what somebody says when they’re looking at a serverless migration, and, of course, what comes out the other end is what they mean.

Would you like to see it in action?

Together: Yes.

Forrest: All right, good, because I’m going to definitely show it to you.

Alright, so I hear this a lot, “We need to be cloud-agnostic.”

This is something that people say a lot. Turns out, you put that through the Serverless Migration Translator, what it sometimes means is, “Kubernetes will look great on our resumes.”

Right?

“We’re not too concerned about cost right now.” I hear that a lot at the beginning of a project where you’re just focused on getting things into the cloud by any means necessary.

That usually translates to, “We’re about to be extremely concerned about costs.”

“The DevOps team will own that.”

What do you think that translates to?

“Nobody will own that.”

“That’s impossible.”

Hear this a lot, especially with serverless. Especially when you’re looking at existing architectures, you’re trying to figure out how to refactor.

That usually just means, “I don’t know how to do that,” right?

Whereas “That’ll be easy,” on the other end, also translates to, “I don’t know how to do that.”

And then finally, last but not least,

“Let’s focus on quick wins.”

People say this a lot. It sounds great because you’re thinking, “How can I provide value for the business more quickly,” right?

But a lot of times, unfortunately, that means, “Maybe if we ignore the giant ball of mud, it’ll become somebody else’s problem.”

The secret to successful serverless migrations

I’m not bringing all these things up to like dunk on people who have trouble with serverless migrations because we’ve all been there, all right?

It’s not that people are incompetent or that they’re, you know, not wanting to have success. It’s just hard to get this right.

I’m going to let you in on a little secret, though.

The Serverless Migration Translator actually works on a very simple principle. Well, maybe not simple, but at least I can fit it on one slide.

I call it the law of serverless migrations.

Here’s how it goes:

Organizations tend to oversimplify what they should change.

So, that means, if you’re looking at re-architecting, refactoring something, you think you’re not going to need to do it. You’re definitely going to need to do that in the serverless world.

On the other hand, they tend to overcomplicate what they should control.

So when you’re looking, “Should I build something out myself, or should I build on a service?”

You probably are going to need more services than you think.

You oversimplify what you should change, you overcomplicate what you should control. That’s the way to fail at a serverless migration.

And if that sounds a little esoteric or hard to get your head around, what we’re going to do in this half-hour is look at some really practical things, some strategic and some tactical, that you can do, both from a technical and from a human perspective, to make that migration successful at your organization.

Why should organizations use serverless?

As we get into that though, I think we really quickly need to talk about what and why when it comes to serverless, right?

It’s a buzzword, I get it. It’s hypey.

You’ve probably heard it tossed around at your company. Everybody in this room probably has a little different definition in their head of what it means.

When I see this actually being used and being successful in the tech world, a lot of times it’s less of a technical discussion, it’s less of a technical word, and it’s more driven by business considerations.

In particular, there’s two things that I see companies thinking about when they’re making a move toward a serverless migration.

Number one, decreased time to market for new applications and features. So, I want to get code off my developers’ laptops more quickly, I want to move it into the cloud and be serving those features to users faster.

That’s the dream. I’m not saying this always happens, I’m just saying that’s why people choose to go this route.

And then the other thing is, decreased total cost of ownership or TCO.

You may have heard that referred to in the context of infrastructure spend. I’m spending X-amount of dollars on EC2 right now, I’m going to move Y-amount of things to Lambda functions, and then I’m going to have Z-amount of cost savings, all right?

Unfortunately, once you are building at scale, you’ll certainly find that you have less of a cost savings, right? Those managed services will take up more.

The hidden costs of serverless

But that’s only a small portion of the TCO calculus.

And, I try to visualize this with this little iceberg diagram here.

Fair warning, by the way: I draw a cartoon series for A Cloud Guru. Some of you’ve probably seen it, and there will definitely be a lot of cartoons in this presentation, so just prepare yourself now.

Serverless costs, you’ll see that iceberg on the left is so high above the waterline. There’s potentially more that you’re paying for those managed services upfront, but look at everything below the waterline for a traditional architecture.

The maintenance and security as you’re building out stuff that perhaps doesn’t directly impact your business. The slower innovation because you’re taking more engineering hours away from that to focus on the infrastructure. And then there’s the SS Enterprise, the little ship in the middle.

The challenges of legacy infrastructure: the LegacyCo case study

Either of those icebergs is more than big enough to sink that poor ship.

The question is, where do you want to be traveling towards?

And I think a lot of the people asking this question are folks who are looking at what I would call a greenfield architecture.

You’re trying to figure out serverless, you’re googling it. You’re seeing a lot of hello-world tutorials. You’re seeing use cases and examples from people who built their serverless startup in six weeks or something like that.

And that’s awesome. You can tell a great story that way, but the reality is, most problems aren’t greenfield, right?

They’re legacy. They have existing constraints. They have existing customers and clients that you can’t disrupt willy-nilly by choosing whatever fly-by-night technology happens to be exciting your Dev team at the moment.

So let me introduce you to LegacyCo. How can LegacyCo take advantage of serverless?

Obviously, LegacyCo is not a real company. The names have been changed to protect the innocent in this case, but this is based on a number of real organizations that I’ve been working with over the past 12 to 18 months here at Trek10.

And LegacyCo is a large company that’s got some very unique, but perhaps we could say common, problems for enterprises. Diverse systems and tools, decades old in some cases.

They’ve been building for a lot of time. Regulatory and customer constraints exist. They can’t just change things willy-nilly. They’ve got a widely disparate level of maturity across teams.

I think this is key to understand. There is no single level of technical maturity at a company the size of LegacyCo, which may be across the country, maybe across continents.

You’ve got some teams that are very forward-looking, you’ve got other teams that are definitely lagging behind.

And then, finally, of course, the software that LegacyCo creates is deployed in lots of different ways: mainframes, certainly some Legacy data centers, maybe even on-premise deployments where you’ve sold software to customers, they’re running it in whatever their own architecture is, and now you’re going to bring that back in, and you’re going to run it for them as a SaaS application.

Because of these unique considerations, LegacyCo has some unique technical challenges. New feature development, unfortunately, in some cases, has slowed or even stopped.

And that happens not because there aren’t lots of people working at LegacyCo who need things to do, it happens because, unfortunately, as these applications have grown over time and more features have been bolted on, nobody really understands how the system works anymore.

Especially at a low level, there is no one person who can hold the system in their heads.

There is perhaps not good testing, and so what happens is, over time, because nobody has an end-to-end view of the systems, there develops this mindset of, “Let’s just keep the lights on.”

You ever heard that phrase used? “Let’s just keep the lights on, right? We’re not going to make any big changes, we’re not going to rock the boat. We’re just going to hopefully keep things from deteriorating any further.”

LegacyCo’s soft challenges

As you can imagine, with technical challenges like that, there are some human challenges at LegacyCo.

It’s like, I always chuckle when people say their teams have no time to innovate, right?

Because on the one hand, they say they’re too busy firefighting, on the other hand, they are too busy keeping the lights on, and the way they’re keeping the lights on is by setting more fires.

So the human challenges at LegacyCo then.

It’s difficult, in some cases, for them to attract and retain talent unfortunately.

The folks that are perhaps earlier in their careers who have more choice over where they work, they may say, “Well, I don’t want to deal with the political infighting that’s taking place here.

“I don’t want to deal with these Legacy technologies that might not look as great on my resume. I’m going to head somewhere else.” So there’s lots of attrition as people that can leave do leave.

And so you ended up with this really frustrating catch-22 where, on the one hand, you have a decreasing utility of your systems, by which I mean, you can take a lot of time to, you know, build things that don’t really help you.

Meanwhile, your competitors are moving faster, more nimbly, and, at the same time, you’ve got a decrease in capacity to execute meaningful change. The people that can turn you around just aren’t there anymore or they’re hamstrung and not able to innovate.

And so finally, change itself begins to look overwhelming.

You ever been in this situation? You wouldn’t even know where to start trying to get out of this.

But LegacyCo understands, as any business does, that you can’t stay in place.

They’re looking at the TCO and time-to-market innovations that their competitors have, and they’re saying, “How can I take advantage of this? I know I can’t stay here forever.”

So, how does an organization like LegacyCo even begin to adopt serverless?

Event sourcing: the first step to legacy migrations

We’re going to look at some technical and some human things here, and I’m going to start with the technical challenges because, frankly, I think they’re easier to solve.

I’m going to talk you through a few architectures, a few strategies that we’ve used with some success here at Trek10 to start moving these Legacy systems to serverless.

One of them is turning Legacy databases into event sources.

Who here is familiar with the concept of event sourcing?

Lower your hand if you’re actually using event sourcing successfully.

Okay.

I probably said that backwards.

But anyway, the point is, it’s difficult to do, and this is not meant to be a talk on event sourcing. There’s lots of great articles and books out there that you can look into, I can point you towards.

What I’m mainly trying to get to here is the idea that an event sourcing app is basically: you’ve got the concept of events. They’re things that happen in the world.

You want to transform them into structured pieces of data, you want to stream them into the cloud, and then you want to build consumer applications that do things with them downstream. That’s the super high-level overview of what event sourcing tries to do.

Most legacy apps aren’t built that way. They’re monolithic, they handle data in a different way.

And so if you want to get stuff into the cloud, what we often recommend people do is try to build some kind of an agent on-premise that can stream changes to that legacy database to the cloud as events.

And I’m showing this here on this slide using something called CloudWatch Events, which is an AWS service. It’s just recently, in the last week or so, been kind of rebranded as Amazon EventBridge.

That just came out of the New York Summit here last week. But basically, it’s Amazon’s in-house event bus.
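As a rough illustration of the on-premise agent idea, here is a minimal sketch of how a change to a legacy table might be shaped into an entry for EventBridge's `PutEvents` call. The source string, bus name, and row shape are hypothetical, and `events_client` is assumed to be something like `boto3.client("events")`; how the agent detects changes in the legacy database is out of scope here.

```python
import json
from datetime import datetime, timezone

# Hypothetical names for illustration; none of these come from the talk.
EVENT_SOURCE = "legacyco.orders-db"
EVENT_BUS = "legacy-changes"


def row_change_to_entry(table, operation, row, changed_at):
    """Turn one change to a legacy table into a PutEvents entry."""
    return {
        "Source": EVENT_SOURCE,
        "DetailType": f"{table}.{operation}",  # e.g. "orders.UPDATE"
        "Detail": json.dumps({"table": table, "row": row}, default=str),
        "EventBusName": EVENT_BUS,
        "Time": changed_at,
    }


def publish(events_client, entries):
    """Send entries in batches of 10 (the PutEvents per-call limit).

    Returns the number of entries the service reported as failed.
    """
    failed = 0
    for start in range(0, len(entries), 10):
        batch = entries[start:start + 10]
        response = events_client.put_events(Entries=batch)
        failed += response.get("FailedEntryCount", 0)
    return failed
```

Passing the client in as a parameter keeps the shaping logic testable without AWS credentials; a real agent would also need retry handling for failed entries.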

And there’s other things you could use here, Kinesis, SNS, SQS, that’s just inside of AWS.

Point is, there are lots of ways to get events into the cloud. Once you have them there, you stream them, you consume them via something like Lambda.

And there, you can do all kinds of things.

You can do, you know, reporting, you can do data warehousing, you can build read-only APIs.
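To make the consumer side concrete, here is a small sketch of a Lambda handler that keeps a read-only view up to date from change events. It assumes events arrive from EventBridge with a `detail-type` like `orders.UPDATE` and a `detail` object holding `table` and `row`; that shape is an assumption for this example, and the in-memory dict stands in for whatever cloud-side store you would really use.

```python
# In-memory stand-in for a cloud-side read store (an assumption for this
# sketch; a real consumer would write to a table, bucket, or search index).
READ_MODEL = {}


def handler(event, context=None):
    """Lambda entry point: apply one row-change event to the read model."""
    detail = event["detail"]
    table, row = detail["table"], detail["row"]
    operation = event["detail-type"].split(".")[-1]
    key = (table, row["id"])

    if operation == "DELETE":
        READ_MODEL.pop(key, None)
    else:
        # INSERT and UPDATE are both treated as an upsert.
        READ_MODEL[key] = row

    return {"table": table, "id": row["id"], "operation": operation}
```

Because the legacy database stays the source of truth, this consumer never writes back; it only projects events into a shape that is cheap to read.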

What we see a lot of clients doing is building new systems off of these events that are not write-based. So their main source of truth is still in the Legacy data center, but what they’re doing is they’re getting their feet wet.

They’re taking some load off of that Legacy database.

They’re increasing, hopefully, the performance there, and they’re also gaining knowledge and expertise about what the cloud can do.

So that’s high-level, and that doesn’t get you all the way there, right? Because your Legacy database, as I said, it’s still the single source of truth.

Strangling Legacy APIs: the next step

Strangling Legacy APIs is the other piece of this that I think is probably more interesting.

Who is familiar with the strangler pattern?

A few people are. It goes back a number of years now.

This is the idea that you’ve got a Legacy system, you want to move people to a new version that’s going to require some significant re-architecture and refactoring, so it’s almost like a fork of your codebase or a fork of your architecture.

So what you do is you place a wrapper (or sometimes we call it a facade) in front of that V1 system, and then you place your V2 system behind that facade as well.

Over time, you route more and more of your customers to the V2 system and eventually, ideally, you unplug V1 altogether and nobody notices.
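One way to picture the facade's routing decision is a percentage rollout keyed on the customer. The sketch below is an assumption about how that could be done, not the design from the talk: it hashes a customer ID into a stable bucket so the same customer always lands on the same side, and raising `v2_percent` only ever moves customers from V1 to V2.

```python
import hashlib


def rollout_bucket(customer_id):
    """Map a customer ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_backend(customer_id, v2_percent, forced_v2=frozenset()):
    """Return "v2" for customers inside the rollout, otherwise "v1".

    forced_v2 is a hypothetical allow-list for customers who were told
    to start on the new API regardless of the rollout percentage.
    """
    if customer_id in forced_v2:
        return "v2"
    return "v2" if rollout_bucket(customer_id) < v2_percent else "v1"
```

At `v2_percent=100` every request goes to V2, which is the point where V1 can be unplugged.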

That’s always the way that happens, right? Just works smoothly that way.

The diagram I’m showing you on the screen now is sort of an implementation of that in AWS serverless context, and it’s quite similar to something that we designed for a client just a few months ago.

In their case they had a legacy database, I think it was a MongoDB Atlas. I know that sounds hardly legacy probably compared to some things that you might be working with.

But they had a legacy API in front of that, and they wanted to move to a more traditional serverless architecture, so think API Gateway, DynamoDB, Lambda.

And the way we built this, you’ll notice there’s two versions of this API here.

On top is the new version. It’s all those services I just mentioned.

Your data flows through Lambda to DynamoDB, but it doesn’t stop there. It goes through DynamoDB Streams, it goes through to Kinesis.

And once it gets through that streaming section, now you’ve created an event-sourced system like what I described on the previous slide.

And, you can do all sorts of things with those events. You can go through Kinesis Firehose to S3 for some kind of reporting application; you could put something like Athena in front of that.

You can have a Lambda consumer that goes to Elasticsearch if you want to get full-text search behind this.

But most importantly, you can take a Lambda function, and you can stream it out to your old system, your existing database.

So, they’re writing their data back, in this case, to MongoDB. And what that gives them is the ability for their V1 customers to have an updated view of what’s going on in the world, while their V2 customers, which they forked over and said, “Hey, you know, new people who are using this system, you need to start using the V2 API,” are getting all the latest and greatest features.
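The write-back step could look roughly like the sketch below. It assumes the Lambda receives records in the DynamoDB Streams shape (`eventName`, `Keys`, `NewImage`) with new images enabled on the stream; if the function reads from Kinesis instead, the records would need decoding first. The `legacy_store` object and its `upsert`/`delete` methods are a hypothetical wrapper around the old database, not a real driver API.

```python
from decimal import Decimal


def plain_value(attr):
    """Convert one DynamoDB-typed attribute (a subset of types) to Python."""
    type_tag, value = next(iter(attr.items()))
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: plain_value(v) for k, v in value.items()}
    if type_tag == "L":
        return [plain_value(v) for v in value]
    raise ValueError(f"unsupported attribute type: {type_tag}")


def make_handler(legacy_store):
    """Build a Lambda handler that replays V2 writes into the legacy store."""

    def handler(event, context):
        applied = 0
        for record in event["Records"]:
            change = record["dynamodb"]
            key = {k: plain_value(v) for k, v in change["Keys"].items()}
            if record["eventName"] == "REMOVE":
                legacy_store.delete(key)
            else:
                # INSERT and MODIFY both carry the full new item.
                doc = {k: plain_value(v)
                       for k, v in change["NewImage"].items()}
                legacy_store.upsert(key, doc)
            applied += 1
        return applied

    return handler
```

Using an upsert keyed on the item's primary key keeps the replay idempotent, which matters because stream consumers can see the same record more than once on retry.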

The importance of re-architecting and refactoring your systems

Gradually, the goal here is to migrate to the new version. It’s to bring people off of V1 entirely.

And hopefully, if it’s not been clear to you already, where I’m getting with all of this is to say you can’t build a system like this without significant re-architecting and refactoring.

There is no V2 of that serverless system that doesn’t involve you rewriting that API.

That’s not necessarily a bad thing, it’s just the reality of a serverless migration. So, you’ve got to think carefully about where you’re spending your time and what’s providing value for you.

And one of the ways I try to get people over that hump, because that can be a real mental block, right? “How am I going to actually make significant inroads in a reasonable amount of time when that problem looks so overwhelming?”

One of the things I try to get people to do is to detangle the database. If you run a large production database, there is a spectrum of things that are there.

A legacy relational database has things in it that are good, and appropriate, and that you’re proud about like the prod transactions and the backup jobs, and there are things you just don’t talk about.

There is the half-baked graph database implementation, and the clear text passwords, and the Dev data, and all these things, right?

And those aren’t there because our Devs are incompetent, again they’re there because it’s a Legacy system.

You are limited in the options that you have, but you’ve got this database here, your ORM is already set up to talk to it, you’ve got storage already provisioned for it, why wouldn’t you use it?

So, over time, all of these things creep in that probably shouldn’t be there.

And it turns out some of them are relatively easy (I say “relatively” easy) to disentangle.

So I start asking questions like: What access patterns would better fit a cloud-native data store? Do you have a 70-gigabyte table of user access logs that’s growing and growing inside your relational database?

There’s nothing relational about that data, that could easily go somewhere like DynamoDB.

Takes load off your system again, allows you to use more of your scarce resources for things that that database is better suited for, and it’s perhaps not as hard to disentangle as a core part of your application.
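As an illustration of why access logs fit a key-value store, here is one possible item shape: partition by user, sort by time, and let a TTL attribute expire old rows. The attribute names and the 90-day retention window are hypothetical choices for this sketch, and `at` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # hypothetical retention window


def access_log_item(user_id, path, status, at):
    """Shape one access-log row as a DynamoDB item keyed by user and time."""
    at = at.astimezone(timezone.utc)
    return {
        "user_id": user_id,  # partition key
        # Sort key: UTC ISO timestamps sort chronologically as strings.
        "accessed_at": at.isoformat(timespec="milliseconds"),
        "path": path,
        "status": status,
        # DynamoDB TTL expects an epoch-seconds number.
        "expires_at": int((at + RETENTION).timestamp()),
    }
```

The only access pattern this supports is "logs for a user over a time range", which is usually all a table like that needs, and none of it requires joins.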

Can you move any business logic from sprocs (stored procedures) to something like Lambda functions?

There’s a lot of businesses that have a couple of stored procedures that are secretly running most of their business logic, and, of course, those tend to be very tightly coupled to the database platform and the database version.

They are not very portable.

Moving to some sort of a well-supported third-party system like Lambda is usually better in those cases.

And not to mention, it’s, in a lot of cases, easier because that’s a fairly encapsulated piece of logic already. It’s a single task to pull it out.

Interestingly, by the way, a lot of these stored procedures are not actually doing things that need to be done anymore.

I was working with a client a few months ago that had stored procedures that were doing OCR on insurance forms. You know, that needed to be done at that time, but there’s managed services now that do OCR.

It’s a lot easier to bring that out and have a managed service do it. It cuts down your code significantly.

Can you set up lifecycle rules to reduce costs?

Hopefully, this is obvious, but a lot of folks are serializing objects, placing them as raw binary data in their databases.

It’s much better to put that someplace like S3.

And, of course, you know, you can step down the storage classes over time to save money there, rather than putting that in your main database.
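A lifecycle rule of that kind might be expressed as the configuration below. The prefix, rule ID, and day thresholds are illustrative assumptions; the resulting dict is in the shape S3 accepts as a bucket lifecycle configuration.

```python
def lifecycle_rules(prefix, to_infrequent_days=30, to_archive_days=90,
                    expire_days=None):
    """Build an S3 lifecycle configuration that tiers objects down over time."""
    rule = {
        "ID": "tier-down-serialized-objects",  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [
            {"Days": to_infrequent_days, "StorageClass": "STANDARD_IA"},
            {"Days": to_archive_days, "StorageClass": "GLACIER"},
        ],
    }
    if expire_days is not None:
        # Optionally delete objects outright after a retention period.
        rule["Expiration"] = {"Days": expire_days}
    return {"Rules": [rule]}
```

With boto3, the result could be applied via `s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=lifecycle_rules("documents/"))`, where the bucket name and prefix are yours to fill in.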

But overall, what are the low risk and high reward targets here?

Way back at the beginning when we were talking about the Serverless Migration Translator, I was making the point that people say they want to focus on quick wins.

And what they really mean by that is they want to take a trivial piece of the application to refactor and just ignore the big piece because it seems too overwhelming.

That’s not what I’m advocating here.

Instead, what I’m saying is, find a piece that does provide meaningful value, but that you can break off without having to completely undo everything that’s been created for the entire lifecycle of the application up to this point.

And some of these types of data that I’ve identified really do fit that pattern.

The human factor—and its outsize role in the success or failure of legacy migrations

So back then to the law of serverless migrations: organizations tend to oversimplify what they should change.

They tend to think they need to do less re-architecting and refactoring when in reality they need to do more.

And most of these technical things we’ve talked about fit that bill.

The over-complication comes in when they start worrying about control, worrying about what they should build versus buy, and I really feel like that’s more of a human challenge than a technical challenge.

This is where you have to start convincing people that the direction you’re going makes sense.

We’re here at DASH, which tells me something about most of us in this room. We’re fairly forward-thinking people.

I imagine you’re here because you want to learn about the next generation of operating your infrastructure and perhaps looking to move away from some Legacy things. And I would also bet that you work near someone who does not share that commitment to forward-thinking.

Is that fair to say?

You don’t have to out yourself or the person sitting next to you.

They might be right here in this room.

That’s okay. We’ve all been there.

Just like with the serverless migrations, I’ve worked with enough of those people that a lot of them start to, not blend together exactly, but I start to identify some personas.

So, I want to share a few personas with you. These are people that I’ve seen have significant challenges getting on board with serverless.

The server-hugging sysadmin is one of these people. This is someone who, for 20 years or more, has been building Linux boxes.

They’re very, very happy with their Ansible playbooks or their, you know, infrastructure that they’ve been automating, and they simply don’t feel good about relinquishing that control and giving it back to a cloud provider.

The disrupted developer.

A lot of empathy for this person.

Someone who maybe doesn’t care at all about what infrastructure their code is running on, they just want to be able to write code quickly and productively. They want their software development lifecycle to be put together correctly, they want to be able to test where they mean to test.

They don’t want to be locked out when they have to fly somewhere, you know, they want to be able to develop offline, and they’re concerned that serverless will take some of that productivity away from them.

The legacy leader, someone who’s pointing backward when the way is forward.

Those of us who are more, I would say, individual contributors, sometimes we have a lot of trouble understanding why someone would be standing in the way of progress that seems so clear to us from a technical perspective.

But a lot of leadership often has constraints around purchasing choices that they’ve made, political choices that they made where they’ve stuck their neck out for a particular type of technology, and there are real consequences to going back on that now.

They may be completely hamstrung by some other division in the organization, so a lot of times they’re not able to move as quickly as we would want, or maybe even as they would want.

The antagonistic architect, finally.

A lot of us may fall into this category, and it’s not necessarily a bad thing. It’s someone who is tasked with the health of an application, tasked with the performance, tasked with the scale, with making sure that you have the flexibility that’s needed to make changes.

This is someone who thinks a lot about what kind of technical choices to make, and so they think a lot about serverless and they might not think much of it.

All of these people share something in common, which is, they have an element of control over their application, and, through that, over the technical choices that the organization makes.

And because of that, they tend to have predictable struggles getting on board with serverless.

And I think a lot of that stems from the concept of DevOps.

We’ve been doing DevOps now, broadly speaking, for over a decade, right?

So we’ve spent a lot of time learning about containers, and autoscaling, and load balancing, and all these best practices that are very much focused on getting good at doing things with servers.

Sure, there’s abstraction layers on top of that, but ultimately, we want to own infrastructure and get really good at operating it at scale.

It’s not that those best practices have gone away, but what’s changed, the sea change that we’re seeing over the last five years or so is that those best practices increasingly are baked into the services that we build on.

I’m not saying they’re going away, I’m just saying they’re abstracted a little bit.

So think about the traditional DevOps responsibilities, which are things I have in bold here on the left side, and you’ll notice that there is, of course, the cloud provider.

Think about this in a traditional infrastructure-as-a-service context.

The cloud provider’s handling the hardware; everything above that, you’ve got to take care of yourself: the autoscaling, the load balancing, the high availability.

They’ll give you services, but you’ve got to tie them together and make sure they work for you according to your requirements.

In the serverless world, a lot of that, it’s all still there, it’s just happening under the hood.

Think about a service like Lambda.

Yes, you’re bringing the code, but that execution environment is managed for you, the auto-scaling, the load balancing, the high availability. You even get an API out of the box with Lambda, by the way.

If you’re controlling the client, you can hit Lambda directly via the SDK.

You don’t need something like API Gateway in the middle.
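For a sense of what hitting Lambda directly looks like, here is a minimal sketch using the SDK's `invoke` call. It assumes `lambda_client` is something like `boto3.client("lambda")` and that the function returns JSON; the function name in the usage note is hypothetical.

```python
import json


def invoke_directly(lambda_client, function_name, payload):
    """Call a Lambda function synchronously and return its decoded result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = response["Payload"].read()
    # A handled invocation can still report an error raised by the function.
    if "FunctionError" in response:
        raise RuntimeError(f"{function_name} failed: {body.decode('utf-8')}")
    return json.loads(body)
```

A call such as `invoke_directly(client, "orders-read-api", {"order_id": "o-1"})` needs IAM permission to invoke the function, which is why this route fits clients you control rather than public traffic.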

There’s some fantastically powerful services there.

Some possible concerns about serverless

And I think that is frankly threatening to some people, and it’s why you see these tribes forming around serverless, around containers.

This is not a serverless versus containers discussion, not getting in the middle of that one, but I think it’s not a technical conversation so much as it’s a very human one.

We are social creatures. We tend to develop into tribes.

And inside those tribes, we look for ways to, unfortunately, feel superior to people that are in a different tribe.

So what that means is if I feel really comfortable with containers, if I feel really comfortable with container orchestration, with Kubernetes and I want to look for justifications for my choices, I’ll try to find ways to feel superior to another group, and I’ll try to tell stories that justify the tech choices that I’m making.

One of the stories that you’ll hear a lot is the story of lock-in.

Who has heard this story (or told this story)?

Yeah, it’s a concern that a lot of people have.

The traditional concerns you hear are: “If I go to a more serverless model where I’m consuming more managed services, is the cloud provider that I’m building on going to go out of business?

“Are they going to deprecate a service that I rely on?

“Are they going to get me into their walled garden and then jack up the prices?”

These are all things that have happened at some point or another in the history of technology. So it’s not that they’re not concerns, it’s that you have to effectively think about your risk.

The fourth concern, by the way, which I’m starting to hear more and more, is, “What if this cloud provider that I’ve adopted is nefarious to some of my customers and they would prefer that I use a different cloud platform if I’m going to close deals with them?”

I’m not going to meddle in that one either. That’s something you’ll have to figure out with your clients.

But I think a lot of the technical arguments that can be made for lock-in are things that can be very, not dismissed, they can be…you’ve just got to think about your risk, okay?

If I’m LegacyCo, which we’ve been talking about this whole half hour, if I’m LegacyCo, and I have the opportunity to move to a third-party platform that’s well supported, that may look a lot better for me than being stuck like this guy, buried under mountains of Legacy hardware and technical debt.

The cloud lock-in becomes less of a concern at that point.

Island-hopping: a strategy to build buy-in

So you can have these conversations with the antagonistic architect, and the server-hugging sysadmin, and all that, but you’re not going to convince everyone.

You’re not going to convince everyone that serverless is the path forward. There’s going to be folks who aren’t so easily convinced. And what I recommend doing there is something that I call island hopping.

The gentlemen you’re seeing on the screen now are the farthest thing from Legacy leaders. This is Douglas MacArthur and Chester Nimitz who were the leaders of the Pacific theater on the American side during World War II, and they had a problem.

Their problem was, they were stuck out in the Central Pacific or down in Australia.

They needed to work up this archipelago with heavily-defended islands to the home islands of Japan. That’s how they were going to win the war.

And what they decided to do was not to attack every heavily-fortified island in order, which would seem to make sense.

Instead, they hopped around some of those heavily-defended islands, which you’ll see denoted here by the beautiful cannons that I have drawn on that one island.

They hopped around that island, and they planted their flags on islands that may have had some defenses there, but they certainly weren’t as fortified. And, of course, once they’re there, they can now establish their airfields and then they can hop to the next less heavily-defended islands.

And you’ll find that in an organization like LegacyCo, if you’re working in any kind of an organization with multiple teams, there will be people that are natural allies for you as you’re looking to move ahead, you’re looking to move to that next generation of infrastructure and operational engineering.

You’ll find that there are people who are really gung ho about putting together a POC and about using something like serverless to solve a problem, a challenge that they see.

Seek those people out and work with them. That’s the less defended island there.

But that’s only half the story, right?

If you’re MacArthur and Nimitz, once you’re on this less heavily-defended island, you can’t forget about the fortification behind you.

You’ve got to neutralize it. You’ve got to blockade them. You’ve got to cut off their supply lines until eventually, they surrender.

That is a fantastically aggressive metaphor for anything you should be doing in your organization. I’m not advocating that you blockade any cubicles or anything like that.

What I am saying is you can’t forget about the people that have trouble getting on board.

You’ve got to find ways to incentivize them to come along with you if this is really the direction that the org should be going.

And one of the ways you can do that, of course, is just to demonstrate value.

We’ll talk more about that in just a second.

How to apply the island-hopping analogy to LegacyCo

So if you’re LegacyCo, and you’re at the beginning of that chain of islands, and you want to work your way up to the goal, which is serverless adoption, you know, getting off of your legacy infrastructure, how do you even accomplish that?

Are you going to turn the page and be all serverless overnight?

Of course, you’re not.

That’s simply not going to happen. Realistically, it’s not going to happen.

What can happen is use of what my friend Ben Kehoe calls the serverless ladder.

Ben works for iRobot, the makers of the Roomba. They’re a big serverless shop.

He talks about the serverless ladder in a way that I like, and so I’ve visualized it like this.

The idea here is you’ve got, of course, the data center at one end where you’re owning everything, you’re owning all the infrastructure, things that maybe aren’t that core to your business.

And it’s going to take you a series of steps to get out of there.

You may recognize some of these as the Rs of migration if you’ve ever seen that.

Rehosting is one of them, lifting and shifting, where you’re just moving to the cloud but you’re not changing much about your architecture.

That’s pretty low on the serverless ladder, but it may be a necessary step.

You may need to do that before you can get to re-platforming where you’re moving to containers perhaps, you’re moving to more services like maybe RDS instead of running your own SQL server, and that’s a valid step as well.

I’m not denigrating folks who think that’s the way to go because it may be right for your organization, but I’m trying to say, “Don’t think that’s the top of the ladder.”

There are more rungs to climb if you’re going to abstract away more things that aren’t as high value for you until you get to that real re-architecting and refactoring place, which is harder, but it’s necessary to get to more of a cloud-native, serverless mindset.
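As a rough illustration only, the ladder described above can be written as an ordered list of rungs. The rung names come from the talk; the numeric values, the `Rung` type, and the helper functions are assumptions made up for this sketch, not part of Ben Kehoe's description.

```python
from enum import IntEnum
from typing import List, Optional


class Rung(IntEnum):
    """Rungs named in the talk, lowest first. The numbers are illustrative."""
    DATA_CENTER = 0   # you own all the infrastructure
    REHOST = 1        # lift and shift, architecture mostly unchanged
    REPLATFORM = 2    # containers, managed services such as RDS
    REARCHITECT = 3   # re-architecting and refactoring


def next_rung(current: Rung) -> Optional[Rung]:
    """Return the rung directly above `current`, or None at the top."""
    if current is Rung.REARCHITECT:
        return None
    return Rung(current + 1)


def path_to_top(current: Rung) -> List[Rung]:
    """List the rungs still left to climb, in order."""
    steps = []
    step = next_rung(current)
    while step is not None:
        steps.append(step)
        step = next_rung(step)
    return steps
```

The point of modeling it this way is that re-platforming is a valid rung, but `next_rung(Rung.REPLATFORM)` still returns something: it is not the top of the ladder.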

The case for serverless

So how do you form that new tribe?

How do you build this tribe over time?

Just to recap some of the things we’ve said.

The long arc here is toward value, toward higher-order systems.

If you’re not doing it, your competitors certainly are.

So you want to try to take your engineers and move them more toward being able to focus on things that provide value for you.

You want to work with allies to advance the organization wherever you can to make that happen.

You know instinctively where those people are.

There’s things you can do today, whether you’re a manager, whether you’re an individual contributor to start making that happen.

And then creating incentives to enable stragglers.

One thing that I’ve seen be effective here, by the way, if you are a manager of a team that is having some trouble: see if you can create a culture of friendly competition around things like cloud certifications.

You’d be surprised how often that works.

Make a big deal about people who are moving forward, people who are leveling up their skills, people who have built something that you want to encourage, and then you’ll find others will fill in over time.

But ultimately, as I said, the demonstrated value there is going to be your best means of persuasion.

There’s an organization we’ve been working with at Trek10 for a number of years now.

They’re a large multinational corporation, and we came in there a few years ago to build what I’m fairly sure was the first serverless project at that org.

Serverless was very new at that time, and it was successful.

I think it was good for the careers of some people inside the company and so, over time, we’ve done a lot more projects with them.

But what I notice there now is very interesting to me.

I can be on the phone with someone, I don’t know, it could be a team overseas that I’ve never met before.

I don’t have any direct influence over there, but we’re talking through an architecture, and they’ll say something like, “Oh, yeah, we were just going to use Azure Functions for that.”

Or, “We felt DynamoDB was the place we needed to put that data.”

That’s, again, not coming from me, that’s osmosis seeping through the organization over a period, not of months, but of years to get to that point.

So I’m not saying it happens overnight, but it can happen.

I’ve seen it happen, and it does start with you finding what’s important to you and looking for ways to advance that wherever you can.

The Builder’s Creed: a shorthand for migrations

I want to leave you today with the Builder’s Creed, which is what I call something that my colleague, Jared Short, is known for saying, and it goes like this:

If the platform has it, if you’re building on something like AWS and they offer a service that does what you need to do, don’t reinvent that wheel.

If the market has it, if you have the choice between building your own logging and metrics engine or using something like Datadog, you should definitely buy that.

If you can reconsider your requirements to where you don’t need to build that janky solution, then, well, you should do that.

But, if you have to build it, if what you are building is central to your business and it’s the thing that differentiates you, you better own that.

You better make sure that your engineers are freed up to work on that.

You better make sure that’s your focus, that you understand how it works inside and out, that you’re monitoring it appropriately, that it’s architected appropriately, and that’s where you’ll see the value come from a serverless migration.

And hopefully, that will translate to success.
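The Builder’s Creed reads naturally as an ordered set of checks, so here is a minimal sketch of it as a decision function. The `Capability` class, its field names, and the returned strings are hypothetical names invented for this illustration; only the order of the four questions comes from the talk.

```python
from dataclasses import dataclass


@dataclass
class Capability:
    """Something the team needs. All field names are illustrative."""
    name: str
    platform_offers_it: bool = False      # e.g. a managed service on your cloud
    market_offers_it: bool = False        # e.g. a vendor product
    requirement_can_change: bool = False  # could you avoid needing it at all?


def builders_creed(cap: Capability) -> str:
    """Apply the four checks in the order the creed states them."""
    if cap.platform_offers_it:
        return "use the platform service"
    if cap.market_offers_it:
        return "buy it"
    if cap.requirement_can_change:
        return "reconsider the requirement"
    # Nothing else fits: this is core to the business, so build and own it.
    return "build it and own it"


print(builders_creed(Capability("logging and metrics", market_offers_it=True)))
```

The order matters: a capability that the platform already offers never reaches the "buy" or "build" branches, which is the whole idea of not reinventing the wheel.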

Thank you very much. I do appreciate your time today.

Really quickly, I do have DMs open on Twitter.

I’m going to put some resources out directly following this that are going to support some of the things we’ve been talking about, so, hopefully, that will help you.

It’ll be more information about some of those architecture patterns.

Please follow me if you want to get that.

You can also email me any time.

Thank you very much, I appreciate your time. Come talk to me, get donuts afterwards.