Infrastructure is Architecture (with CDK)

Payam Moghaddam
Published in Build Diligent
10 min read · Sep 1, 2022

Do these statements sound familiar to you?

“I want to write application code, not infrastructure code”

“I’m not an infrastructure developer, I’m more of an architect”

You’ve probably heard a variant of them. You may have even said one yourself! The problem is that these statements increasingly don’t make sense. When you are building Cloud-native applications, is your architecture truly distinct from your infrastructure? Is SQS considered architecture or infrastructure? Is configuring your API Gateway architecture or infrastructure?

Many people tend to rely on this coarse distinction:

“Whatever is Terraform or CloudFormation is infrastructure, while our (run-time) code in TypeScript/Go/Ruby/etc. tends to define our architecture”

So what happens when AWS Cloud Development Kit (CDK) lets you define infrastructure in the same language as your run-time code (e.g. Go, TypeScript, Python, C#)? When you can define both your SQS queue and the logic that processes events from that queue, where is the separation of infrastructure and architecture?

In the example above, where does the infrastructure start vs. the architecture? Such questions are difficult to answer because increasingly your infrastructure is your architecture. Especially when you take Serverless building blocks and define your application using them, your infrastructure defines your architecture.

Not convinced? Well, this post will make the case for how your infrastructure is increasingly your architecture. And when you embrace this change, it’ll let you build scalable, secure, and cost-effective solutions, fast 🚀.

Infrastructure Code is Prevalent

Below are two code snippets, one from Express.js, and one from CDK (same example as above). Which one of these is infrastructure code vs. application code?

Side-by-side Express.js vs. CDK code comparison

I’d argue they are both primarily infrastructure code. They both are simply doing the required “infrastructure” work of routing requests to your actual business logic (i.e. console.log). The difference is that Express.js’s infrastructure runs in the same process as your business logic, while AWS’s infrastructure runs separately from your business logic. However, this AWS CDK implementation already has tracing, metrics, centralized logging, auto-scaling, zero run-time packages to patch, and the entire infrastructure definition to deploy it (i.e. no Dockerfile, Helm, etc. required)!

This is a natural consequence of using AWS as your Framework. The more you leverage your Cloud provider, the less you need the capabilities of your run-time frameworks!

You can replace Express.js with e.g. Rails, and it will not be a significantly different comparison.

The challenge in seeing infrastructure as architecture tends to be mindset. Infrastructure classically has been distinct from application runtime, often requiring separate tooling and processes (e.g. Terraform, Ansible, Kubernetes, HAProxy, etc.). All of that was required to run our applications. Furthermore, our education systems emphasize how to structure our runtime and completely dismiss infrastructure. So naturally, we got used to seeing our architecture modelled in runtime processes and stopped questioning why we need these capabilities to be in our runtime rather than offered as a service or even modelled at build time. So let’s ask those questions now:

  1. Why does routing need to be part of your run-time process?
  2. Why does authorization need to be part of your run-time process?
  3. Why do module loaders need to be part of your run-time process?
  4. Why does logging configuration need to be part of your run-time process?
  5. Why does configuration loading need to be part of your run-time process?
  6. Why does input validation have to be part of your run-time process?

You get the point.

Still, let’s ask a more fundamental question:

  • What needs to be build-time vs. run-time?

This question is a bit more perplexing, so let me elaborate. As you start developing in a Serverless manner, your architecture can be increasingly “modelled” at build-time. For example, rather than managing your routing logic in Express.js routes, you can define it at build-time and configure API Gateway to do the routing for you.
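To make the build-time idea concrete, here is a minimal, dependency-free sketch. It is not the CDK or API Gateway API: `RouteDefinition`, `synthesizeRoutes`, and the handler names are hypothetical, and the "routing table" simply stands in for whatever declarative configuration a managed gateway would consume.

```typescript
// Hypothetical types for illustration; not the CDK or API Gateway API.
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

interface RouteDefinition {
  method: HttpMethod;
  path: string;
  handler: string; // name of the function holding the business logic
}

// Build-time step: validate the routes and emit a declarative routing
// table that a managed gateway could enforce, so no router needs to run
// inside your own process.
function synthesizeRoutes(routes: RouteDefinition[]): Record<string, string> {
  const table: Record<string, string> = {};
  for (const { method, path, handler } of routes) {
    if (!path.startsWith("/")) {
      throw new Error(`Route path must start with "/": ${path}`);
    }
    const key = `${method} ${path}`;
    if (Object.prototype.hasOwnProperty.call(table, key)) {
      // A conflict fails the build instead of surfacing as a run-time bug.
      throw new Error(`Duplicate route: ${key}`);
    }
    table[key] = handler;
  }
  return table;
}

const routingTable = synthesizeRoutes([
  { method: "GET", path: "/orders", handler: "listOrders" },
  { method: "POST", path: "/orders", handler: "createOrder" },
]);

console.log(JSON.stringify(routingTable, null, 2));
```

The point of the sketch is where the work happens: the route definitions are checked and turned into plain data before anything is deployed, and the only code left for run-time is the handlers themselves.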

Why manage these concerns in your run-time? Let your Cloud provider take care of it. They already do it for thousands of other customers, so what makes you think you’d do it better? Instead, use your energy to focus on your core business. Delegate as much as possible to your Cloud! Use AWS as your framework!

AWS has been pushing this idea for quite some time, but it has been difficult to adopt due to the distinct tooling difference for build-time configuration vs. run-time code. Thus it has been natural to continue the classical mindset that our architecture is defined by our run-time rather than seeing the opportunity that we can model a significant portion of our architecture via infrastructure instead.

Fortunately, as Infrastructure-as-Code (IaC) has progressed, the latest generation of IaC tooling uses standard run-time programming languages (e.g. TypeScript, Go, C#, Python) to model your infrastructure, which removes this artificial language barrier between infrastructure and architecture. And you’d be surprised how much of your architecture simply comes down to how you set up your infrastructure! The best example of this can be seen with AWS Cloud Development Kit (CDK).

CDK — Removing the Artificial Barrier

If you look closely at my points above, there is nothing new in them. These ideas have existed for quite some time and are being leveraged by people who tightly integrate infrastructure modelling (i.e. CloudFormation/Terraform) and run-time execution (i.e. TypeScript/Go/Python/etc.). The problem has been the inconvenience (and inaccessibility) of mastering both. You’d typically hit two obstacles:

  • Application Developers: “Ew, why am I coding in YAML? This is so much easier in TypeScript”
  • Infrastructure Developers: “TypeScript? Why am I learning this? I need to learn the intricacies of CloudFormation’s YAML”
      • Variant: “I come from a networking background. I’m not a developer. TypeScript seems excessive to me”

Thus, everyone stayed in their own lane and created different solutions for the same problem. Fortunately, CDK changes that.

AWS Cloud Development Kit

CDK unifies the way we code our run-time logic and the way we configure our run-time environment at build-time. It allows us to use a single language to do true full stack development! In many ways, CDK is how IaC should have always been; it just took the infrastructure community a while to build the appropriate abstractions to get there.

Cloud Abstraction Layers

In the past decade, Cloud providers have crossed key milestones:

  1. They enabled access to utility-based infrastructure (e.g. AWS, Azure)
  2. They enabled programmatic access to this infrastructure (e.g. CLI, SDK)
  3. They enabled resource management of their infrastructure (e.g. CloudFormation, Terraform)
  4. The community explored simpler Domain-Specific Languages (DSLs) for this management (e.g. Serverless Framework, SAM)
  5. Now, Cloud providers (particularly AWS) are providing a modern programming language interface for both resource management and “architecture management” of their infrastructure (e.g. CDK and its associated Constructs)

This latest iteration builds on all former abstraction layers. Each layer of abstraction has been necessary for us to get here. Now we are finally making it easy to model your architecture via your infrastructure rather than via run-time logic.

And this has a massive productivity upside for us.

CDK’s Value Proposition

The greatest misunderstanding of CDK is that its greatest value proposition is the use of modern imperative programming languages to define your infrastructure. Yes, this is valuable, and as stated above, it helps people’s mindset get past the artificial barrier that a language can impose. However, its greatest value proposition is in providing “architecture” building blocks built out of underlying infrastructure components (i.e. L2 and L3 constructs), without compromising on the modern development practices that allow us to develop our architectures cleanly (e.g. types, Object-Oriented programming, Aspect-oriented programming, testing). And to boot, it provides a CLI that makes it easy to both develop and deploy.
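The idea of building blocks composed from lower-level components can be modelled in a few lines of plain TypeScript. This is a toy model only: `Construct`, `Primitive`, and `QueueWorker` are hypothetical names chosen for illustration and do not come from `aws-cdk-lib`.

```typescript
// Toy model of construct composition. Every name here is hypothetical;
// this is not the aws-cdk-lib API.
interface Resource {
  type: string;
  name: string;
}

interface Construct {
  // Flatten this construct into the low-level resources it stands for.
  synth(): Resource[];
}

// Low-level building block: maps one-to-one to a single resource.
class Primitive implements Construct {
  private readonly type: string;
  private readonly name: string;

  constructor(type: string, name: string) {
    this.type = type;
    this.name = name;
  }

  synth(): Resource[] {
    return [{ type: this.type, name: this.name }];
  }
}

// Higher-level building block: one architectural idea ("a worker fed by
// a queue") composed from several lower-level constructs.
class QueueWorker implements Construct {
  private readonly parts: Construct[];

  constructor(name: string) {
    this.parts = [
      new Primitive("Queue", `${name}-queue`),
      new Primitive("Queue", `${name}-dead-letters`),
      new Primitive("Function", `${name}-handler`),
    ];
  }

  synth(): Resource[] {
    const all: Resource[] = [];
    for (const part of this.parts) {
      all.push(...part.synth());
    }
    return all;
  }
}

const worker = new QueueWorker("orders");
const resources = worker.synth().map((r) => `${r.type}:${r.name}`);
console.log(resources);
```

The caller only writes `new QueueWorker("orders")`, an architectural statement, while the infrastructure falls out of it as a by-product. Because the building block is an ordinary class, it can be typed, tested, and extended like any other code.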

CDK deserves its own dedicated post, so I will only provide a simple example to demonstrate how CDK blends architecture and infrastructure together. We’ll start simple by building a static website. For a static website, you’d typically want a CDN, a static storage solution, and a deployment mechanism.

Done. That’s the entire code required for your static website. Can you differentiate your architecture from infrastructure?

Ok, you may say that’s too simple, so let’s build a 3-tier application serving APIs. Here is the code for it:

Done. Again, this is the entire code required!

There is no Kubernetes you need to deploy to, or ELK for observability, or HashiCorp Vault for secret management. The above code that defines your architecture includes the required infrastructure. You can start simple and evolve your architecture with your product.

There is a ton to CDK but I hope you are starting to see that the artificial barriers we had before are removed. It lets you seamlessly use your Cloud as much as possible, and integrates run-time logic as necessary. This lets us get the most out of our Cloud provider, and lets us focus more on our business.

If you think about it, in a way, we are essentially starting to “compile” for the Cloud.

“Compiling” for the Cloud

As an analogy, I want you to go back in time to when Require.js was used to facilitate loading JS files in your browser. SPA development was in its infancy and thus, naturally, the run-time solution of Require.js emerged to load and manage your JS dependencies in the browser. Unfortunately, since it was a run-time solution, what you got were:

  1. A ton of requests to load JS files
  2. Unnecessary bloat since certain JS files would load even if not required
  3. Run-time processing for both loading Require.js and executing its logic

All of that just to deliver your value proposition: your SPA.

Webpack changed this though. Webpack essentially asked:

“Why are we doing this at run-time? Why don’t we just do this at build-time?”

“What if we compiled only what you needed, so the browser executed only what the SPA needed to execute?”

To no one’s surprise, Webpack won and Require.js lost. It won because it was simpler. Why have all this run-time infrastructure just to facilitate executing your value proposition? Avoid the bloat and just give the browser what it needs: your business logic!
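The build-time approach can be sketched in a few lines. This is a simplified illustration of resolving a dependency graph ahead of time, not how Webpack is actually implemented; `ModuleGraph`, `resolveBundle`, and the module names are made up for the example.

```typescript
// Hypothetical module graph: module name -> modules it imports.
type ModuleGraph = Record<string, string[]>;

// Build-time step: walk the graph from the entry point and keep only
// the modules that are actually reachable from it.
function resolveBundle(graph: ModuleGraph, entry: string): string[] {
  const included = new Set<string>();
  const pending: string[] = [entry];
  while (pending.length > 0) {
    const name = pending.pop() as string;
    if (included.has(name)) continue;
    const deps = Object.prototype.hasOwnProperty.call(graph, name)
      ? graph[name]
      : undefined;
    if (deps === undefined) {
      // A missing module fails the build rather than a page load.
      throw new Error(`Unknown module: ${name}`);
    }
    included.add(name);
    pending.push(...deps);
  }
  return Array.from(included).sort();
}

const graph: ModuleGraph = {
  app: ["router", "views"],
  router: ["utils"],
  views: ["utils"],
  utils: [],
  legacyCharts: ["utils"], // defined, but never imported by the app
};

const bundle = resolveBundle(graph, "app");
console.log(bundle);
```

Everything reachable from `app` ends up in the bundle and `legacyCharts` is left out, so the browser never has to discover, request, or skip it at run-time. That shift of work from run-time to build-time is the whole analogy.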

I’d venture to say that’s exactly what CDK does for you and it’ll have the same winning outcome. CDK “compiles” your code into the most efficient and secure model to execute on the Cloud. It makes it easy to delegate as much non-business domain logic to your Cloud provider, including scaling, routing, logging, metrics, etc. so you can focus on the logic that only you and your business will write.

For example, want a Slack bot? Sure! Here is the entire code, including infrastructure, for a Slack bot that’ll greet you when you 👋 at it.

The majority of the code is implementation details of `@slack/bolt` 😅

Isn’t this simpler? Isn’t this more in tune with what the business wants from engineering? It is not only simpler, though: this model of development also enables much greater team autonomy. The architecture a team wants can be defined, modelled, and deployed by that team into their own team service accounts! The team can truly manage their architecture and their infrastructure end-to-end, since their infrastructure is now simply a by-product of that architecture.

That’s because your infrastructure is your architecture! 😁

Evolution of Cloud Development

This approach to Cloud-native development is still young and CDK itself has a way to go in its evolution. Even in the examples above, you can spot some clunky syntax or missing constructs. I see these as growing pains, though, and they are generally quite simple to overcome. For example, you can look at the SST framework and its simpler constructs.

CDK’s simplicity and its ability to let us focus on architecture more than ever is why we’ve adopted it at Diligent. We give every team their own AWS accounts, teach CDK, and demonstrate how Cloud is increasingly a framework and how they can extend their laptops with their team accounts. Especially when using the latest `cdk watch` capability, your local changes are watched and hotswapped into AWS so you can see your changes quickly on real AWS. No need to question "Will it run on AWS?" when you are running it on AWS the whole time! This setup allows everyone to regularly engage in architecture development and thus multiplies their business value and career opportunities. It's no surprise that the CDK model has been adopted for Terraform (CDKTF) and Kubernetes (cdk8s) to enable this level of simplicity.

For Diligent, it’s an obvious choice as our HighBond platform has been all-in on AWS for a long time. Others may be more skeptical. They may question the speed of `cdk watch`, or challenge the (still forming) development experience, or struggle to see how they can bring it into an organization that has invested significantly in using Cloud purely as infrastructure. Personally though, I'll always bet on simplicity. Simpler solutions will win out in the long run.

In fact, I just ask myself a simple question: if I started a SaaS company today, what tech stack would I choose? Would I pick one that lets me focus on my core domain and architecture on day one, or would I pick one that requires me to set up a lot of plumbing before I can get started?

The choice is pretty clear.

If the article above resonates with you and you’d like to be in an organization that focuses on business outcomes and strives to be as Serverless as possible, reach out to us at Diligent or reach out to me directly!
