Terraform Smarter, Not Harder: The Power of Modular Infrastructure as Code

Future-proof cloud automation: Build scalable, modular infrastructure with Terraform

Managing cloud infrastructure can quickly become a tangled mess of dependencies, duplicate code, and fragile configurations. What starts as a simple Terraform script often bloats into an unmanageable monolith—difficult to update, even harder to troubleshoot, and nearly impossible to scale.

Why Monolithic Terraform Fails at Scale

A monolithic Terraform setup might work for a small project, but in a real-world cloud environment, it creates scalability and collaboration challenges:

  • Hard to maintain – Every change impacts too many components.

  • Risky deployments – No separation of concerns means one failure can bring down the entire infrastructure.

  • Slow development cycles – Teams can’t work in parallel without stepping on each other’s toes.

The Solution? Modularizing Infrastructure

Instead of managing everything in a single Terraform configuration, modularization breaks infrastructure into small, reusable components that can be independently:

  • Managed

  • Versioned

  • Deployed

This approach allows us to scale efficiently, reduce risk, and improve collaboration—all while keeping infrastructure clean and maintainable.

In This Post…

We’ll explore how to structure a modular Terraform architecture, covering Networking, Compute (EKS), and Storage. We’ll walk through a real-world example, complete with Terraform code and architectural diagrams, so you can start applying these principles today.

📌 Topics

📪 Setting the Stage
🗃️ Organizing Core Components
🛠️ Setting Up the Terraform Code
🔖 Tagging Module
🛜 Network Module
💿 Compute Module
📦 Storage Module
🌲 Environment Specific Configurations
🎬 Root Files
🚀 Wrapping Up


📪 Setting the Stage

Picture this: your DevOps team lead hands you the following architecture diagram and says,

“We need to build a proof of concept (POC) for this infrastructure. If it works, we’ll scale it across the enterprise and make it the standard.”

At a quick glance, it’s tempting to write some Terraform code in a few files and deploy it. And technically, it would work—traffic would flow into the EKS cluster, and the cluster would successfully execute read/write actions against the S3 bucket. Job complete, right?

Not quite.

Now comes the real challenge: scalability.

  • How do you roll this out across the enterprise for multiple teams and environments?

  • How do you handle change management as infrastructure grows and evolves with each team’s unique needs and innovations?

Enter modularization…

🟢 Bite-Size Pieces – Identifying Resource Groupings

When you first look at an architecture diagram, it can feel overwhelming—multiple services, interconnected dependencies, and a mix of networking, compute, and storage layers. But before jumping into writing Terraform code, it’s critical to break down the infrastructure into logical, modular components.

A well-structured architecture can usually be decomposed into three core layers:

  1. Networking (Foundation Layer)

What to look for:

  • VPCs, subnets (public/private), route tables

  • Internet gateways, NAT gateways, VPN, Private Link

Why it should be modular:
Networking acts as the backbone of infrastructure—it defines how resources communicate securely and efficiently. By keeping it separate, multiple environments (dev, staging, prod) can reuse the same foundational networking setup.

  2. Compute (Application Layer)

What to look for:

  • Compute resources like EKS clusters, EC2 instances, or Lambda functions

  • Ingress components (Application Load Balancer, Network Load Balancer)

  • Scaling components like Auto Scaling Groups

  • IAM roles tied to compute resources

Why it should be modular:
The compute layer runs applications and often changes independently of networking or storage. Keeping compute separate allows teams to scale clusters, update workloads, or migrate compute resources without affecting the entire infrastructure.

  3. Storage (Data Layer)

What to look for:

  • S3 buckets, EBS volumes, databases (RDS, DynamoDB)

  • Storage policies (encryption, backups, access control)

Why it should be modular:
Storage needs long-term data persistence and often has strict access policies. Separating storage ensures consistent data management, security enforcement, and independent scaling.

🟢 Applying This to Any Diagram

While Networking, Compute, and Storage are common categories, some architectures may require additional layers, such as:

  • Security Layer – IAM, policies, WAF, firewall rules

  • Observability Layer – Logging, monitoring, alerting (CloudWatch, Prometheus, ELK)

  • Data & Analytics Layer – Data lakes, ETL pipelines, machine learning services

Regardless of the complexity, the key is to group resources based on their function and lifecycle. If a resource:

  • Has a shared lifecycle → It belongs in the same module.

  • Scales or changes independently → It should be a separate module.

  • Has different security/access rules → It may need isolation.

Why This Matters for Modular Terraform

By analyzing an architecture diagram before writing any Terraform, you ensure:

  • A clear separation of concerns (each module handles a specific function).

  • Easier scalability and maintainability (updates don’t break everything).

  • A structured approach to infrastructure-as-code (no tangled dependencies).

In the next section, we’ll apply this modular breakdown to our Terraform implementation—starting with Networking, Compute (EKS), and Storage.


🗃️ Breaking It Down: Organizing Core Components into Modules

Now that we’ve analyzed the architecture diagram, let’s break it down into logical infrastructure components and group them into Terraform modules.

  1. Networking: Establishing the Foundation

At the network layer, we have a single VPC within the us-east-2 region, containing:

  • Internet Gateway – Enables public internet access.

  • Two Subnets – One public, one private.

  • Route Tables – A public and a private route table for directing traffic.

  • NAT Gateway – Allows private subnet resources to access the internet.

Since these resources define how traffic flows across our cloud environment, we’ll group them into a network module. This keeps the networking layer isolated from compute and storage, allowing us to reuse it across multiple environments without modification.

  2. Compute: Managing Application Workloads

With the networking foundation in place, let’s examine the compute layer. The key resources here are:

  • EKS Cluster & Node Group – The core of our containerized workload.

  • Security Groups – Control traffic flow into and out of the cluster.

  • Load Balancers – An NLB (public-facing) and an ALB (internal) to distribute traffic.

At first glance, security groups and load balancers might seem like networking components. However, if we consider their lifecycle, it becomes clear that they are tightly coupled with the EKS cluster.

  • The ALB and NLB exist specifically to route traffic to workloads running on EKS.

  • The security group is scoped to the cluster and nodes, not the entire network.

  • If we were to destroy the compute layer, these resources would serve no purpose and would need to be destroyed as well.

Thus, these components logically belong inside the compute module, rather than network.

  3. Storage: Managing Data Persistence

For now, our storage needs are simple—a single S3 bucket for storing artifacts or logs.

However, at enterprise scale, storage will likely evolve to include:

  • FSx (Managed file systems)

  • RDS (Relational databases)

  • DynamoDB (NoSQL data storage)

By modularizing storage, we ensure that as new storage services are needed, they can be easily integrated into the existing infrastructure without major code rewrites.

🟢 Why This Modular Approach Works

By grouping infrastructure based on function and lifecycle, we:

  • Separate concerns – Networking, compute, and storage evolve independently.

  • Improve scalability – Teams can modify compute or storage without breaking the network.

  • Simplify maintenance – Updating a module won’t disrupt unrelated components.

With this structure in place, we can now dive into the Terraform implementation.


🛠️ Setting Up the Terraform Code – What We’re Building

Now that we've established why modularization matters, let’s look at how to structure our Terraform project to make it scalable, maintainable, and reusable.

For this example, we’ll use a single repository setup, which is ideal for:

  • Keeping things simple while demonstrating modular principles.

  • Faster iteration—all modules and environments live in one place.

  • Smaller teams or proof-of-concept (POC) projects before scaling to multiple repositories.

Note: For enterprise-level implementations, it's best to separate modules and environments into multiple repositories. This ensures better versioning, access control, and change management—a topic we’ll cover in a future blog post.

🟢 Repository Structure

To keep our Terraform clean, reusable, and scalable, we’ll organize it as follows:

/poc
  ├── modules                   # Reusable infrastructure modules
     ├── network               # Networking (VPC, subnets, routes, etc.)
     ├── compute               # EKS cluster, node groups, related resources
     ├── storage               # S3 bucket, future storage needs
     ├── tagging               # Standardized resource tagging
  ├── tfvars                    # Environment-specific configurations
     ├── dev-us-east-2.tfvars  # Dev environment
     ├── prod-us-east-2.tfvars # Prod environment
  ├── locals.tf                 # Logic for grouping variables for module consumption
  ├── main.tf                   # Root file for initializing infrastructure
  ├── outputs.tf                # Infrastructure component outputs
  ├── provider.tf               # Configures provider settings to be used throughout the configuration
  ├── terraform.tf              # Configures terraform behavior, required providers, and backend configurations
  ├── variables.tf              # Global variables shared across environments

🟢 Breaking Down the Repository Structure

modules/ – Reusable Infrastructure Components

This directory contains self-contained modules that manage specific pieces of infrastructure:

  • network/ → VPC, subnets, route tables, and NAT gateway.

  • compute/ → EKS cluster, node groups, and load balancers.

  • storage/ → S3 bucket and access policies.

  • tagging/ → Resource tags

Why it matters:

  • Avoids code duplication and makes scaling easier.

  • Allows us to reuse these modules across different environments.


tfvars/ – Environment-Specific Configurations

The tfvars/ directory contains Terraform variable configurations per environment.

Example: tfvars/dev-us-east-2.tfvars

  • Defines dev-specific variables (e.g., smaller instance sizes, fewer nodes).

  • Uses the same modules as prod, but with different configurations.

Example: tfvars/prod-us-east-2.tfvars

  • Uses larger instances and more scalable infrastructure.

  • References the same modules but with production-grade settings.

Why it matters:

  • Allows us to deploy environments independently without rewriting Terraform code.

  • Supports scaling Terraform deployments efficiently.


Root Files (main.tf, variables.tf, outputs.tf)

These files act as Terraform's entry point and manage how infrastructure is provisioned:

  • main.tf → Calls modules and orchestrates infrastructure.

  • variables.tf → Defines shared input variables across modules.

  • outputs.tf → Exposes key values (e.g., VPC ID, EKS endpoint, S3 bucket name).

Why it matters:

  • Keeps the repository structure clean and maintainable.

  • Makes it easy to reference infrastructure components across Terraform configurations.


🟢 What’s Next?

Now that we’ve established a clear repository structure, it’s time to start implementing Terraform modules—beginning with Tagging.


🔖 Setting Up the Tagging Module

What We’re Building

Tagging is often overlooked in Terraform configurations, but it plays a crucial role in managing cloud infrastructure effectively. Without a standardized tagging strategy, tracking resources, managing costs, and ensuring security compliance becomes increasingly difficult as environments scale. By centralizing tag management, we ensure:

  • Resource Organization – Easier tracking of resources across environments.

  • Cost Allocation – Tags help break down costs by project, team, or application.

  • Compliance & Governance – Enforces tagging policies for security and auditing.

Key Components in This Module

  • Tagging Standards – Define key-value pairs for all infrastructure components.

  • Consistent Enforcement – Ensure every resource includes predefined tags.

  • Flexible Inputs – Allow different environments to override or extend default tags.

🟢 Module Structure

/tagging
  ├── outputs.tf   # Outputs a map of tags
  ├── tags.tf      # Creates a map of key, value pairs
  ├── terraform.tf # Configures terraform behavior and required providers
  ├── variables.tf # Defines the module inputs

🟢 Code Walkthrough

Now, let’s progressively introduce the Terraform code, showing key inputs, logic, and outputs.

Inputs (variables.tf)

This file defines what inputs the module expects:

variable "contact" {
    description = "The resource owner's email address."
    type        = string
}

variable "environment" {
    description = "The environment associated with the resource. Typically Dev, QA, Prod, etc."
    type        = string
}

variable "project" {
    description = "The project associated with the resource."
    type        = string
}

variable "terraform_managed" {
    description = "Is the resource managed by terraform."
    type        = bool
    default     = true
}

variable "source_gitlab_project" {
    description = "The source structure repository hosting the terraform deploying this resource."
    type        = string
}

These variables allow flexibility while ensuring core tagging standards are met. To define proper tagging policies, you will likely need to confer with your organization’s enterprise architecture team.

Tagging Logic (tags.tf)

This file creates a map of tags that can be applied across resources:

locals {
    tags = tomap({
        Contact             = var.contact
        Environment         = var.environment
        Project             = var.project
        TerraformManaged    = var.terraform_managed
        SourceGitLabProject = var.source_gitlab_project
    })   
}

Note: You’ll notice that there is not a Name tag included within this map. The resource names are defined within each module and will be merged with the tag map at that layer along with any module specific custom tags.

Outputs (outputs.tf)

This file outputs a tags map that other modules can reference:

output "tags" {
    description = "A map of tags created by this module."
    value       = local.tags
}

Defining Terraform Behavior (terraform.tf)

This file defines Terraform’s behavior within the module, including the required CLI version and providers:

terraform {
    required_providers {
        aws = {
            source  = "hashicorp/aws"
            version = "~> 5.0"
        }
    }

    required_version = ">= 1.10.5"
}

🟢 How Data Flows Through the Module

  1. Users define environment, project, and contact variables.

  2. The module merges the key-value pairs into a map of tags.

  3. Outputs a standardized tags map.

  4. Other modules reference the tagging module for consistency.
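
As a quick illustration of that last step, here is a minimal sketch of how a root configuration might call the module and consume its output (the full root main.tf appears later in this post; the contact value is a placeholder):

module "tags" {
    source = "./modules/tagging"

    contact               = "owner@example.com" # placeholder value
    environment           = "Dev"
    project               = "dev-lab"
    source_gitlab_project = "POC"
}

# Downstream modules then receive the standardized map via:
# tags = module.tags.tags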

🟢 What’s Next?

Now that we’ve built and structured the tagging module, we’ll:

  • Move on to setting up the network module.

  • Integrate the tagging module into the network module.

  • Add custom tags to the existing map of tags.


🛜 Setting Up the Network Module

What We’re Building

Cloud infrastructure is only as strong as its networking layer. A poorly structured network can lead to security risks, inefficient traffic flow, and connectivity bottlenecks.

The networking module lays the foundation for secure and scalable communication between Compute (EKS), Storage (S3), and other cloud resources. It ensures public-facing services remain accessible while keeping internal resources isolated and protected.

Key Components in This Module

  • VPC – Defines the private network scope and acts as the foundation for cloud networking.

  • Subnets – Divides traffic between public (external access) and private (internal services).

  • Route Tables – Directs traffic flow between subnets and external networks.

  • Internet Gateway (IGW) – Enables public internet access for public subnets.

  • NAT Gateway – Allows private subnets to access the internet securely without exposing resources.

🟢 Module Structure

/network
  ├── data.tf         # Queries current region data
  ├── locals.tf       # Additional logic local to this module
  ├── nat_gateway.tf  # Creates NAT gateway and associated eip
  ├── outputs.tf      # Outputs vpc and subnet ids
  ├── route_tables.tf # Creates and associates route tables
  ├── subnets.tf      # Creates the subnets
  ├── terraform.tf    # Configures terraform behavior and required providers
  ├── variables.tf    # Defines the module inputs
  ├── vpc.tf          # Creates the vpc and internet gateway

🟢 Terraform Code Walkthrough

Now, let’s progressively introduce the Terraform code, showing key inputs, resources, and outputs.

Inputs (variables.tf)

This file defines what inputs the module expects:

variable "vpc_cidr" {
    description = "CIDR block for the VPC"
    type        = string
}

variable "tags" {
    description = "A map of tags passed in from the tagging module. Typing is enforced at the tagging module layer."
    type        = any
}

Inputs (locals.tf)

This file is where additional logic local to this module is defined. In this case, we are pulling the value of the Project tag and saving it as a locals variable for use in naming our resources. This allows us to avoid hard coding the value or querying the tags map multiple times:

locals {
    # The project name used in resource naming
    project_name = var.tags.Project
}

Inputs (data.tf)

This file uses a data block to query the current region, avoiding hard-coding the value or requiring users to define it manually:

data "aws_region" "current" {}

In this module, we define the input data across three distinct files. This provides a separation of duties while allowing us to require minimal user input. With these values, we have everything we need to construct the networking resources.

VPC Resources (vpc.tf)

This file creates the VPC and Internet Gateway resources:

resource "aws_vpc" "vpc" {
    cidr_block = var.vpc_cidr

    instance_tenancy = "default"

    enable_dns_support   = true
    enable_dns_hostnames = true

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}Vpc"
        }
    )
}

resource "aws_internet_gateway" "igw" {
    vpc_id = aws_vpc.vpc.id

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}Igw"
        }
    )
}

The VPC serves as the foundation of the network—every other resource depends on it. The Internet Gateway operates as the front door, so to speak, between our VPC internal traffic and the outside world. When configuring these resources, note how we make use of the Terraform merge() function to add our custom tags to the existing tags map passed in from the tagging module. Additionally, you can see how we make use of the local.project_name variable created above.

Define Subnets (subnets.tf)

This file creates the public and private subnets:

resource "aws_subnet" "public" {
    vpc_id            = aws_vpc.vpc.id
    availability_zone = "${data.aws_region.current.name}a"

    cidr_block = cidrsubnet(aws_vpc.vpc.cidr_block, 2, 0)

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}PublicSubnet"
        }
    )
}

resource "aws_subnet" "private" {
    vpc_id            = aws_vpc.vpc.id
    availability_zone = "${data.aws_region.current.name}a"

    cidr_block = cidrsubnet(aws_vpc.vpc.cidr_block, 2, 1)

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}PrivateSubnet"
        }
    )
}

All of the required inputs for these subnets can be derived from our existing inputs and the VPC we created above. Note the use of the Terraform cidrsubnet() function to calculate each subnet’s cidr range. This allows us to generate this value through automation, reducing the need for manual inputs. You can also see how the availability zone is calculated. This may seem trivial at the POC stage, but when scaling to an enterprise level with multiple regions and availability zones, dynamically populating this value will be much more efficient.
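
To make the subnet math concrete, here is what those expressions evaluate to in terraform console, assuming the dev CIDR used later in this post (10.0.0.0/22):

> cidrsubnet("10.0.0.0/22", 2, 0)
"10.0.0.0/24"
> cidrsubnet("10.0.0.0/22", 2, 1)
"10.0.1.0/24"

And in the us-east-2 region, the availability zone expression resolves to "us-east-2a".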

Route Tables & Associations (route_tables.tf)

This file creates and associates the public and private route tables:

# Public Route Table
resource "aws_route_table" "public" {
    vpc_id = aws_vpc.vpc.id

    route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.igw.id
    }

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}PublicRT"
        }
    )
}

resource "aws_route_table_association" "public" {
    subnet_id      = aws_subnet.public.id
    route_table_id = aws_route_table.public.id
}

# Private Route Table
resource "aws_route_table" "private" {
    vpc_id = aws_vpc.vpc.id

    route {
        cidr_block     = "0.0.0.0/0"
        nat_gateway_id = aws_nat_gateway.nat.id
    }

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}PrivateRT"
        }
    )
}

resource "aws_route_table_association" "private" {
    subnet_id      = aws_subnet.private.id
    route_table_id = aws_route_table.private.id
}

The public and private routes are defined and associated with their respective subnets.

NAT Gateway (nat_gateway.tf)

This file defines the NAT Gateway, which allows resources in the private subnet to reach the internet:

resource "aws_eip" "nat" {
    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}NatEip"
        }
    )
}

resource "aws_nat_gateway" "nat" {
    allocation_id = aws_eip.nat.id
    subnet_id     = aws_subnet.public.id

    tags = merge(
        var.tags,
        {
            Name = "${local.project_name}NatGateway"
        }
    )
}

This resource is needed for our compute resources (e.g., EKS) to access the internet securely.

Outputs (outputs.tf)

Exposes required values for other modules to consume:

output "vpc_id" {
    description = "The id for the vpc."
    value       = aws_vpc.vpc.id
}

output "subnet_id_public" {
    description = "The id for the public subnet."
    value       = aws_subnet.public.id
}

output "subnet_id_private" {
    description = "The id for the private subnet."
    value       = aws_subnet.private.id
}

These outputs allow other modules (e.g., Compute) to reference the networking infrastructure.

Defining Terraform Behavior (terraform.tf)

terraform {
   ...
}

This file defines Terraform’s behavior within the module, including the required CLI version and providers. In more complex architectures, its contents may vary between modules. However, since this is a small POC, its configuration remains identical to the tagging module, so I won’t reproduce the code here.

🟢 How Data Flows Through the Module

The network module processes input variables and produces outputs that other infrastructure components rely on:

  1. User defines VPC CIDR block.

  2. Terraform creates a VPC and derives subnets dynamically using cidrsubnet().

  3. Route tables ensure proper traffic flow between subnets and the internet.

  4. Compute & storage modules consume vpc_id and subnet_ids from outputs.tf.

🟢 What’s Next?

Now that we’ve established our cloud network, it’s time to bring it to life. The Compute module (EKS) will use our private subnets, route tables, and network configurations to deploy a Kubernetes cluster that can securely interact with other cloud services.

In the next section, we’ll:

  • Provision an EKS Cluster with load balancers and autoscaling node groups.

  • Define IAM roles for secure cluster operation.

  • Integrate EKS with the network module outputs.

Stay tuned as we take modular Terraform to the next level with Kubernetes!


💿 Setting Up the Compute Module

What We’re Building

Kubernetes is powerful, but deploying it manually is complex. Managing worker nodes, IAM roles, and secure networking can quickly become overwhelming.

The Compute Module simplifies this by provisioning an EKS cluster, configuring node groups, and managing IAM permissions and load balancers—all in a modular, reusable way.

This ensures our cluster is scalable, secure, and easy to integrate with networking and storage.

The compute module provisions and manages the EKS (Elastic Kubernetes Service) cluster and its supporting resources. This module ensures:

  • Scalability – Allows Kubernetes workloads to scale dynamically.

  • Security – Defines IAM roles, policies, and networking for secure operations.

  • Flexibility – Supports different configurations for development, staging, and production environments.

Key Components in This Module

  • EKS Cluster – Deploys the Kubernetes control plane, which manages cluster operations.

  • Node Groups – Provisions worker nodes to run applications and scale workloads.

  • IAM Roles & Policies – Ensures secure API access and node communication.

  • Load Balancers:

    • NLB (Network Load Balancer) – Operates at Layer 4 (TCP) and routes external traffic to services in the public subnet.

    • ALB (Application Load Balancer) – Operates at Layer 7 (HTTP/S) and manages traffic for applications inside the private subnet.

Note: Kubernetes is an expansive and complex technology. In this post, we’ll focus on a minimal, streamlined configuration to demonstrate core concepts. A deeper dive into advanced Kubernetes configurations in the cloud will be covered in a future blog.

🟢 Module Structure

/compute
  ├── data.tf      # Defines IAM role trust policy documents
  ├── eks.tf       # Creates the EKS cluster and node groups
  ├── iam.tf       # Creates IAM roles and policies for cluster authentication and permissions
  ├── ingress.tf   # Creates cluster load balancers
  ├── outputs.tf   # Outputs eks endpoint and node group role arn
  ├── sg.tf        # Manages the cluster security group
  ├── terraform.tf # Configures terraform behavior and required providers
  ├── variables.tf # Defines the module inputs

🟢 Terraform Code Walkthrough

Now, let’s progressively introduce the Terraform code, showing key inputs, resources, and outputs.

Inputs (variables.tf)

This file defines what the module expects:

### Compute Variables ###
variable "eks_data" {
    description = "An object of EKS cluster and node group configurations."
    type = object({
        cluster_name = string

        node_group = object({
            instance_types = optional(list(string), ["t3.micro"])
            desired_size   = optional(number, 1)
            max_size       = optional(number, 1)
            min_size       = optional(number, 1)
        })
    })
}

### Network Variables ###
variable "vpc_id" {
    description = "The id for the vpc."
    type        = string
}

variable "subnet_id_public" {
    description = "The id for the public subnet."
    type        = string
}

variable "subnet_id_private" {
    description = "The id for the private subnet."
    type        = string
}

### Tags ###
variable "tags" {
    description = "A map of tags passed in from the tagging module. Typing is enforced at the tagging module layer."
    type        = any
}

For the compute module inputs, we use a complex variable type for the eks_data variable. EKS configurations can be intricate, with numerous attributes to manage. Instead of handling each attribute as a separate variable, we simplify the variable file by grouping them together, making it more maintainable. This approach will become clearer later in the blog when we define the module calls in the root-level files. Additionally, networking variables and tags are automatically passed through the module call, which will also be explained further in the root-level configuration.

Create the EKS Cluster and Node Group (eks.tf)

This file defines the EKS cluster and its node group:

resource "aws_eks_cluster" "eks" {
    name     = var.eks_data.cluster_name
    role_arn = aws_iam_role.eks_cluster_role.arn

    vpc_config {
        subnet_ids = [var.subnet_id_private]
    }

    tags = merge(
        var.tags,
        {
            Name = var.eks_data.cluster_name
        }
    )

    depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy]
}

resource "aws_eks_node_group" "eks_nodes" {
    cluster_name    = aws_eks_cluster.eks.name
    node_role_arn   = aws_iam_role.eks_node_group_role.arn
    subnet_ids      = [var.subnet_id_private]
    instance_types  = var.eks_data.node_group.instance_types

    scaling_config {
        desired_size = var.eks_data.node_group.desired_size
        max_size     = var.eks_data.node_group.max_size
        min_size     = var.eks_data.node_group.min_size
    }

    tags = merge(
        var.tags,
        {
            Name = "${var.eks_data.cluster_name}-node-group"
        }
    )
}

This defines the EKS control plane, which orchestrates Kubernetes workloads. You'll notice it depends on a specific EKS Cluster Policy attachment, which we will create in another file. Worker nodes are provisioned within private subnets and scale dynamically based on demand.

IAM Roles for Cluster & Nodes (iam.tf)

This file defines the IAM Roles and Policies used for cluster authentication and permissions:

########################
### EKS Cluster Role ###
########################
# Role
resource "aws_iam_role" "eks_cluster_role" {
    name               = "${var.eks_data.cluster_name}-cluster-role"
    assume_role_policy = data.aws_iam_policy_document.eks_cluster_trust_policy.json

    tags = merge(
        var.tags,
        {
            Name = "${var.eks_data.cluster_name}-cluster-role"
        }
    )
}

# Attachments
resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
    role       = aws_iam_role.eks_cluster_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}

###########################
### EKS Node Group Role ###
###########################
# Role
resource "aws_iam_role" "eks_node_group_role" {
    name               = "${var.eks_data.cluster_name}-node-group-role"
    assume_role_policy = data.aws_iam_policy_document.eks_node_group_trust_policy.json

    tags = merge(
        var.tags,
        {
            Name = "${var.eks_data.cluster_name}-node-group-role"
        }
    )
}

# Attachments
resource "aws_iam_role_policy_attachment" "eks_worker_policy" {
    role       = aws_iam_role.eks_node_group_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}

resource "aws_iam_role_policy_attachment" "eks_cni_policy" {
    role       = aws_iam_role.eks_node_group_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}

resource "aws_iam_role_policy_attachment" "ec2_container_registry_policy" {
    role       = aws_iam_role.eks_node_group_role.name
    policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}

For this small POC, we are using built-in AWS policies. However, in an enterprise deployment, you will likely need to create custom policies to ensure your cluster has the necessary permissions for AWS operations.
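
For illustration, a custom policy and attachment for the node group might look like the sketch below. This is not part of the POC; the bucket ARN and resource names are placeholders:

# Hypothetical custom policy granting the worker nodes access to a specific bucket
data "aws_iam_policy_document" "node_custom" {
    statement {
        effect    = "Allow"
        actions   = ["s3:GetObject", "s3:PutObject"]
        resources = ["arn:aws:s3:::example-app-bucket/*"] # placeholder ARN
    }
}

resource "aws_iam_policy" "node_custom" {
    name   = "${var.eks_data.cluster_name}-node-custom-policy"
    policy = data.aws_iam_policy_document.node_custom.json
}

resource "aws_iam_role_policy_attachment" "node_custom" {
    role       = aws_iam_role.eks_node_group_role.name
    policy_arn = aws_iam_policy.node_custom.arn
}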

Trust Policies for the EKS Cluster and Node Group IAM Roles (data.tf)

This file defines the trust policies for the Cluster and Node Group IAM roles:

# EKS Cluster Role Trust Policy
data "aws_iam_policy_document" "eks_cluster_trust_policy" {
    version = "2012-10-17"

    statement {
        actions = ["sts:AssumeRole"]
        effect  = "Allow"
        principals {
            type        = "Service"
            identifiers = ["eks.amazonaws.com"]
        }
    }
}

# EKS Node Group Trust Policy
data "aws_iam_policy_document" "eks_node_group_trust_policy" {
    version = "2012-10-17"

    statement {
        actions = ["sts:AssumeRole"]
        effect  = "Allow"
        principals {
            type        = "Service"
            identifiers = ["ec2.amazonaws.com"]
        }
    }
}

These trust policies define which principals are authorized to assume the roles specified in the iam.tf file. You'll also notice that these resources use the aws_iam_policy_document data block. This is a personal preference, as I find it provides cleaner syntax and is easier to manage compared to inline JSON.
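
For comparison, here is roughly what the cluster trust policy would look like written inline with jsonencode() instead of a data block. This is a sketch for illustration only and is not part of the module:

# Hypothetical inline alternative to the aws_iam_policy_document data source
resource "aws_iam_role" "eks_cluster_role_inline" {
    name = "${var.eks_data.cluster_name}-cluster-role-inline" # illustrative name only

    assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
            Action    = "sts:AssumeRole"
            Effect    = "Allow"
            Principal = { Service = "eks.amazonaws.com" }
        }]
    })
}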

Load Balancers for Ingress Traffic (ingress.tf)

This file defines the ingress load balancers that direct and filter traffic to the cluster.

#############################
### Network Load Balancer ###
#############################
# NLB
resource "aws_lb" "nlb" {
    name               = "${var.eks_data.cluster_name}-nlb"
    internal           = false
    load_balancer_type = "network"
    subnets            = [var.subnet_id_public]
}

# Listener
resource "aws_lb_listener" "nlb_listener" {
    load_balancer_arn = aws_lb.nlb.arn
    port              = 80
    protocol          = "TCP"

    default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.nlb_target.arn
    }
}

# Target Group
resource "aws_lb_target_group" "nlb_target" {
    name     = "${var.eks_data.cluster_name}-nlb-tg"
    port     = 80
    protocol = "TCP"
    vpc_id   = var.vpc_id
}

# Send traffic from NLB to ALB
resource "aws_lb_target_group_attachment" "nlb_to_alb" {
    target_group_arn = aws_lb_target_group.nlb_target.arn
    target_id        = aws_lb.alb.id  # NLB sends traffic to the ALB
    port            = 80
}


#################################
### Application Load Balancer ###
#################################
# ALB 
resource "aws_lb" "alb" {
    name               = "${var.eks_data.cluster_name}-alb"
    internal           = true
    load_balancer_type = "application"
    subnets            = [var.subnet_id_private]
}

# Listener
resource "aws_lb_listener" "alb_listener" {
    load_balancer_arn = aws_lb.alb.arn
    port              = 80
    protocol          = "HTTP"

    default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.alb_target.arn
    }
}

# Target Group
resource "aws_lb_target_group" "alb_target" {
    name     = "${var.eks_data.cluster_name}-alb-tg"
    port     = 80
    protocol = "HTTP"
    vpc_id   = var.vpc_id
}

# Send traffic from ALB to EKS node group
# Note: in practice, node targets are usually registered dynamically (for example, by the
# AWS Load Balancer Controller or the node group's Auto Scaling Group) rather than by
# referencing the node group ID directly; this static attachment is a POC simplification.
resource "aws_lb_target_group_attachment" "alb_to_eks" {
    target_group_arn = aws_lb_target_group.alb_target.arn
    target_id        = aws_eks_node_group.eks_nodes.id  # ALB sends traffic to the node group
    port             = 80
}

The NLB (in the public subnet) and ALB (in the private subnet), along with their supporting resources, manage traffic flow to and from the EKS cluster. In this small POC, we are only utilizing port 80. However, in an enterprise deployment, you would also use port 443 and any additional custom ports required by your configuration.
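
For reference, an HTTPS listener in such a deployment might look like the sketch below. The certificate ARN is a placeholder and would come from ACM in a real environment:

# Hypothetical HTTPS listener for the internal ALB
resource "aws_lb_listener" "alb_listener_https" {
    load_balancer_arn = aws_lb.alb.arn
    port              = 443
    protocol          = "HTTPS"
    ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
    certificate_arn   = "arn:aws:acm:us-east-2:123456789012:certificate/example" # placeholder

    default_action {
        type             = "forward"
        target_group_arn = aws_lb_target_group.alb_target.arn
    }
}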

Why This Matters?

  • NLB handles direct traffic from external sources (Layer 4 TCP).

  • ALB manages application-layer traffic (Layer 7 HTTP/S) within private subnets.

Security Group Rules (sg.tf)

This file adds a rule to the cluster security group for the above ingress port:

# Add security group rule to EKS Cluster SG
resource "aws_vpc_security_group_ingress_rule" "port_80" {
    security_group_id = aws_eks_cluster.eks.vpc_config[0].cluster_security_group_id
    description       = "Ingress for internal IP space"

    cidr_ipv4   = "10.0.0.0/8"  # Org internal cidr range
    from_port   = 80
    ip_protocol = "tcp"
    to_port     = 80
}

By default, AWS automatically provisions an EKS security group for cluster communication. Here, we add a rule to that security group to allow ingress traffic on port 80 from only our Org’s internal IPs.

Outputs (outputs.tf)

Expose required values for other modules to consume:

output "eks_endpoint" {
    description = "The EKS cluster endpoint."
    value       = aws_eks_cluster.eks.endpoint
}

output "eks_node_group_role_arn" {
    description = "The EKS node group role arn."
    value       = aws_iam_role.eks_node_group_role.arn
}

These outputs allow other modules (e.g., Storage) to reference the compute resources.

Defining Terraform Behavior (terraform.tf)

terraform {
   ...
}

This file specifies the Terraform settings for this module, including the required CLI version and providers. While these configurations can differ across modules in more complex architectures, they remain unchanged from the tagging module for this small POC, so I won’t reproduce the code here.

Additionally, for this minimal EKS setup, we are not performing any internal cluster configurations. In an enterprise-level deployment, cluster bootstrapping would be necessary, requiring the Kubernetes provider to be included in this file. We will explore that in a future blog.

🟢 How Data Flows Through the Module

  1. Users define the cluster name and node group configurations.

  2. Outputs from the Network module are automatically passed into the compute module.

  3. The compute module provisions the EKS control plane and worker nodes.

  4. IAM roles ensure secure interactions between AWS and Kubernetes.

  5. Outputs provide references for the storage module.

🟢 What’s Next?

Now that we’ve built and structured the compute module, we’ll:

  • Move on to integrating the compute module with our storage module.

Stay tuned as we take modular Terraform to the next level!


📦 Setting Up the Storage Module

What We’re Building

Storage is a critical part of any cloud infrastructure. Whether storing application data, logs, or Terraform state, a well-structured storage module ensures:

  • Scalability – Storage expands as needed without requiring infrastructure changes.

  • Security – IAM policies enforce strict access controls to protect data.

  • Flexibility – Environment-specific configurations adapt storage for different use cases.

Key Components of This Module

  • S3 Bucket – Serves as the primary storage for application data and logs.

  • Bucket Policies – Defines fine-grained access controls to manage security and permissions.

🟢 Module Structure

/storage
  ├── data.tf      # Queries caller identity, defines bucket policy
  ├── outputs.tf   # Outputs bucket name
  ├── s3.tf        # Creates s3 bucket and supporting resources
  ├── terraform.tf # Configures terraform behavior and required providers
  ├── variables.tf # Defines the module inputs

🟢 Terraform Code Walkthrough

Now, let’s progressively introduce the Terraform code, showing key inputs, resources, and outputs.

Inputs (variables.tf)

This file defines what the module expects:

### S3 Bucket Configuration ###
variable "s3_bucket_name" {
    description = "The name of the s3 bucket."
    type        = string
}

### EKS Node Group Arn ###
variable "eks_node_group_role_arn" {
    description = "The EKS node group role arn."
    type        = string
}

### Tags ###
variable "tags" {
    description = "A map of tags passed in from the tagging module. Typing is enforced at the tagging module layer."
    type        = any
}

Inputs (data.tf)

This data block queries the current AWS account number and will be used to ensure the bucket name is globally unique:

data "aws_caller_identity" "current" {}

This module organizes input data across two distinct files, ensuring a clear separation of duties while minimizing required user input. For this POC, the inputs remain relatively simple. However, in an enterprise-level deployment, S3 bucket configurations can become significantly more complex. In such cases, using a complex variable type—similar to the approach in the compute module—provides a more scalable and maintainable solution than defining each attribute individually.
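
For reference, such a grouped variable might look like the sketch below. This is not part of the POC; the attribute names and defaults are illustrative:

variable "s3_data" {
    description = "A hypothetical object grouping S3 bucket configurations."
    type = object({
        bucket_name    = string
        sse_algorithm  = optional(string, "AES256")
        versioning     = optional(bool, true)
        lifecycle_days = optional(number, 90)
    })
}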

Storage Resources (s3.tf)

This file creates the S3 bucket and several supporting configurations:

# S3 Bucket
resource "aws_s3_bucket" "bucket" {
    bucket = "${data.aws_caller_identity.current.account_id}-${var.s3_bucket_name}"

    tags = merge(
        var.tags,
        {
            Name = "${data.aws_caller_identity.current.account_id}-${var.s3_bucket_name}"
        }
    )
}

# Server Side Encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "config" {
    bucket = aws_s3_bucket.bucket.id

    rule {
        apply_server_side_encryption_by_default {
            sse_algorithm = "AES256"
        }
    }
}

# Block Public Access
resource "aws_s3_bucket_public_access_block" "access" {
    bucket = aws_s3_bucket.bucket.id

    block_public_acls       = true
    block_public_policy     = true
    ignore_public_acls      = true
    restrict_public_buckets = true
}

# Bucket ACL
resource "aws_s3_bucket_acl" "example" {
    bucket = aws_s3_bucket.bucket.id
    acl    = "private"
}

# Versioning
resource "aws_s3_bucket_versioning" "version" {
    bucket = aws_s3_bucket.bucket.id

    versioning_configuration {
        status = "Enabled"
    }
}

# Attach Bucket Policy
resource "aws_s3_bucket_policy" "policy" {
    bucket = aws_s3_bucket.bucket.id
    policy = data.aws_iam_policy_document.bucket_policy.json
}

These minimal bucket configurations provide storage for logs, artifacts, or application data for the EKS cluster. In an enterprise-level implementation, additional configurations would likely be required to meet security, compliance, and performance needs.
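
As one example, a lifecycle rule could expire old log objects automatically. This is a sketch only; the prefix and retention period are placeholders:

resource "aws_s3_bucket_lifecycle_configuration" "lifecycle" {
    bucket = aws_s3_bucket.bucket.id

    rule {
        id     = "expire-old-logs"
        status = "Enabled"

        filter {
            prefix = "logs/" # placeholder prefix
        }

        expiration {
            days = 90
        }
    }
}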

Configure Bucket Policies (data.tf)

The other block defined in this file is the bucket policy document authorizing read/write actions for the EKS cluster:

# Bucket Policy
data "aws_iam_policy_document" "bucket_policy" {
    version = "2012-10-17"  

    statement {
        sid = "EKSReadWrite"
        effect = "Allow"

        principals {
            type = "AWS"
            identifiers = [var.eks_node_group_role_arn]
        }

        actions = [
            "s3:GetObject",
            "s3:PutObject"
        ]

        resources = [aws_s3_bucket.bucket.arn, "${aws_s3_bucket.bucket.arn}/*"]
    }
}

This bucket policy ensures that EKS worker nodes can read and write to the bucket while restricting access to only the authorized AWS service. In an enterprise-level implementation, additional permission statements would likely be required to meet security, performance, and business objectives.
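
A common example is a statement that denies any request made over plain HTTP. The sketch below would be added as an additional statement block inside the bucket_policy document above:

statement {
    sid    = "DenyInsecureTransport"
    effect = "Deny"

    principals {
        type        = "*"
        identifiers = ["*"]
    }

    actions   = ["s3:*"]
    resources = [aws_s3_bucket.bucket.arn, "${aws_s3_bucket.bucket.arn}/*"]

    condition {
        test     = "Bool"
        variable = "aws:SecureTransport"
        values   = ["false"]
    }
}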

Outputs (outputs.tf)

This file exposes important values for consumption:

output "s3_bucket_name" {
    description = "The name of the s3 bucket."
    value       = aws_s3_bucket.bucket.id
}

Other modules in future development (e.g., observability, application services) can reference this output for storage access.

Defining Terraform Behavior (terraform.tf)

terraform {
   ...
}

This file specifies the Terraform settings for this module, including the required CLI version and providers. While these configurations can differ across modules in more complex architectures, they remain unchanged from the tagging module for this small POC, so I won’t reproduce the code here.

🟢 How Data Flows Through the Module

  1. Users define the bucket name.

  2. The module creates an S3 bucket and applies a bucket policy.

  3. Outputs provide references for future modules to consume.

🟢 What’s Next?

Now that we’ve built and structured all of our modules, we’ll:

  • Move on to defining environment specific configurations and wiring the modules together.


🌲 Environment-Specific Configurations (`tfvars` Directory)

Managing infrastructure across multiple environments requires flexibility. Hardcoding values in Terraform files leads to inflexibility and duplication, making it difficult to scale.

The tfvars directory solves this by dynamically passing environment-specific variables into Terraform configurations. This ensures clean, reusable, and adaptable infrastructure.

🟢 Directory Structure

/tfvars
├── dev-us-east-2.tfvars  # Configuration for the development environment
├── prod-us-east-2.tfvars # Configuration for the production environment

Each file contains key-value pairs specific to its environment, allowing Terraform to deploy the same infrastructure with different configurations.

Example: dev-us-east-2.tfvars

# Account Variables
target_aws_account = "123456789012"  # Place holder value, not a real account

# Environment Variables
region      = "us-east-2"
environment = "Dev"

# Network Variables
vpc_cidr = "10.0.0.0/22"

Example: prod-us-east-2.tfvars

# Account Variables
target_aws_account = "210987654321"  # Place holder value, not a real account

# Environment Variables
region      = "us-east-2"
environment = "Prod"

# Network Variables
vpc_cidr = "10.10.0.0/22"

# Compute Variables
eks_instance_types = ["t3.medium"]
eks_desired_size   = 2
eks_max_size       = 3
eks_min_size       = 2

🟢 How These Variables Are Used in Terraform

To apply Terraform configurations with environment-specific settings, we use the -var-file flag:

terraform apply -var-file=./tfvars/dev-us-east-2.tfvars
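
Since the backend configuration shown later uses a workspace_key_prefix, a per-environment workflow might pair each tfvars file with a Terraform workspace. A sketch, assuming workspaces named dev and prod:

terraform workspace select dev || terraform workspace new dev
terraform plan  -var-file=./tfvars/dev-us-east-2.tfvars
terraform apply -var-file=./tfvars/dev-us-east-2.tfvars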

This approach keeps our core Terraform code unchanged, ensuring:

  • Consistency – Infrastructure follows the same structure across environments.

  • Simplified Maintenance – Externalized configurations reduce duplication and manual updates.

  • Scalability – Adding new environments is seamless without modifying Terraform modules.

🟢 What’s Next?

Now that we’ve defined our environment-specific variables, the next step is to:

  • Configure our Root files to orchestrate module interactions.

  • Validate our full configuration to ensure everything is working as expected.


🎬 Root Files Overview

Orchestrating Infrastructure: Understanding the Root Files

At the heart of every Terraform project is a set of core configuration files that define how infrastructure is deployed, managed, and interconnected. These root files serve as the entry point, orchestrating module execution, variable management, and provider configurations.

Think of the root directory as Terraform’s command center—it doesn’t create resources directly but instead coordinates modules, inputs, and outputs to build a scalable and maintainable infrastructure.

🟢 Root Directory Structure

Each of the following files plays a crucial role in ensuring consistency, reusability, and automation across environments.

/poc
  ├── locals.tf    # Logic for grouping variables for module consumption
  ├── main.tf      # Root file for initializing infrastructure
  ├── outputs.tf   # Infrastructure component outputs
  ├── provider.tf  # Configures provider settings to be used throughout the configuration
  ├── terraform.tf # Configures terraform behavior, required providers, and backend configurations
  ├── variables.tf # Global variables shared across environments

🟢 Root Files Breakdown

Global Variables File (variables.tf)

This file defines global input variables:

### Account Variables ###
variable "target_aws_account" {
    description = "Target AWS account for deployment."
    type        = string
}

### GitLab Variables ###
# Imported into the configuration from GitLab environment variables during the CI pipeline
variable "source_gitlab_project" {
    description = "The source gitlab structure project hosting the terraform code to deploy this resource."
    type        = string
    default     = "POC"
}

### Environment Variables ###
variable "region" {
    description = "The AWS region resources will be deployed in."
    type        = string
    validation {
        condition     = contains(["us-east-2", "us-west-2"], var.region)
        error_message = "Variable var.region is not an allowed region. Must be one of [us-east-2, us-west-2]"
    }
}

variable "environment" {
    description = "The AWS region resources will be deployed in."
    type        = string
    validation {
        condition     = contains(["Dev", "QA", "Stg", "Prod"], var.environment)
        error_message = "Variable var.environment is not a valid environment name. Must be one of [Dev, QA, Stg, Prod]"
    }
}

### Project Variables ###
variable "project" {
    description = "The project this resource belongs to."
    type = string
    default = "dev-lab"
}

### Network Variables ###
variable "vpc_cidr" {
    description = "The cidr block for the VPC."
    type        = string
}

### Compute Variables ###
variable "eks_instance_types" {
    description = "The instance types for the EKS node group worker nodes."
    type        = list(string)
    default     = ["t3.micro"]
}

variable "eks_desired_size" {
    description = "The desired number of worker nodes."
    type        = number
    default     = 1
}

variable "eks_max_size" {
    description = "The maximum number of worker nodes."
    type        = number
    default     = 1
}

variable "eks_min_size" {
    description = "The minimum number of worker nodes."
    type        = number
    default     = 1
}

This file serves as a central location for defining variables that are passed to each module. Variables without default values must be specified in the environment-specific tfvars files; otherwise, the configuration will fail due to missing required values.

Variables with default values can either be overridden in the tfvars files or left undefined, in which case the default values will be inherited and passed to the modules. Additionally, conditions have been applied to the environment and region variables to ensure that only pre-authorized values are accepted.

Logical Variable Grouping (locals.tf)

This file is used to group variables for simpler and more efficient module consumption:

locals {
    eks_data = {
        cluster_name = "${var.project}-${lower(var.environment)}-eks"

        node_group = {
            instance_types = var.eks_instance_types
            desired_size   = var.eks_desired_size
            max_size       = var.eks_max_size
            min_size       = var.eks_min_size
        }
    }
}

Instead of passing each compute variable individually, we group them into a single object and pass it to the compute module. This approach also allows us to apply additional logic to generate values that are composites of multiple variables.

Terraform Entry Point (main.tf)

This file is responsible for initializing and orchestrating the Terraform modules:

# Tagging Module
module "tags" {
    source = "./modules/tagging"

    contact               = "jared@infra-insider.com"
    environment           = var.environment
    project               = var.project
    source_gitlab_project = var.source_gitlab_project
}

# Network Module
module "network" {
    source = "./modules/network"

    vpc_cidr = var.vpc_cidr

    tags = module.tags.tags
}

# Compute Module
module "compute" {
    source = "./modules/compute"

    eks_data          = local.eks_data
    vpc_id            = module.network.vpc_id
    subnet_id_public  = module.network.subnet_id_public
    subnet_id_private = module.network.subnet_id_private

    tags = module.tags.tags
}

# Storage Module
module "storage" {
    source = "./modules/storage"

    s3_bucket_name          = "${var.project}-${var.region}-${lower(var.environment)}-eks-bucket"
    eks_node_group_role_arn = module.compute.eks_node_group_role_arn

    tags = module.tags.tags
}

This file orchestrates the module calls, demonstrating how outputs from one module serve as inputs for the next in the infrastructure stack. It automates module execution, manages dependencies, ensures a structured deployment process, and facilitates seamless data passing between modules.

Resource Outputs (outputs.tf)

This file outputs key infrastructure components, making them available for use by other processes within the organization:

output "vpc_id" {
    description = "The id for the vpc."
    value       = module.network.vpc_id
}

output "eks_endpoint" {
    description = "The EKS cluster endpoint."
    value       = module.compute.eks_endpoint
}

output "s3_bucket_name" {
    description = "The name of the s3 bucket."
    value       = module.storage.s3_bucket_name
}

The VPC ID, EKS endpoint, and S3 bucket name are all values exposed for consumption across the organization.

Provider Configuration (provider.tf)

This file defines the cloud provider settings:

provider "aws" {
    region = var.region
}

The provider configurations handle authentication and manage provider-specific settings. In enterprise-level implementations spanning multiple regions, services (Kubernetes, Helm), and even multiple clouds, this file includes multiple provider definitions along with their respective configurations and authentications.
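
For reference, a multi-region setup might add an aliased provider and pass it explicitly to a module call. A sketch, with placeholder region and CIDR values:

# Hypothetical second provider for another region
provider "aws" {
    alias  = "use1"
    region = "us-east-1"
}

# Module calls can then target that provider explicitly
module "network_use1" {
    source = "./modules/network"

    vpc_cidr = "10.1.0.0/22" # placeholder CIDR
    tags     = module.tags.tags

    providers = {
        aws = aws.use1
    }
}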

Defining Terraform Behavior (terraform.tf)

This file defines Terraform’s behavior within this repository, including the required CLI version, providers, and backend configuration:

terraform {
    required_providers {
        aws = {
            source  = "hashicorp/aws"
            version = "~> 5.0"
        }
    }

    required_version = ">= 1.10.5"

    backend "s3" {
        region               = "us-east-2" 
        bucket               = "my-tfstate-bucket" 
        key                  = "poc" 
        workspace_key_prefix = "environments"
        dynamodb_table       = "my-tfstate-table"
        encrypt              = true
        kms_key_id           = "my-tfstate-kms-key-id"
    }
}

In this simple POC, the backend values are hardcoded. However, in an enterprise-level implementation with multiple state files and remote backends segregated by organizational OUs, these values must be dynamically interpolated. There are multiple ways to achieve this, which we will explore in a future blog.
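
For reference, one common option is Terraform's partial backend configuration: the backend block is left mostly empty and the environment-specific values are supplied at init time. A sketch, with placeholder file and bucket names:

# environments/dev.s3.tfbackend (hypothetical file)
region               = "us-east-2"
bucket               = "my-dev-tfstate-bucket"
key                  = "poc"
workspace_key_prefix = "environments"
dynamodb_table       = "my-dev-tfstate-table"
encrypt              = true

terraform init -backend-config=environments/dev.s3.tfbackend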

🟢 What’s Next?

With our root files structured and modules orchestrated, our Terraform configuration is now modular, reusable, and ready for deployment. These foundational files ensure consistency, scalability, and maintainability across environments—without unnecessary complexity.

Now, it’s time to put our setup to the test. Next, we’ll:

  • Validate our infrastructure to ensure everything is correctly configured.

  • Run terraform plan to preview the changes Terraform will make.

  • Prepare for deployment by verifying dependencies across modules.

We’re at the finish line—let’s make sure everything is solid before deploying to the cloud!


🚀 Wrapping Up: What We’ve Built & What’s Next

We’ve just laid the groundwork for scalable, modular Terraform infrastructure—turning complex cloud deployments into clean, reusable, and efficient code. Along the way, we’ve:

  • Defined a standard tagging module to keep infrastructure organized.

  • Built a robust networking layer with a VPC, subnets, and secure routing.

  • Provisioned an EKS cluster and node groups for scalable compute power.

  • Configured an S3 storage solution with secure access policies.

  • Implemented environment-specific configurations to make deployments flexible.

  • Structured the root files to orchestrate infrastructure across modules.

At this point, our Terraform project is structured, modular, and ready to scale!

🟢 Validating Our Configuration

Before we deploy, we need to validate that everything works as expected. Running terraform validate ensures our syntax is correct, while terraform plan previews the exact changes Terraform will make in AWS:

terraform validate
terraform plan -var-file=./tfvars/dev-us-east-2.tfvars

For now, these steps are manual, but what if Terraform could run these checks automatically every time we push code? As any good DevOps engineer knows—infrastructure that isn’t automated is infrastructure that can break.

🟢 What’s Next? Automating Terraform Deployments

While manually running Terraform works, it doesn’t scale. In our next post, we’ll take Terraform automation to the next level by introducing GitLab CI/CD for infrastructure as code.

➡️ Coming Up Next:

  • Setting up a GitLab CI/CD pipeline to manage Terraform deployments.

  • Automating validation (terraform validate and terraform plan).

  • Introducing remote state management for team collaboration.

But we’re not stopping there! In future blogs, we’ll:

  • Expand this POC into an enterprise-ready infrastructure, covering best practices for scalability and governance.

  • Deep dive into individual modules to optimize networking, security, and compute efficiency.

  • Introduce new modules to enhance our cloud architecture.

🎥 Want a walkthrough of everything we covered? I’ll be releasing a YouTube video soon, where I’ll break down this blog step by step. Stay tuned!

📂 You can view the full code repository on GitLab!

We’ve laid the foundation—now it’s time to bring automation and scalability into the mix. Stay tuned as we take modular Terraform to the next level with continuous delivery! 🚀