Getting Started Deploying an AWS ECS Cluster With Terraform

Hetakshi Patil
9 min read · Jun 7, 2021

One of my 2021 goals was to become more familiar with creating Terraform modules for Amazon Web Services (AWS). And I am happy to announce that I have achieved it: I have cleared the HashiCorp Certified: Terraform Associate certification.

In this article I will show you how to create an AWS Elastic Container Service (ECS) cluster using Terraform as an Infrastructure as Code tool.

Today’s demo will utilize two major cloud computing tools.

Terraform is an infrastructure orchestration tool, also known as “infrastructure as code” (IaC). Using Terraform, you declare every single piece of your infrastructure once, in static files, allowing you to deploy and destroy cloud infrastructure easily, make incremental changes, perform rollbacks, version the infrastructure, etc.

Amazon created an innovative solution for deploying and managing a fleet of virtual machines — AWS ECS. Under the hood, ECS utilizes AWS’s well-known concept of EC2 virtual machines, as well as CloudWatch for monitoring them, auto-scaling groups for provisioning and deprovisioning machines depending on the current load of the cluster, and most importantly — Docker as a containerization engine.

Below is the architecture that we will design.

We will create a VPC (Virtual Private Cloud) which will contain an autoscaling group with EC2 instances. ECS (Amazon Elastic Container Service) will manage the tasks that run on those EC2 instances, based on Docker images stored in ECR (Elastic Container Registry).

Each EC2 instance will serve as a host for a worker that writes something to RDS MySQL. The EC2 and MySQL instances will be placed in different security groups.

Here is a list of all the AWS services that will be our building blocks:

  • VPC with a public subnet as an isolated pool for my resources
  • Internet Gateway to contact the outer world
  • Security groups for RDS MySQL and for EC2s
  • Auto-scaling group for ECS cluster with launch configuration
  • RDS MySQL instance
  • ECR container registry
  • ECS cluster with task and service definition

Terraform Installation

To start with Terraform, we need to install it. Just follow the steps in this document: https://www.terraform.io/downloads.html

Verify the installation by typing:

$ terraform --version
Terraform v0.13.4

With Terraform (version 0.13.4 here) we can provision cloud architecture by writing code in a configuration language. In this case it’s going to be HCL, the HashiCorp Configuration Language.

Terraform State

Before writing the first line of our code, let’s focus on understanding what the Terraform state is.

The state is a kind of snapshot of the architecture. Terraform needs to know what was provisioned, which resources were created, and how to track the changes.

All that information is written either to a local file, terraform.tfstate, or to a remote location. Generally the code is shared between members of a team, so keeping a local state file is never a good idea. We want to keep the state in a remote destination. When working with AWS, this destination is S3.

This is the first thing that we need to code — tell Terraform that the state location will be remote and kept in S3 (terraform.tf):
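The original embedded snippet isn’t reproduced here; below is a minimal sketch of terraform.tf assuming only the bucket and key names mentioned in this article (the region comes from the AWS_DEFAULT_REGION environment variable described below):

terraform {
  backend "s3" {
    bucket = "terraform-s3-state-bucket"
    key    = "state.tfstate"
  }
}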

This code will allow Terraform to store the state file in an S3 bucket called “terraform-s3-state-bucket”.

Terraform will keep the state in the S3 bucket under a state.tfstate key. For that to happen we need to set three environment variables:

$ export AWS_SECRET_ACCESS_KEY=...
$ export AWS_ACCESS_KEY_ID=..
$ export AWS_DEFAULT_REGION=...

These credentials can be found/created in the AWS IAM Management Console, in the “My security credentials” section. Both the access keys and the region must be stored in environment variables if we want to keep the remote state.

Virtual Private Cloud

vpc.tf
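A minimal sketch of what vpc.tf could contain; the 10.0.0.0/16 CIDR range is my assumption, not from the original:

provider "aws" {}

# Logically isolated virtual network for all of our resources
resource "aws_vpc" "vpc" {
  cidr_block           = "10.0.0.0/16"   # primary CIDR block, the only required parameter
  enable_dns_support   = true            # needed for a publicly accessible database
  enable_dns_hostnames = true
}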

Terraform needs to know which API it should interact with. Here we say it’ll be AWS. A list of available providers can be found here: https://www.terraform.io/docs/providers/index.html

The provider section has no parameters because we’ve already provided the credentials needed to communicate with the AWS API as environment variables, in order to have a remote Terraform state.

The resource block of type aws_vpc with name vpc creates a Virtual Private Cloud — a logically isolated virtual network. When creating a VPC we must provide a range of IPv4 addresses. It’s the primary CIDR block for the VPC, and it is the only required parameter.

The parameters enable_dns_support and enable_dns_hostnames are required if we want to provision a publicly accessible database in our VPC.

Internet Gateway

ig.tf
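A sketch of ig.tf, following the aws_vpc.vpc.id reference described below:

# Internet Gateway so instances in the VPC can reach the internet
resource "aws_internet_gateway" "internet_gateway" {
  vpc_id = aws_vpc.vpc.id
}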

In order to allow communication between instances in our VPC and the internet, we need to create an Internet Gateway.

The only required parameter is the id of the previously created VPC, which can be obtained by referencing aws_vpc.vpc.id. This is the Terraform way to get to resource details: resource_type.resource_name.resource_parameter.

Subnet

Within the VPC let’s add public subnets:

subnet.tf
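A sketch of subnet.tf. The CIDR blocks and availability zones are my assumptions; two subnets are defined because the route table and the RDS DB subnet group later in this article expect them:

resource "aws_subnet" "public_subnet_a" {
  vpc_id            = aws_vpc.vpc.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"   # optional
}

resource "aws_subnet" "public_subnet_b" {
  vpc_id            = aws_vpc.vpc.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "us-east-1b"   # optional
}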

To create a subnet we need to provide the VPC id and a CIDR block. Additionally we can specify an availability zone, but it’s not required. We create two subnets in different availability zones because the RDS DB subnet group we define later requires them.

Route Table

A route table allows us to set up rules that determine where network traffic from our subnets is directed. Let’s create a new, custom one, just to show how it can be used and associated with subnets.

routetable.tf
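A sketch of routetable.tf, reusing the subnet names from the sketch above:

# Route all outbound traffic to the Internet Gateway
resource "aws_route_table" "route_table" {
  vpc_id = aws_vpc.vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.internet_gateway.id
  }
}

resource "aws_route_table_association" "subnet_a" {
  subnet_id      = aws_subnet.public_subnet_a.id
  route_table_id = aws_route_table.route_table.id
}

resource "aws_route_table_association" "subnet_b" {
  subnet_id      = aws_subnet.public_subnet_b.id
  route_table_id = aws_route_table.route_table.id
}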

What we did is create a route table for our VPC that directs all traffic (0.0.0.0/0) to the Internet Gateway, and associate this route table with both subnets. Each subnet in a VPC has to be associated with a route table.

Security Groups

Security groups work like firewalls for the instances (whereas a network ACL works like a global firewall for the VPC). Because we allow all traffic from the internet to and from the VPC, we should set some rules to secure the instances themselves.

We will have two kinds of instances in our VPC — the cluster of EC2s and RDS MySQL — therefore we need to create two security groups.

sg.tf
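A sketch of sg.tf covering the ports described below; the resource names are my own:

# Security group for the EC2 instances in the ECS cluster
resource "aws_security_group" "ecs_sg" {
  vpc_id = aws_vpc.vpc.id

  ingress {                      # SSH
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {                      # HTTPS, needed to pull the Docker image from ECR
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Security group for RDS MySQL
resource "aws_security_group" "rds_sg" {
  vpc_id = aws_vpc.vpc.id

  ingress {                      # MySQL, also reachable from the ECS security group
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
    security_groups = [aws_security_group.ecs_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}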

The first security group is for the EC2 instances that will live in the ECS cluster. Inbound traffic is narrowed to two ports: 22 for SSH and 443 for HTTPS, needed to download the Docker image from ECR.

The second security group is for RDS and opens just one port, the default port for MySQL — 3306. Inbound traffic is also allowed from the ECS security group, which means that the application living on EC2 in the cluster will have permission to use MySQL.

Inbound traffic is allowed from anywhere on the internet (CIDR block 0.0.0.0/0). In a real-life case there should be limitations, for example to the IP ranges of a specific VPN.

This ends the setup of the networking part of our architecture. Now it’s time for the autoscaling group for the EC2 instances in the ECS cluster.

Autoscaling Group

An autoscaling group is a collection of EC2 instances. The number of those instances is determined by scaling policies. We will create the autoscaling group using a launch configuration.

Before we launch container instances and register them into a cluster, we have to create an IAM role for those instances to use when they are launched:

iam.tf
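A sketch of iam.tf: a role that EC2 instances can assume, with the AWS-managed policy for the ECS container agent attached, wrapped in an instance profile:

data "aws_iam_policy_document" "ecs_agent" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "ecs_agent" {
  name               = "ecs-agent"
  assume_role_policy = data.aws_iam_policy_document.ecs_agent.json
}

resource "aws_iam_role_policy_attachment" "ecs_agent" {
  role       = aws_iam_role.ecs_agent.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role"
}

resource "aws_iam_instance_profile" "ecs_agent" {
  name = "ecs-agent"
  role = aws_iam_role.ecs_agent.name
}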

Having the IAM role, we can create the autoscaling group from the launch configuration:

autoscaling.tf
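A sketch of autoscaling.tf. The AMI id is a placeholder (it should be an ECS-optimized AMI for your region), the cluster name my-cluster is my assumption, and the sizing values are illustrative:

resource "aws_launch_configuration" "ecs_launch_config" {
  image_id             = "ami-0123456789abcdef0"   # placeholder: ECS-optimized AMI
  instance_type        = "t2.micro"
  iam_instance_profile = aws_iam_instance_profile.ecs_agent.name
  security_groups      = [aws_security_group.ecs_sg.id]

  # Register the instance into our named cluster instead of the default one
  user_data = <<EOF
#!/bin/bash
echo ECS_CLUSTER=my-cluster >> /etc/ecs/ecs.config
EOF
}

resource "aws_autoscaling_group" "ecs_asg" {
  name                 = "ecs-asg"
  vpc_zone_identifier  = [aws_subnet.public_subnet_a.id, aws_subnet.public_subnet_b.id]
  launch_configuration = aws_launch_configuration.ecs_launch_config.name

  desired_capacity = 2
  min_size         = 1
  max_size         = 3
}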

If we want to use a created, named ECS cluster, we have to put that information into user_data; otherwise our instances will be launched in the default cluster.

Basic scaling information is described by the aws_autoscaling_group parameters. An autoscaling policy has to be provided; we will do it later.

Having the autoscaling group set up, we are ready to launch our instances and database.

Database Instance

Having prepared the subnet and security group for RDS, we have one more thing to cover before launching the database instance. To provision a database we need to follow some rules:

  • Our VPC has to have enabled DNS hostnames and DNS resolution (we did that while creating VPC).
  • Our VPC has to have a DB subnet group (that is about to happen).
  • Our VPC has to have a security group that allows access to the DB instance.
db.tf
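A sketch of db.tf. The database name, username, and password variable are placeholders of my own:

# DB subnet group spanning two availability zones (an RDS requirement)
resource "aws_db_subnet_group" "db_subnet_group" {
  subnet_ids = [aws_subnet.public_subnet_a.id, aws_subnet.public_subnet_b.id]
}

resource "aws_db_instance" "mysql" {
  identifier             = "mysql"
  engine                 = "mysql"
  engine_version         = "5.7"
  instance_class         = "db.t2.micro"
  allocated_storage      = 5
  name                   = "worker_db"       # placeholder database name
  username               = "admin"           # placeholder
  password               = var.db_password   # placeholder variable
  db_subnet_group_name   = aws_db_subnet_group.db_subnet_group.id
  vpc_security_group_ids = [aws_security_group.rds_sg.id]
  publicly_accessible    = true
  skip_final_snapshot    = true
}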

All the parameters are more or less self-explanatory. If we want our database to be publicly accessible, we have to set the publicly_accessible parameter to true.

Elastic Container Service

ECS is a scalable container orchestration service that allows us to run and scale dockerized applications on AWS.

To launch such an application we need to download an image from some repository. For that we will use ECR. We can push images there and use them while launching EC2 instances within our cluster:

ecr.tf
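A sketch of ecr.tf; the repository name worker is my assumption:

resource "aws_ecr_repository" "worker" {
  name = "worker"
}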
ecs.tf
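A sketch of ecs.tf; the name must match the ECS_CLUSTER value set in user_data earlier:

resource "aws_ecs_cluster" "ecs_cluster" {
  name = "my-cluster"   # assumed cluster name, same as in the launch configuration
}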

The cluster name is important here, as we used it previously while defining the launch configuration. This is where newly created EC2 instances will live.

To launch a dockerized application we need to create a task — a set of simple instructions understood by the ECS cluster. The task is a JSON definition that can be kept in a separate file:

task_definition.json.tpl
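A sketch of the template, using the memory and CPU values mentioned below; the container name is my assumption:

[
  {
    "name": "worker",
    "image": "${repository_url}:latest",
    "essential": true,
    "memory": 512,
    "cpu": 2
  }
]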

In the JSON file we define which image will be used, via the repository_url template variable provided in a template_file data resource, tagged with latest. 512 MB of RAM and 2 CPU units are enough to run the application on EC2.

Having this prepared, we can create the Terraform resource for the task definition:

template_file.tf
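A sketch of template_file.tf that renders the template and registers the task definition:

# Render the JSON template with the ECR repository URL
data "template_file" "task_definition_template" {
  template = file("task_definition.json.tpl")

  vars = {
    repository_url = aws_ecr_repository.worker.repository_url
  }
}

resource "aws_ecs_task_definition" "task_definition" {
  family                = "worker"
  container_definitions = data.template_file.task_definition_template.rendered
}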

The family parameter is required and it represents the unique name of our task definition.

The last thing that will bind the cluster with the task is an ECS service. The service guarantees that some number of tasks is running at all times:

ecs_service.tf
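A sketch of ecs_service.tf; the desired_count of 2 is illustrative:

resource "aws_ecs_service" "worker" {
  name            = "worker"
  cluster         = aws_ecs_cluster.ecs_cluster.id
  task_definition = aws_ecs_task_definition.task_definition.arn
  desired_count   = 2
}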

This ends the Terraform description of the architecture.

There’s just one more thing left to code. We need to output the provisioned components in order to use them in the worker application.

We need to know URLs for:

  • ECR repository
  • MySQL host

Terraform provides an output block for that. We can print to the console any parameter of any provisioned component.

output.tf
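A sketch of output.tf covering the two URLs listed above (the output names are my own):

output "mysql_endpoint" {
  value = aws_db_instance.mysql.endpoint
}

output "ecr_repository_worker_endpoint" {
  value = aws_ecr_repository.worker.repository_url
}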

Applying the Changes

It’s now time to initialize our directory by typing terraform init.

This command initializes the directory containing the Terraform configuration. The initialization verifies the state backend and downloads modules, plugins, and providers.

Below is the result that I received after running terraform init.

No errors! We are now free to proceed!

We should now be able to run terraform apply to start executing the changes. Please note that this step will take a little while; it took 16 minutes to create the MySQL instance. Once it finishes, we can see that everything worked. We are also able to see the outputs.

Cleanup time

In order to save some money, we are going to destroy this lab.

Run terraform destroy; you should get the same result that I received below.

All set!

__________________________________________________

Hetakshi Patil

Platform Engineer | Quantiphi Inc. | US and India

http://www.quantiphi.com | Analytics is in our DNA

___________________________________________________
