Provision an AWS EKS Cluster with Terraform

In this post we’ll provision an AWS Elastic Kubernetes Service (EKS) cluster using Terraform. EKS is an upstream-compliant Kubernetes solution that is fully managed by AWS.

I have provided a sample Terraform script here. It will build a multi-AZ EKS cluster that looks like this:

Specifically, we’ll be launching the following AWS resources:

  • 1x new VPC for hosting the EKS cluster
  • 3x private subnets (across 3x different AZ) for the EKS worker nodes
  • 3x public subnets for hosting ELBs (mapped to EKS external Load Balancer services)
  • 1x NAT Gateway for Internet access and publishing external services
  • 2x Auto Scaling Groups for 2x EKS worker groups, with different EC2 instance types (each ASG is set to a desired capacity of 2x, so we’ll get a total of 4x worker nodes)
  • 2x Security Groups attached to the 2x ASGs for management access

PREREQUISITES

  • Access to an AWS testing environment
  • Install Git, Terraform & Kubectl on your client
  • Install AWS toolkits including AWS CLI, AWS-IAM-Authenticator
  • Check the NTP clock & sync status on your client (this is important!)
  • Clone the Terraform Repo
git clone https://github.com/sc13912/tf-aws-eks.git

Step-1: Set up the AWS credentials and run the Terraform script

[root@cloud-ops01 tf-aws-eks]# aws configure
AWS Access Key ID [*****]: 
AWS Secret Access Key [***]: 
Default region name [us-east-1]: 
Default output format [json]:

[root@cloud-ops01 tf-aws-eks]# terraform init
[root@cloud-ops01 tf-aws-eks]# terraform apply

The process will take about 10~15 mins and your Terraform output should look like this:

Register the cluster and update the kubeconfig file with the correct cluster name.

[root@cloud-ops01 tf-aws-eks]# aws eks --region us-east-1 update-kubeconfig --name demo-eks-zUqzVyxb
Added new context arn:aws:eks:us-east-1:979459205431:cluster/demo-eks-zUqzVyxb to /root/.kube/config

Step-2: Verify the EKS Cluster status

Verify that we can access the EKS cluster and can see the 4x worker nodes that have just been created.

[root@cloud-ops01 tf-aws-eks]# kubectl get nodes
NAME                         STATUS   ROLES    AGE   VERSION
ip-10-0-1-113.ec2.internal   Ready    <none>   43m   v1.16.8-eks-e16311
ip-10-0-1-40.ec2.internal    Ready    <none>   43m   v1.16.8-eks-e16311
ip-10-0-2-26.ec2.internal    Ready    <none>   43m   v1.16.8-eks-e16311
ip-10-0-3-23.ec2.internal    Ready    <none>   43m   v1.16.8-eks-e16311

Run kubectl describe nodes and we can see that each node has been tagged with a few customised labels based on its unique properties. These labels are important metadata that can be used for selective pod-to-node placement and other use cases such as affinity or anti-affinity rules.
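
As an example, labels like these can be consumed via a nodeSelector (or an affinity rule). The snippet below is only a sketch: the pod name and image are hypothetical, and the zone label shown is the standard one on Kubernetes 1.16 nodes; swap in one of the customised worker-group labels from kubectl describe nodes as needed.

# Sketch only: pin a pod to one AZ via the standard zone label.
# Pod name and image are hypothetical; replace the label key/value
# with any of the custom worker-group labels if preferred.
apiVersion: v1
kind: Pod
metadata:
  name: az-pinned-pod
spec:
  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: us-east-1a
  containers:
    - name: app
      image: nginx:1.19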

Now log in to the AWS console and navigate to EC2 -> Auto Scaling -> Auto Scaling Groups; you’ll find the two ASGs that have been provisioned by Terraform.

Now check the EC2 instances: we should have 2+2 worker nodes with the different instance types defined for each ASG, and they should be spread across all 3x AZs.

Step-3: Deploy Kubernetes Add-on Services

  • Install Metrics Server to provide cluster-wide resource metrics collection and to support use cases such as Horizontal Pod Autoscaling (HPA); a short HPA sketch follows the verification output below.
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml 

Wait for a few seconds and verify that we now have resource stats.

[root@cloud-ops01 tf-aws-eks]# kubectl top nodes
NAME                         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
ip-10-0-1-113.ec2.internal   88m          9%     417Mi           27%       
ip-10-0-1-40.ec2.internal    126m         6%     600Mi           17%       
ip-10-0-2-26.ec2.internal    360m         18%    760Mi           22%       
ip-10-0-3-23.ec2.internal    84m          8%     454Mi           30%       
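
With resource metrics available, the cluster can now support Horizontal Pod Autoscalers. Below is a minimal sketch; the target Deployment "web" is hypothetical and not part of this repo.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                      # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # add replicas above 60% average CPU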
  • Next, deploy an NGINX Ingress Controller so we can use L7 URL-based load balancing.
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.32.0/deploy/static/provider/aws/deploy.yaml

Verify that the Ingress pods and services are running.

[root@cloud-ops01 tf-aws-eks]# kubectl get pods -n ingress-nginx 
NAME                                        READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-2fvlb        0/1     Completed   0          103s
ingress-nginx-admission-patch-4tvnk         0/1     Completed   0          102s
ingress-nginx-controller-5cc4589cc8-7fr64   1/1     Running     0          117s
[root@cloud-ops01 tf-aws-eks]# 
[root@cloud-ops01 tf-aws-eks]# kubectl get svc -n ingress-nginx  
NAME                                 TYPE           CLUSTER-IP       EXTERNAL-IP                                                                     PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   172.20.114.166   aaa1d4619924247688fc4eeb4f85cd48-76f9a6f87fe42022.elb.us-east-1.amazonaws.com   80:31060/TCP,443:31431/TCP   2m2s
ingress-nginx-controller-admission   ClusterIP      172.20.3.211     <none>  
  • In addition, we’ll deploy some storage classes (with different I/O specifications) to provide the dynamic persistent storage required by stateful pods and services (a sketch of these classes follows the output below).
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f ./storage/storageclass/  
storageclass.storage.k8s.io/fast-50 created
storageclass.storage.k8s.io/standard created
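
The full manifests in ./storage/storageclass/ aren’t shown here, but on EKS they would typically look something like the sketch below (the EBS volume types and the IOPS-per-GB ratio are my assumptions based on the class names; check the repo for the exact parameters).

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs   # in-tree AWS EBS provisioner
parameters:
  type: gp2                          # general-purpose SSD
reclaimPolicy: Delete
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast-50
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1                          # provisioned-IOPS SSD (assumed)
  iopsPerGB: "50"                    # assumed from the "fast-50" name
reclaimPolicy: Delete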
  • Optionally, we can deploy the Kubernetes dashboard for some basic UI visibility.
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended.yaml  
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f ./kube-dashboard/  
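
The contents of ./kube-dashboard/ aren’t shown here; my assumption is that it follows the common pattern of an admin-user ServiceAccount in kube-system bound to cluster-admin (which is what the token step below relies on), plus a tweak to expose the dashboard service as a LoadBalancer. A sketch of the RBAC part:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user             # matches SA_NAME used below
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin          # full admin; fine for a lab, not for production
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kube-system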

Retrieve the dashboard token.

[root@cloud-ops01 tf-aws-eks]# SA_NAME=admin-user  
[root@cloud-ops01 tf-aws-eks]# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep ${SA_NAME} | awk '{print $1}')  
... 
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IlFsUzNqaW9iNFVsXy1BNlppdk9YZVVDZkFxMTJqeGMtSlA0LXN5QjZDdkkifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLTliODdiIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiJjNjkwZTk5Zi0zM2ViLTRlZjctYTA2Ny03MDVjMTE3ODI1NjUiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06YWRtaW4tdXNlciJ9.h1a_8pySJ7hebSci-mP8tPXmCY0vCQOCKzeDKICDMEE4Qlt-FGSwoBMEzTLLcA2-MUtDjkzbjJlFPZMl2EsiaxPbP63_yn_0l4hZqMdM4nKjvrtVCXUvY9fJOREj3lNvG4Uy1QiyU3pgKbUKdFpvSYPVPGmqq_hFTc5U9KXwk_bBgIIJr9S2a8_yIvchMtTrsxdh3O1P-AeP5Bd5FZJSG9QeI2z1guD8ewWOa2W4Z5E4wKZ10yVVslhh_OcQgQ2eBvtDD6_mrDwSs1tQUbY83jbHR7yYOTYmz-v2EnLWb3cUbO8u3EHL_qWjRTPcMTuH9RLZwTf7CLH6RYoEVlUvLw

Get the dashboard LB service address.

[root@cloud-ops01 tf-aws-eks]# kubectl get svc kubernetes-dashboard  -n kubernetes-dashboard  
NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP                                                              PORT(S)         AGE
kubernetes-dashboard   LoadBalancer   172.20.62.179   aa2e50aa1703d4163a87c1dbe5bab77a-723142651.us-east-1.elb.amazonaws.com   443:30822/TCP   6m46s

Open the URL in the browser, copy & paste the token for authentication, and you should land on a dashboard page like this:

Step-4: Deploy sample apps on the EKS cluster for testing

  • First, deploy the provided sample Guestbook app to verify the persistent storage setup.
[root@cloud-ops01 tf-aws-eks]# kubectl create ns guestbook-app  
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f ./demo-apps/guestbook/    

The application requests 2x persistent volumes (PVs) for the redis-master and redis-slave pods. Both PVs should be automatically provisioned through persistent volume claims (PVCs) using the 2x different storage classes we deployed earlier. You should see the STATUS of each PVC reported as “Bound” once it is matched to its PV.

[root@cloud-ops01 tf-aws-eks]# kubectl get pvc -n guestbook-app 
NAME                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
redis-master-claim   Bound    pvc-ad5310a6-249f-4526-9ed6-0596b70fa171   2Gi        RWO            standard       38m
redis-slave-claim    Bound    pvc-a3e97098-600a-4ede-bc4a-e9235602d42c   4Gi        RWO            fast-50        38m
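
For reference, a claim of roughly this shape is all that’s needed to trigger the dynamic provisioning (names and sizes are taken from the output above; the actual manifests under ./demo-apps/guestbook/ may differ slightly).

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-master-claim
  namespace: guestbook-app
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard   # dynamically provisions an EBS-backed PV
  resources:
    requests:
      storage: 2Gi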

Again, retrieve the external IP/DNS of the Guestbook app’s frontend service.

[root@cloud-ops01 storageclass]# kubectl get svc -n guestbook-app 
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)        AGE
frontend       LoadBalancer   172.20.19.131    a9e282a1efb6b4f97a288e183c68ac82-2013066277.us-east-1.elb.amazonaws.com   80:32578/TCP   45m

You should be able to access the Guestbook now. Enter and submit some messages, then try destroying and redeploying the app; your data will be preserved by the Redis PVs.

  • Next, we’ll deploy a modified version of the yelb app to test the NGINX Ingress Controller.
[root@cloud-ops01 tf-aws-eks]# kubectl create ns yelb  
[root@cloud-ops01 tf-aws-eks]# kubectl apply -f ./demo-apps/yelb/  

Retrieve the external DNS address of the ingress within the yelb namespace. Notice that the ingress host is defined as “yelb.local”, so we’ll need to get the public IP of the ingress service and then update the local hosts file for quick testing.

[root@cloud-ops01 tf-aws-eks]# kubectl get ingresses -n yelb
NAME           HOSTS        ADDRESS                                                                         PORTS   AGE
yelb-ingress   yelb.local   a8821ed5391434981a35cd6599ed7671-a0d9702f226e21d8.elb.us-east-1.amazonaws.com   80      63m
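
The ingress object behind this output looks roughly like the sketch below; the backend service name (yelb-ui) and port are my assumptions, so check ./demo-apps/yelb/ for the actual manifest.

apiVersion: networking.k8s.io/v1beta1     # Ingress API group for Kubernetes 1.16
kind: Ingress
metadata:
  name: yelb-ingress
  namespace: yelb
  annotations:
    kubernetes.io/ingress.class: "nginx"  # route through the NGINX Ingress Controller
spec:
  rules:
    - host: yelb.local
      http:
        paths:
          - path: /
            backend:
              serviceName: yelb-ui        # assumed UI service name
              servicePort: 80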

Run nslookup against the ELB DNS name to get the public IP of the ingress service, then update the local hosts file.

[root@cloud-ops01 tf-aws-eks]# nslookup a8821ed5391434981a35cd6599ed7671-a0d9702f226e21d8.elb.us-east-1.amazonaws.com
...
Non-authoritative answer:
Name:   a8821ed5391434981a35cd6599ed7671-a0d9702f226e21d8.elb.us-east-1.amazonaws.com
Address: 54.175.25.189

[root@cloud-ops01 tf-aws-eks]# echo "54.175.25.189  yelb.local" >> /etc/hosts      

We should have access to the app now.