This is a companion repo for the series of videos I created below. Please note that this codebase is for educational purposes only and is NOT guaranteed to be the best implementation for production use. For more technical discussions on Kubernetes, CI/CD, and software engineering in general, please visit relaxdiego.com.
First, ensure that you've configured your AWS CLI accordingly. Setting that up is outside the scope of this guide, so please read up on it at https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html
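A quick way to sanity check your CLI configuration is to ask AWS who you are (substitute the profile you plan to use; it should print your account ID, user ID, and ARN):
aws sts get-caller-identity --profile <YOUR-AWS-PROFILE>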
Grab the latest Terraform CLI here
Grab it via this guide
Grab it via this guide
Grab it via this guide
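As a quick sanity check that the tools used throughout this guide are installed and on your PATH, the following should all print a version (exact versions will vary):
terraform version
kubectl version --client
eksctl version
aws --version
jq --version
docker --version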
cd <PROJECT-ROOT>
terraform -chdir=terraform init
cp terraform/example.tfvars terraform/terraform.tfvars
Then modify the file as you see fit.
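For reference, the variables that the rest of this guide reads back out of this file are aws_profile, aws_region, and cluster_name. A hypothetical example (your values will differ; example.tfvars remains the authoritative template):
aws_profile  = "my-aws-profile"
aws_region   = "us-west-2"
cluster_name = "my-system-design-demo"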
secrets_dir=~/.relaxdiego/system-design/secrets
mkdir -p $secrets_dir
chmod 0700 $secrets_dir
aws_profile=$(grep -E ' *aws_profile *=' terraform/terraform.tfvars | sed -E 's/ *aws_profile *= *"(.*)"/\1/g')
aws_region=$(grep -E ' *aws_region *=' terraform/terraform.tfvars | sed -E 's/ *aws_region *= *"(.*)"/\1/g')
cluster_name=$(grep -E ' *cluster_name *=' terraform/terraform.tfvars | sed -E 's/ *cluster_name *= *"(.*)"/\1/g')
db_creds_secret_name=${cluster_name}-db-creds
db_creds_secret_file=${secrets_dir}/${cluster_name}-db-creds.json
cat > $db_creds_secret_file <<EOF
{
"db_user": "SU_$(uuidgen | tr -d '-')",
"db_pass": "$(uuidgen)"
}
EOF
chmod 0600 $db_creds_secret_file
aws secretsmanager create-secret \
--profile "$aws_profile" \
--name "$db_creds_secret_name" \
--description "DB credentials for ${cluster_name}" \
--secret-string file://$db_creds_secret_file
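If you want to confirm that the secret made it into Secrets Manager, you can read it back; it should print the same JSON you generated above:
aws secretsmanager get-secret-value \
--profile "$aws_profile" \
--secret-id "$db_creds_secret_name" \
--query SecretString \
--output text | jq .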
First, get a hold of an FQDN that you own and define it in an env var:
route53_zone_fqdn=<TYPE-IN-YOUR-FQDN-HERE>
Let's also create a unique caller reference:
route53_caller_reference=$(uuidgen | tr -d '-')
Then, create the zone:
aws_profile=$(grep -E ' *aws_profile *=' terraform/terraform.tfvars | sed -E 's/ *aws_profile *= *"(.*)"/\1/g')
aws_region=$(grep -E ' *aws_region *=' terraform/terraform.tfvars | sed -E 's/ *aws_region *= *"(.*)"/\1/g')
mkdir -p tmp  # ensure the output directory exists
aws route53 create-hosted-zone \
--profile "$aws_profile" \
--name "$route53_zone_fqdn" \
--caller-reference "$route53_caller_reference" > tmp/create-hosted-zone.out
List the nameservers for your zone:
cat tmp/create-hosted-zone.out | jq -r '.DelegationSet.NameServers[]'
Now update your domain registrar's nameserver (NS) records to the hosts listed above so that DNS queries for your FQDN are delegated to the new Route 53 zone.
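Delegation changes can take a while to propagate. One way to check is to compare the NS records that resolvers see against the Route 53 nameservers listed above (this assumes dig is available; host or nslookup work just as well):
dig +short NS $route53_zone_fqdn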
terraform -chdir=terraform apply
Proceed once the above is done.
Use ssh4realz to ensure you connect to the bastion securely. For a guide on how to (and why) use the script, see this video.
ssh4realz $(terraform -chdir=terraform output -raw bastion_instance_id)
With the bastion's host key already saved to your known_hosts file, you can now SSH directly to its public IP.
ssh -A ubuntu@$(terraform -chdir=terraform output -raw bastion_public_ip)
Back on your local machine:
aws eks --region=$(terraform -chdir=terraform output -raw aws_region) \
update-kubeconfig \
--dry-run \
--name $(terraform -chdir=terraform output -raw k8s_cluster_name) \
--alias $(terraform -chdir=terraform output -raw cluster_name) | \
sed -E "s/^( *(cluster|name)): *arn:.*$/\1: $(terraform -chdir=terraform output -raw cluster_name)/g" \
> ~/.kube/config
kubectl config use-context $(terraform -chdir=terraform output -raw cluster_name)
chmod 0600 ~/.kube/config
Check that you're able to connect to the kube-api-server:
kubectl get pods --all-namespaces
# Print out the DB endpoint for reference
terraform -chdir=terraform output db_endpoint
kubectl run -i --tty --rm debug --image=busybox --restart=Never -- sh
Once in the prompt, run:
/ # telnet <HOSTNAME-PORTION-OF-db_endpoint-OUTPUT> <PORT-PORTION-OF-db_endpoint-OUTPUT>
It should output:
Connected to <HOSTNAME>
To exit:
<Press Ctrl-] then Enter then e>
/ # exit
aws ecr get-login-password --region $(terraform -chdir=terraform output -raw aws_region) | \
docker login --username AWS --password-stdin $(terraform -chdir=terraform output -raw registry_frontend)
aws ecr get-login-password --region $(terraform -chdir=terraform output -raw aws_region) | \
docker login --username AWS --password-stdin $(terraform -chdir=terraform output -raw registry_api)
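With Docker now authenticated against both registries, pushing images is the usual tag-and-push dance. As a hypothetical example (my-api:latest stands in for whatever image you've built locally):
docker tag my-api:latest $(terraform -chdir=terraform output -raw registry_api):latest
docker push $(terraform -chdir=terraform output -raw registry_api):latest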
OIDC will be used by some pods in the cluster to connect to the AWS API. This section is based on this guide
First check if the cluster already has an OIDC provider:
aws eks describe-cluster \
--region $(terraform -chdir=terraform output -raw aws_region) \
--name $(terraform -chdir=terraform output -raw k8s_cluster_name) \
--query "cluster.identity.oidc.issuer" \
--output text
It should return something like:
https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E
Now grep for that ID (the part of the URL after /id/) in your list of OIDC providers:
aws iam list-open-id-connect-providers | grep <EXAMPLED539D4633E53DE1B716D3041E>
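If you'd rather not copy the ID by hand, the following combines the two commands above by extracting the ID portion of the issuer URL before grepping for it:
oidc_id=$(aws eks describe-cluster \
--region $(terraform -chdir=terraform output -raw aws_region) \
--name $(terraform -chdir=terraform output -raw k8s_cluster_name) \
--query "cluster.identity.oidc.issuer" \
--output text | cut -d '/' -f 5)
aws iam list-open-id-connect-providers | grep $oidc_id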
If the above command returned an ARN, you're done with this section. If it did not return one, then run:
eksctl utils associate-iam-oidc-provider \
--region $(terraform -chdir=terraform output -raw aws_region) \
--cluster $(terraform -chdir=terraform output -raw k8s_cluster_name) \
--approve
Rerun the aws iam command above (including the pipe to grep) to double-check. It should return a value this time.
While AWS has its own certificate management service, we will work with cert-manager for this exercise just so that we'll also learn how to use it. I mean, we're already learning stuff, so we might as well!
kubectl apply --validate=false -f apps/cert-manager/cert-manager.yaml
Watch for the status of each cert-manager pod via:
watch -d kubectl get pods -n cert-manager
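If you'd rather block until everything is up instead of eyeballing the watch output, something like this should also do the trick (it waits for all cert-manager pods, including the webhook, to report Ready):
kubectl -n cert-manager wait --for=condition=Ready pods --all --timeout=300s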
The AWS LB Controller allows us to create an AWS ALB simply by creating Ingress resources in our k8s cluster, thereby exposing our frontend and API services to the world. We will base the following steps on this guide
aws iam create-policy \
--policy-name AWSLoadBalancerControllerIAMPolicy \
--policy-document file://apps/aws-lb-controller/iam-policy.json | \
tee tmp/aws-load-balancer-controller-iam-policy.json
aws_account_id=$(terraform -chdir=terraform output -raw aws_account_id)
k8s_cluster_name=$(terraform -chdir=terraform output -raw k8s_cluster_name)
eksctl create iamserviceaccount \
--cluster="$k8s_cluster_name" \
--namespace=kube-system \
--name=aws-load-balancer-controller \
--attach-policy-arn=arn:aws:iam::${aws_account_id}:policy/AWSLoadBalancerControllerIAMPolicy \
--override-existing-serviceaccounts \
--approve
cat apps/aws-lb-controller/load-balancer.yaml | \
sed 's@--cluster-name=K8S_CLUSTER_NAME@'"--cluster-name=${k8s_cluster_name}"'@' | \
kubectl apply -f -
Watch the aws-load-balancer-controller-xxxxxx-yyyy pod go up via:
watch -d kubectl get pods -n kube-system
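If the pod doesn't come up cleanly, its logs are usually the quickest way to find out why (aws-load-balancer-controller is the deployment name used by the upstream manifests):
kubectl logs -n kube-system deployment/aws-load-balancer-controller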
The following steps are based on this guide and this bit of a (working) hack:
Next, let's deploy the cluster issuer in another terminal:
route53_zone_fqdn=$(cat tmp/create-hosted-zone.out | jq -r '.HostedZone.Name' | rev | cut -c2- | rev)
route53_zone_id=$(cat tmp/create-hosted-zone.out | jq -r '.HostedZone.Id')
cluster_name=$(terraform -chdir=terraform output -raw cluster_name)
aws_region=$(terraform -chdir=terraform output -raw aws_region)
cert_manager_role_arn=$(terraform -chdir=terraform output -raw cert_manager_role_arn)
cat apps/cert-manager/cluster-issuer.yaml | \
sed 's@ROUTE53_ZONE_FQDN@'"${route53_zone_fqdn}"'@' | \
sed 's@ROUTE53_ZONE_ID@'"${route53_zone_id}"'@' | \
sed 's@CLUSTER_NAME@'"${cluster_name}"'@' | \
sed 's@AWS_REGION@'"${aws_region}"'@' | \
sed 's@CERT_MANAGER_ROLE_ARN@'"${cert_manager_role_arn}"'@' | \
kubectl apply -f -
Check that it created the secret for our app:
kubectl get secret ${cluster_name}-issuer-pkey -n cert-manager
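You can also check that the issuer itself registered with the ACME server and is ready (assuming the manifest defines it as a ClusterIssuer, which is what the file name suggests):
kubectl get clusterissuer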
kubectl create ns system-design
NOTE: This isn't a recommended approach for production environments. If you want something more robust, see this guide.
cluster_name=$(terraform -chdir=terraform output -raw cluster_name)
kubectl create secret generic postgres-credentials \
-n system-design \
--from-env-file <(jq -r "to_entries|map(\"\(.key)=\(.value|tostring)\")|.[]" ~/.relaxdiego/system-design/secrets/${cluster_name}-db-creds.json)
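To confirm the secret landed with the expected keys (describe shows key names and sizes but not the values):
kubectl describe secret postgres-credentials -n system-design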
First, let's follow events in the system-design namespace so we know what's happening when we apply our manifest later:
kubectl get events -n system-design -w
make api
In the other terminal session where you're watching events, wait for this line:
Issuing The certificate has been successfully issued
Be patient though, as it can take a few minutes, and in the meantime you'll see errors like these:
Error presenting challenge: Time limit exceeded. Last error:
Failed build model due to ingress: system-design/ingress-sysem-design-api: none certificate found for host: api.<fqdn>
Ignore those. You can also check the endpoint's DNS and certificate status externally via the following (with ${component} set to api):
https://check-your-website.server-daten.de/?q=${component}.${route53_zone_fqdn}
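You can also watch the Certificate resource itself from kubectl; its READY column flips to True once the DNS-01 challenge succeeds:
kubectl get certificate -n system-design -w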
One downside to using cert-manager with the AWS LB Controller is that, at the time of writing, they don't integrate seamlessly. So once cert-manager is done creating the key and cert, we have to push them to ACM so that the ALB can use them.
scripts/sync-tls-resources api
Once this script completes, the AWS LB Controller will be able to create the ALB. Next, we update the API DNS record to point to the ALB:
scripts/route53-recordset create api
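Once the record has propagated, a quick smoke test from your machine confirms that TLS and routing are wired up end to end (this assumes route53_zone_fqdn is still set from the cluster issuer step; any HTTP response from the app counts as success):
curl -i https://api.${route53_zone_fqdn}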
make frontend
In the other terminal session where you're watching events, wait for this line:
Issuing The certificate has been successfully issued
Be patient though, as it can take a few minutes, and in the meantime you'll see errors like these:
Error presenting challenge: Time limit exceeded. Last error:
Failed build model due to ingress: system-design/ingress-sysem-design-ui: none certificate found for host: ui.<fqdn>
Ignore those. You can also check the endpoint's DNS and certificate status externally via the following (with ${component} set to ui):
https://check-your-website.server-daten.de/?q=${component}.${route53_zone_fqdn}
One downside to using cert-manager with the AWS LB Controller is that, at the time of writing, they don't integrate seamlessly. So once cert-manager is done creating the key and cert, we have to push them to ACM so that the ALB can use them.
scripts/sync-tls-resources ui
Once this script completes, the AWS LB Controller will be able to create the ALB. Next, we update the UI DNS record to point to the ALB:
scripts/route53-recordset create ui
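And the same smoke test for the UI endpoint:
curl -i https://ui.${route53_zone_fqdn}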
scripts/route53-recordset delete api
scripts/route53-recordset delete ui
kubectl delete ns system-design
scripts/delete-tls-resources api
scripts/delete-tls-resources ui
eksctl delete iamserviceaccount \
--cluster=$(terraform -chdir=terraform output -raw k8s_cluster_name) \
--namespace=kube-system \
--name=aws-load-balancer-controller
terraform -chdir=terraform destroy
aws iam delete-policy --policy-arn $(cat tmp/aws-load-balancer-controller-iam-policy.json | jq -r '.Policy.Arn')
If you no longer plan on bringing up this cluster at a later time, clean up the following as well:
aws_profile=$(grep -E ' *aws_profile *=' terraform/terraform.tfvars | sed -E 's/ *aws_profile *= *"(.*)"/\1/g')
aws_region=$(grep -E ' *aws_region *=' terraform/terraform.tfvars | sed -E 's/ *aws_region *= *"(.*)"/\1/g')
cluster_name=$(grep -E ' *cluster_name *=' terraform/terraform.tfvars | sed -E 's/ *cluster_name *= *"(.*)"/\1/g')
route53_zone_id=$(cat tmp/create-hosted-zone.out | jq -r '.HostedZone.Id')
aws secretsmanager delete-secret \
--profile "$aws_profile" \
--force-delete-without-recovery \
--secret-id "${cluster_name}-db-creds"
Note that Route 53 will only delete a hosted zone that contains nothing but its default NS and SOA record sets, which should be the case after the route53-recordset delete steps above.
aws route53 delete-hosted-zone \
--profile "$aws_profile" \
--id "$route53_zone_id"