How I moved my Kubernetes project from DigitalOcean to Amazon EKS in 4 hours and why

I have to say it: DigitalOcean is fantastic. They have a nice, friendly UI, they let you set things up in seconds, and they are very cheap. That's why I always tried to avoid AWS, and I thought of it as something monstrous and weird.

I also have to say that my Kubernetes stack is not the most complicated one. It features a load balancer, several pods and services, and it uses ingress-nginx to work with Ingress objects. I also use cert-manager to issue TLS certificates.

> ~ kubectl get services --all-namespaces

NAMESPACE NAME TYPE
cert-manager cert-manager-webhook ClusterIP
default kubernetes ClusterIP
default nginx-ingress-controller LoadBalancer
default nginx-ingress-default-backend ClusterIP
default telescope-backend-micra-service ClusterIP
default telescope-backend-service ClusterIP
default telescope-communicator-service ClusterIP
default telescope-db-postgresql ClusterIP
default telescope-db-postgresql-headless ClusterIP
default telescope-db-postgresql-metrics ClusterIP
default telescope-db-postgresql-read ClusterIP
default telescope-frontend-service ClusterIP
default telescope-grafana-service ClusterIP
default telescope-prometheus-service ClusterIP
default telescope-prometheus-timescale-db-service ClusterIP
default telescope-redis-service ClusterIP
default telescope-timescale-db-service ClusterIP
kube-system kube-dns ClusterIP
kube-system tiller-deploy ClusterIP

I thought I was fine and would never start using AWS, but I switched to it three days ago because of one small problem I ran into with DigitalOcean.

What happened?

In my previous article, I wrote about how I started using DNS-01 ACME validation instead of HTTP-01. The reasoning behind the choice of domain validation mechanism was quite simple — HTTP-01 validation didn't work on my cluster. And I had no idea why.

I quickly switched to DNS-01 and started thinking about a proper solution. I needed HTTP-01 because I planned to support custom domains (that's when users just create a CNAME record pointing to your website). The biggest difficulty here is that I need to be able to issue HTTPS certificates for these custom domain names. DNS-01 validation is a bad fit because, instead of just issuing a certificate, you also have to make the user create a TXT record with a validation token. Automating the whole process becomes a challenge: you need to get the token from cert-manager, display it to the user, ask them to create the TXT record, and then check that they did it correctly.

The HTTP-01 validation mechanism should make the process much less complicated, because in this case you just need to put a token file on your server (http://telescope.ac/.well-known/acme-challenge/<TOKEN>), and this file automatically becomes accessible via http://custom-domain.co/.well-known/acme-challenge/<TOKEN> if the user has created the correct CNAME record. Validation passes in seconds.
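To make the idea concrete, here is a minimal Python sketch of why one token file can serve both domains: the lookup depends only on the request path, not on the Host header. The function name, the in-memory store, and the token value are all made up for illustration; in a real cluster cert-manager and the ingress controller do this work.

```python
# Prefix that ACME HTTP-01 validation requests always use.
PREFIX = "/.well-known/acme-challenge/"

# Hypothetical in-memory store: token -> response body published for it.
CHALLENGES = {"abc123": "abc123.account-key-thumbprint"}


def serve_challenge(host, path, challenges):
    """Return (status, body) for a request. The host is deliberately ignored."""
    if not path.startswith(PREFIX):
        return 404, ""
    token = path[len(PREFIX):]
    body = challenges.get(token)
    if body is None:
        return 404, ""
    return 200, body


# The same token is reachable through the main domain and through a
# custom domain that points at the same server via a CNAME record.
for host in ("telescope.ac", "custom-domain.co"):
    print(host, serve_challenge(host, PREFIX + "abc123", CHALLENGES))
```

This only holds if the server answers requests for the custom host name at all, which is an assumption about the ingress configuration.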

I spent literally all day trying to figure out what was wrong, and then I accidentally found this: https://github.com/kubernetes/ingress-nginx/issues/3996

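OK, it seems there is a problem with DigitalOcean: you cannot reach your own host from within the cluster if Proxy Protocol is enabled on the load balancer. That would explain the failing HTTP-01 validation, assuming cert-manager checks the challenge URL from inside the cluster before it asks for a certificate.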

What's Proxy Protocol?

Proxy Protocol is a network protocol used to carry connection information (such as the client's address) from the source requesting the connection to the destination for which the connection was requested, even when a proxy or load balancer sits in between.
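For the curious, this is roughly what it looks like on the wire. In version 1 of the protocol, the load balancer prepends one human-readable line to the TCP stream before the real request. The sketch below is a minimal Python parser for that line, not what ingress-nginx actually runs; the function and class names are my own, and the addresses are example values.

```python
from typing import NamedTuple


class ProxyInfo(NamedTuple):
    protocol: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def parse_proxy_v1(data: bytes):
    """Split a PROXY protocol v1 header from the payload that follows it."""
    if not data.startswith(b"PROXY "):
        raise ValueError("missing PROXY v1 signature")
    end = data.find(b"\r\n")
    # The header line is limited to 107 bytes including the trailing CRLF.
    if end == -1 or end > 105:
        raise ValueError("unterminated or oversized header")
    fields = data[:end].decode("ascii").split(" ")
    payload = data[end + 2:]
    if fields[1] == "UNKNOWN":
        # The proxy could not determine the original addresses.
        return None, payload
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed header")
    _, proto, src_ip, dst_ip, src_port, dst_port = fields
    return ProxyInfo(proto, src_ip, dst_ip, int(src_port), int(dst_port)), payload


raw = b"PROXY TCP4 203.0.113.7 10.0.0.5 56324 443\r\nGET / HTTP/1.1\r\n"
info, rest = parse_proxy_v1(raw)
print(info.src_ip)  # the visitor's real address: 203.0.113.7
```

The important part is `src_ip`: without the header, the backend would only see the address of the load balancer.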

Why do I need it? Because I want to know your real IP instead of the IP of my LoadBalancer. And I need it to show you the beautiful stats showing where your users come from (IP allows me to get the information about the country and also calculate the number of unique visitors).

In theory, I can disable Proxy Protocol and add this line to the spec of my LoadBalancer Service:

externalTrafficPolicy: Local

However, it seems like this feature is also not supported by DigitalOcean.

The solution is...

...to move from DigitalOcean Kubernetes to Amazon Elastic Kubernetes Service! That's exactly what I did. Surprisingly, it took me only four hours (and it was my first ever experience with AWS). Of course, you can say something like "Well, it's Kubernetes and Docker, everything should work the same way everywhere, and if you have a nicely described build process then it can be done even faster."

To be honest, I was scared. AWS looked overcomplicated at first, but in fact it's not that different from DigitalOcean. Moreover, it seems to be more flexible, and now I think I should have used it from the moment I started working on this project.

Don't get me wrong: I am not trying to say that there is something wrong with DO. They offer incredible services. But it seems like their target audience is different; at least their Kubernetes solution looks much less "professional" than the one offered by Amazon. Maybe I'm wrong, and I will for sure continue using DO for my other services, but from now on Telescope is going to be hosted on Amazon.

How can I move?

The most important thing first: you should always keep the YAML definitions of your deployments and services up to date. I never patch Kubernetes objects by hand and never perform any operation without first changing the YAML files and the Makefile. I also keep my README.md file updated.

This is what the contents of Telescope's deployments directory look like:

Yes, I use Docker Compose for development, and I find it much more convenient than always keeping minikube running on your local machine.

And here is the shortened version of my Makefile:

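# BACKEND_IMAGE_NAME and BACKEND_VERSION are expected to be defined elsewhere
# (this is the shortened version). Recipe lines must start with a tab.

# Start and stop the local development dependencies with Docker Compose
development-deps-up:
	docker-compose -f deployments/docker-compose.development.yml up --build

development-deps-down:
	docker-compose -f deployments/docker-compose.development.yml down

build-static:
	cd static && yarn build && cd ..

# Build the backend image (after the static assets) and push it to the registry
build-backend: build-static
	docker build --rm -t $(BACKEND_IMAGE_NAME) -f ./deployments/backend/production/Dockerfile .
	docker push $(BACKEND_IMAGE_NAME)

deploy-db:
	helm install -f deployments/db/helm/helm-values.yaml --name telescope-db stable/postgresql

# Substitute the version placeholder in the manifest, then apply it
deploy-backend:
	cat deployments/backend/production/backend-deployment.yaml | sed -e "s/__VERSION__/$(BACKEND_VERSION)/g" | kubectl apply -f-
	kubectl apply -f deployments/backend/production/backend-service.yaml

deploy-all: deploy-db deploy-backend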

Technically, moving from DigitalOcean to Amazon EKS means creating a new cluster and just running make deploy-all. Of course, it's not that simple, and there are several more things you have to do.
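The deploy-backend target relies on a plain text substitution: every __VERSION__ placeholder in the manifest is replaced before the result is piped to kubectl. Here is a rough Python equivalent of that sed step, just to show the idea. The manifest fragment and the image name are hypothetical and not taken from Telescope's real files.

```python
def render_manifest(template: str, version: str) -> str:
    """Replace every __VERSION__ placeholder, like sed "s/__VERSION__/.../g"."""
    return template.replace("__VERSION__", version)


# Hypothetical fragment of a Deployment manifest; the image name is made up.
TEMPLATE = """\
containers:
  - name: backend
    image: example/telescope-backend:__VERSION__
"""

print(render_manifest(TEMPLATE, "1.4.2"))
```

Because the cluster never sees the placeholder, the same YAML file works unchanged on DigitalOcean and on EKS.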

1. Figure out how AWS roles and users work

This is one of the most confusing things you face when you sign up. AWS uses a much more complicated identity and access management mechanism called IAM, and you need to understand it first.

Fortunately, the documentation of AWS is amazing, and you can easily find everything you need. Here are some of the essentials:

- What Is IAM?

- Creating an IAM User in Your AWS Account

It all means that you need to spend some time configuring your cluster and adding some roles to it.

2. Start managing your domains with Amazon Route53

Mainly because otherwise it will be impossible to create "alias" DNS records.

An alias record is an internal Amazon specific pointer working on a higher level; on technical DNS level it may result as an A or as a CNAME, depending on the situation. The DNS doesn't need to be aware of this internal pointer type nor target, as Route53 only answers with the resulting record.

This mechanism gives you additional flexibility — your domain won't be assigned to an IP address of your LoadBalancer, but will use its alias instead. It means you will be able to change the IP without changing anything in your domain specification.
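If you prefer scripting to clicking through the console, an alias record can be described as a Route53 change batch. The Python sketch below only builds the request body; the function name is mine and the domain, load balancer name, and zone ID are placeholders, not real values. With boto3, a dictionary like this is what you would pass as ChangeBatch to the Route53 client's change_resource_record_sets call, together with the ID of your own hosted zone.

```python
def alias_change_batch(record_name, lb_dns_name, lb_hosted_zone_id):
    """Build a Route53 change batch that points record_name at a load balancer."""
    return {
        "Comment": "Point the domain at the load balancer via an alias record",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    # An alias record has no TTL and no ResourceRecords;
                    # it carries an AliasTarget instead.
                    "AliasTarget": {
                        # Hosted zone ID of the load balancer, not of your domain.
                        "HostedZoneId": lb_hosted_zone_id,
                        "DNSName": lb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


# Placeholder values for illustration only.
batch = alias_change_batch(
    "example.com.",
    "my-load-balancer-1234567890.eu-west-1.elb.amazonaws.com.",
    "ZEXAMPLE123456",
)
print(batch["Changes"][0]["ResourceRecordSet"]["AliasTarget"]["DNSName"])
```

Note that the record contains a DNS name rather than an IP address, which is exactly what makes it survive IP changes of the load balancer.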

3. Get this goddamn Proxy Protocol working

That's why I made this move, you 'member?

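Amazon has a nice article explaining how to enable Proxy Protocol for your load balancer. However, it doesn't say that you also need to add a special annotation to the LoadBalancer Service of your ingress controller (the annotation belongs to the Service, not to the Ingress objects themselves):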

service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"

By the way, compare it with the process of enabling Proxy Protocol for your DigitalOcean Load Balancer:

Yes, in the case of Amazon you need to read something about roles, policies, and policy types first, while DigitalOcean allows you to enable the proxy protocol in 1 second by just ticking the box.

This difference in one small feature also explains the difference between Amazon and DO in general — so the choice between these two platforms is primarily a choice between flexibility and simplicity. Simplicity also comes at a lower price.

That's it..?

I think the only major disadvantage of EKS is its price. It will definitely cost you more than any other managed Kubernetes service like the one offered by DO or Google. Amazon has an AWS Free Tier programme, but it doesn't include anything related to EKS.

Ioana Vasi calculated the price difference in her beautiful article. She and/or her team compared all the popular cloud providers and concluded that EKS is the most expensive one: it can cost you several times more than DigitalOcean! Amazon EKS also seems to be the only major managed Kubernetes service that charges for the master nodes you use.

However, you can still save some money if you need something to send e-mail messages and are ready to switch to Amazon SES. Or if you use S3. Of course, sometimes it's also worth combining your web service managed by Amazon with services offered by other companies. For example, I still use DigitalOcean Spaces.

But no matter which cloud provider you use, I'm still impressed by how easy it can be to switch providers completely in 2019: it can take only four pretty relaxed hours, even if you're a DevOps newbie like me. God save Kubernetes!