At trivago we started moving our user-facing services to Google Kubernetes Engine in 2018. Today we run dozens of multi-tenant clusters hosting thousands of pods in production, all tied together with an Istio-based service mesh.
Here are our five tips for working with Kubernetes!
#1 Kubernetes contexts supercharged
As an SRE you often have to switch between a lot of Kubernetes clusters and namespaces. It is very easy to lose track of all the different credentials, cluster names, and namespaces, and things get even worse if you use multiple terminal windows or tabs. Without additional tooling you quickly find yourself grep’ing your kube-config and re-checking the active context all the time.
To address the problem of too many contexts, many people install the kubectx toolset. It allows you to list, rename and switch to different Kubernetes contexts or namespaces easily. If you also install another tool called fzf, you even get to use a fancy command-line selector.
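For reference, a typical kubectx/kubens session looks like this (the context and namespace names are hypothetical):

```shell
kubectx               # list all contexts from your kube-config
kubectx prod-europe   # switch to that context
kubectx -             # jump back to the previous context
kubens checkout       # switch the active namespace within the current context
```

With fzf installed, running kubectx without arguments opens an interactive fuzzy selector instead of a plain list.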
The downside here: you can still only have one Kubernetes context active at a time. Tools like kube-ps1 can help you keep track of where you are right now, so that you don’t accidentally work on the wrong cluster. However, there is still room for error, and you might find yourself accidentally deploying something to production that was actually meant for stage, just because you chose the wrong terminal window or tab.
A very elegant way to fix this is to use the environment variable KUBECONFIG in combination with the direnv tool:
Create one kube-config file per cluster and place them in different folders. Use direnv to activate the file when switching to that folder. Now the folder you are in defines which cluster is being used. And that even works between different terminal tabs or windows.
# Create a dedicated folder for the cluster
mkdir myCluster && cd myCluster
# Make direnv point KUBECONFIG at a local file whenever we enter this folder
echo 'export KUBECONFIG=.kubeconfig' > .envrc
direnv allow .
# For GKE use this command, which writes the credentials to the path in KUBECONFIG
gcloud container clusters get-credentials myCluster
# Try it!
cd ..
kubectl config view   # outside the folder: your default config
cd -
kubectl config view   # inside the folder: the myCluster config
#2 & #3 Using the kubectl crowbar
In our day-to-day work as SREs, we often face the situation where Kubernetes workloads are misbehaving in some way or another. A well-known brute-force tactic used to solve many problems is the fabled “have you tried to turn it off and on again?”.
Since Kubernetes 1.15 you can do this via the “kubectl rollout restart” command. Before that, you could achieve the same effect by using “kubectl patch”.
krd() {
  # Kubernetes >= 1.15
  kubectl rollout restart deployment -n "${1}" "${2}"
}

krd() {
  # Kubernetes < 1.15
  kubectl patch deployment -n "${1}" "${2}" \
    -p '{"spec":{"template":{"metadata":{"labels":{"date":"'"$(date +'%s')"'"}}}}}'
}
To use either function, you pass it the namespace and the name of the deployment you want to restart. The pre-Kubernetes 1.15 version injects a new pod label containing the current timestamp.
The effect? You do a new “release” that is exactly like the old one. Or in other words, you restart all pods with the same deployment strategy that is used during “normal” releases including readiness probes, the option for rollback, and so on. Pretty neat.
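Why does the pre-1.15 variant work? Because changing any label in the pod template changes the template itself, and Kubernetes responds with a normal rolling update. The patch body it sends can be inspected on its own:

```shell
#!/usr/bin/env bash
# Build the JSON patch body used by the pre-1.15 krd function.
# The only change is a fresh "date" label in the pod template.
payload='{"spec":{"template":{"metadata":{"labels":{"date":"'"$(date +'%s')"'"}}}}}'
echo "$payload"
```

Since the label value is a Unix timestamp, running the function again a second later produces a different template and therefore another restart.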
Doing the same for cronjobs is a little trickier. Sometimes you want to restart a failed cronjob right now and not wait for, let’s say, another day. The command “kubectl create” can help here:
ktc() {
  kubectl create job -n "${1}" "${2}-${RANDOM}" --from="cronjob/${2}"
}
As before, you simply pass the namespace and the name of the cronjob to trigger. The function creates a new job for you based on the cronjob’s spec. Using the RANDOM variable prevents you from running into name collisions.
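The RANDOM suffix works because the shell expands $RANDOM to a fresh number between 0 and 32767 on every use, so repeated manual triggers get distinct job names (the cronjob name below is hypothetical):

```shell
#!/usr/bin/env bash
cronjob="nightly-report"        # hypothetical cronjob name
job_a="${cronjob}-${RANDOM}"    # e.g. nightly-report-17204
job_b="${cronjob}-${RANDOM}"    # almost certainly a different suffix
echo "$job_a"
echo "$job_b"
```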
#4 Debugging low-level Kubernetes network issues
Service meshes are great. They give you a lot of metrics and logs on your traffic flow, which you can use for debugging all kinds of difficult problems. But what do you do when there is no connection? Or no metrics? Or no logs? And you cannot find the issue? That’s the point where we usually use Wireshark to get down to low-level TCP/IP traffic debugging.
But how do you do this inside a Kubernetes cluster with tens or even hundreds of workloads on a single node? You can use ksniff!
The tool ksniff injects a container into your cluster that attaches tcpdump to exactly the pod you want to debug. Even better: it streams the captured frames to a Wireshark instance running on your laptop in real time. Now you can see what really happens to your packets.
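A typical session looks something like this (ksniff is installed as a kubectl plugin via the krew plugin manager; the pod and namespace names are hypothetical):

```shell
# Install the plugin once
kubectl krew install sniff
# Live-stream a capture from the pod straight into a local Wireshark window
kubectl sniff checkout-abc123 -n shop
# Or save the capture to a pcap file for later analysis
kubectl sniff checkout-abc123 -n shop -o capture.pcap
```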
#5 Using Helm for multi-cluster deploys
Helm is a great tool to organise your Kubernetes manifests or install ready-made third-party software. But one issue that we faced was the rather static nature of Helm’s “values.yaml” file. If you have to deal with different groups of clusters (like frontend and backend) in multiple tiers (like dev, stage, prod) and multiple different regions on top, you often want to configure charts slightly differently depending on where you want to deploy to.
For our central components and top-level namespace management, we do this by passing multiple value files to helm in our manifest rendering pipeline. For this we introduced a folder called “config” with various subdirectories and files, allowing us to pass different override values for each chart.
The first file read by helm is always the values.yaml of the chart we want to render. Because of this, the values.yaml is used for defining global defaults.
We then pass in a “global.yaml” containing values that are used per “cluster-group”, e.g. the backend clusters. This file often contains only the special “global” section.
Next, we pass in a file specific for the tier and region of each cluster, e.g. “stage/europe”, followed by the same for a specific chart. While the former file is “per-cluster”, i.e. a more specialised version of the global.yaml, the latter file can be used to override values for a specific chart.
As the files are being merged in the order they are passed to helm, we can add or override values at each step and even “share” settings between multiple charts by using the “global” section.
helm template -f config/global.yaml -f config/tier/region.yaml -f myChart/config/tier/region.yaml myChart
# Files to the right extend/override files to the left
# chart/values.yaml <- global.yaml <- tier/region.yaml <- chart/tier/region.yaml
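For illustration, the layered files might look like this (all values here are hypothetical). Helm deep-merges the files from left to right, so the chart-specific file wins on conflicts:

```yaml
# config/global.yaml -- shared defaults for the whole cluster-group
global:
  clusterGroup: backend
replicas: 2

# config/stage/europe.yaml -- per-cluster overrides for this tier/region
global:
  region: europe
replicas: 3

# myChart/config/stage/europe.yaml -- chart-specific overrides, applied last
image:
  tag: "1.4.2"
```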
The resulting manifests can then be applied either manually or through a GitOps tool like ArgoCD.
To make it more transparent to users which files are actually being used by a chart, we use a tool called helm-templexer. This tool reads a simple configuration file placed in each chart, which also covers other aspects of the template command like the release and namespace name.
Bonus: If you have read our first Kubernetes tech tip on direnv, you now also have found a good place to put your kube-configs. Just make sure you don’t commit them ;)
That’s a wrap on our Kubernetes tips! Do you have any to share with us? Drop them below!