Monitoring via Prometheus & Grafana

Monitoring is essential in Kubernetes to keep your clusters healthy and performant. Prometheus and Grafana are the go-to tools for this: Prometheus collects and stores the metrics, and Grafana turns them into customizable dashboards.

We already installed both Grafana and Prometheus in our cluster in the namespace monitoring and connected them. When you check the pods in k9s (you can also filter them by typing /monitoring), you will see that some monitoring pods are already running. Let’s get started and take a peek at both tools :)

Grafana Dashboard

Let’s first start a port forward by running:

kubectl port-forward svc/monitoring-grafana 3000:80 -n monitoring

When visiting http://localhost:3000, we should now see:

Grafana Homescreen

Seeing it? Great!
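
If you prefer to confirm this from the terminal as well, here is a minimal sketch. It assumes the port forward from above is still running and that curl is installed; grafana_health is just a hypothetical helper name, and GRAFANA_URL is an assumed variable you can override.

```shell
# Base URL of the Grafana port forward; override GRAFANA_URL if you picked another local port.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"

# grafana_health (hypothetical helper name) asks Grafana's health endpoint
# whether the instance is up and prints the JSON answer.
grafana_health() {
  # -f makes curl exit non-zero on HTTP errors, -s hides the progress bar.
  curl -fs "${GRAFANA_URL}/api/health"
}
```

Run grafana_health in a second terminal while the port forward is active. A healthy instance should answer with a small JSON document; if the command fails, the port forward is most likely not running.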

To log in, we need the admin password that was automatically created for us. Let’s run:

kubectl --namespace monitoring get secrets monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 -d ; echo
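
Why the base64 -d at the end? Kubernetes returns secret values base64-encoded, so the command decodes the value before printing it. The following sketch shows the same round trip without touching the cluster; the value example-password is a made-up stand-in, not the real password.

```shell
# Hypothetical stand-in value; the real password lives in the monitoring-grafana secret.
plain='example-password'

# Secret values come back base64-encoded, so encode the stand-in the same way.
encoded=$(printf '%s' "$plain" | base64)

# Decode it again, just like the pipeline above does with the real secret.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Keep in mind that base64 is only an encoding, not encryption: anyone who is allowed to read the secret can also read the password.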

Log in with username admin and the password we just received. And there we have our Dashboard :)

Grafana Dashboard

There are already some dashboards preconfigured for us, showing metrics related to our cluster, such as CPU and memory usage, Pod health and status, Node health, etc.

Feel free to explore them a little by selecting “Dashboards” in the left sidebar and clicking through the provided dashboards. Of course, look specifically for the usage of your own workloads ;)

Prometheus Metrics

Prometheus is the monitoring backend that collects and stores metrics from your cluster. We can have a look at it as well by running:

kubectl port-forward svc/monitoring-kube-prometheus-prometheus 9090:9090 -n monitoring

Open http://localhost:9090 to see the Prometheus Board, where you can query metrics:

Prometheus Board

Try the following queries to see the data collected by Prometheus:

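  • up: Shows for every scrape target (nodes, services, ...) whether Prometheus could reach it (1) or not (0).
  • node_cpu_seconds_total: Displays the total CPU time consumed on each node, broken down by CPU core and mode.
  • kube_pod_container_status_running: Shows whether each container is currently in the running state (1) or not (0).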
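
The same queries can also be sent from the terminal through the Prometheus HTTP API. This is a minimal sketch, assuming the port forward from above is still running and curl is installed; prom_query is a hypothetical helper name and PROM_URL an assumed variable you can override.

```shell
# Base URL of the Prometheus port forward; override PROM_URL if yours differs.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query (hypothetical helper name) sends one PromQL expression to the
# instant-query endpoint of Prometheus and prints the JSON response.
prom_query() {
  # -G sends the data as a GET query string; --data-urlencode escapes the expression.
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query 'up' returns the raw up series. Since the raw counters are rarely what you want to look at, you can also try slightly richer expressions, such as prom_query 'sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))' for the non-idle CPU usage per node over the last five minutes, or prom_query 'sum by (namespace) (kube_pod_container_status_running)' for the number of running containers per namespace. The same expressions work in the query field of the Prometheus Board.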

Alerts in Grafana

Grafana supports setting up alerts when certain thresholds are crossed (e.g., high CPU or memory usage). So let’s dig into this a little.

Since no notification channels (e.g., email, Slack) are configured, creating an alert would not make much sense, but we can have a look at the predefined ones.

Navigate to Alerting > Alert Rules in the sidebar. Here you can find the preconfigured alerts. Feel free to dig into them a little - this gives you a feel for how alerts are defined and how they would behave.

And that is all, at least for this very first step into monitoring in Kubernetes. Congratulations! You made it through all of our labs 🎉

There is just one last thing to do: Cleanup! 🧹