Monitoring and Logging in K☸️s using Prometheus and Grafana

Observability is the ability to understand the internal state of a system based on its external outputs. Prometheus and Grafana are tools used for building dashboards and monitoring applications. This blog focuses on deploying an application and monitoring it using Prometheus and Grafana.

In an earlier post, I deployed a voting app on Kubernetes using ArgoCD. In this post, let's collect metrics from that cluster with Prometheus, build a Grafana dashboard on top of them, and use it to monitor the application.

If you have not read and followed that post yet, check it out first: Blog

Understanding Observability

Observability encompasses three key concepts:

Monitoring: Tracking the performance and health of an application.

Logging: Collecting and analyzing application logs to identify issues.

Tracing: Tracking the flow of requests through an application to understand how they are processed.

What is the purpose of each component?

  • Alerting: Notifying users when an application experiences issues or performance degradation.

  • Monitoring helps understand the overall health and performance of an application by tracking metrics like CPU usage, memory consumption, and network traffic.

  • Logging provides insights into application behavior by capturing events, errors, and warnings.

  • Tracing helps pinpoint the root cause of issues by tracking the path of requests through the application.

Tools and Technologies

  • Prometheus: A time-series database and monitoring system used for collecting and storing metrics.

  • Grafana: A visualization and dashboarding tool used for creating interactive dashboards to display Prometheus metrics.

  • Kubernetes: An open-source container orchestration platform used for deploying and managing containerized applications.

  • Docker: A containerization platform used for packaging and running applications in isolated environments.

  • AWS (Amazon Web Services): A cloud computing platform used for hosting and managing applications.

The project so far: the app is deployed through ArgoCD, the voting application is accessible on port 5000, the result application is accessible on port 5001, and we are observing the cluster through the Kubernetes dashboards.

Voting-Application

Result-Application

Kubernetes Dashboards

ArgoCD Dashboard

Observability in Action

  • Visualization: Observability data is visualized using dashboards, which provide a graphical representation of metrics, logs, and traces.

  • Dashboards: Display key metrics like CPU usage, memory consumption, and network traffic.

  • Prometheus: Used for collecting and storing metrics.

  • Grafana: Used for creating dashboards to visualize Prometheus metrics.

  • GitHub Repository: The observability system code is available on GitHub, providing a complete guide for setting up and configuring the system.

Prometheus

Prometheus is a time-series database (TSDB) used for monitoring Kubernetes clusters.

  • Time-Series Database: Stores data points over time, allowing for trend analysis and visualization.

  • Scraping: Prometheus periodically collects data from the cluster's components.

  • Querying: Prometheus allows you to query the collected data to generate graphs and insights.

Helm: A Package Manager for Kubernetes

To install the Prometheus and Grafana manifests, we use Helm, a package manager for Kubernetes.

  • Helm is a package manager for Kubernetes. It simplifies the process of installing, managing, and deploying applications on Kubernetes clusters.

  • Helm packages Kubernetes manifests into charts. A chart defines an application's configuration and dependencies, and Helm uses it to install, upgrade, and delete that application.

  • Helm allows you to manage a wide range of repositories and applications that are deployed on Kubernetes.

  • Manifests: The Kubernetes resource definitions (YAML) that a chart's templates render and that Helm applies to the cluster.

  • Package: A collection of manifests and configuration files that represent a complete application.

Understanding Helm

  • Helm Charts: Packages that contain the manifests and configuration for a specific application.

  • Helm Repository: A central location where Helm charts are stored and managed.

Install Helm

To install Helm, visit the official Helm website or use the following commands:

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
rm -rf get_helm.sh

Check Helm Version:

helm version
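
Before moving on, it can help to confirm that both helm and kubectl are on the PATH. This is only a minimal sketch: require_tool is a hypothetical helper name used for illustration, not part of either CLI.

```shell
# Hypothetical helper: succeed only if the named command is on the PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Usage (uncomment to check the tools this guide relies on):
# require_tool helm && require_tool kubectl
```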

Create a Namespace

Create a namespace for your monitoring tools:

kubectl create namespace monitoring
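
The command above fails if the namespace already exists. If you expect to re-run these steps, one possible approach is to render the namespace manifest locally and apply it. The helper name ensure_namespace is hypothetical; the kubectl flags are standard ones.

```shell
# Hypothetical helper: create the namespace only if it is missing.
# "kubectl create --dry-run=client -o yaml" prints the manifest without
# sending it to the cluster; "kubectl apply -f -" then creates the
# namespace or leaves an existing one in place.
ensure_namespace() {
  kubectl create namespace "$1" --dry-run=client -o yaml | kubectl apply -f -
}

# Usage:
# ensure_namespace monitoring
```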

Add Prometheus Community Helm Chart and Stable Repo

Add the Prometheus community Helm chart repository:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable

Visit the Prometheus community GitHub repository for more details.

Update Helm Repositories

Check the list of Helm repositories and update them:

helm repo list
helm repo update

Install the Prometheus Stack

Install the Prometheus stack using Helm with the following command:

helm install kind-prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.service.nodePort=30000 \
--set prometheus.service.type=NodePort \
--set grafana.service.nodePort=31000 \
--set grafana.service.type=NodePort \
--set alertmanager.service.nodePort=32000 \
--set alertmanager.service.type=NodePort \
--set prometheus-node-exporter.service.nodePort=32001 \
--set prometheus-node-exporter.service.type=NodePort
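
The same settings can also live in a values file instead of a long list of --set flags, which is easier to review and keep in Git. The sketch below assumes a file name of monitoring-values.yaml and a hypothetical helper called write_monitoring_values; the keys are a direct translation of the flags above.

```shell
# Hypothetical helper: write the same settings as the --set flags above
# into a values file at the given path.
write_monitoring_values() {
  cat > "$1" <<'EOF'
prometheus:
  service:
    type: NodePort
    nodePort: 30000
grafana:
  service:
    type: NodePort
    nodePort: 31000
alertmanager:
  service:
    type: NodePort
    nodePort: 32000
prometheus-node-exporter:
  service:
    type: NodePort
    nodePort: 32001
EOF
}

# Usage (the file name is an assumption):
# write_monitoring_values monitoring-values.yaml
# helm upgrade --install kind-prometheus prometheus-community/kube-prometheus-stack \
#   --namespace monitoring -f monitoring-values.yaml
```

Using helm upgrade --install here means the same command works for both the first install and later changes to the file.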

Check Running Pods

Verify the pods running in the monitoring namespace:

kubectl get pods -n monitoring
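
With many pods in the list, it is easy to miss one that is stuck. As a rough sketch, the hypothetical filter below reads the pod list and prints only the pods whose STATUS column is neither Running nor Completed. It assumes the default column order of kubectl get pods (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read "kubectl get pods --no-headers" output on stdin
# and print the name and status of every pod that is not Running or Completed.
pods_not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage:
# kubectl get pods -n monitoring --no-headers | pods_not_running
```

An empty result means every pod in the namespace has reached Running or Completed.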

Check Services on namespace monitoring

Verify the services running in the monitoring namespace:

kubectl get svc -n monitoring

Port-Forwarding for Prometheus

Set up port-forwarding for Prometheus to access it locally:

kubectl port-forward svc/kind-prometheus-kube-prome-prometheus -n monitoring 9090:9090 --address=0.0.0.0 &

Open your browser and navigate to http://public-ip:9090 to access Prometheus. Ensure you have opened this port on the inbound rules of your security group.
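
If you prefer to check from the terminal first, Prometheus exposes a readiness endpoint at /-/ready. The sketch below polls a URL until it answers successfully; wait_for_http is a hypothetical helper, and the localhost address assumes you run it on the same machine as the port-forward.

```shell
# Hypothetical helper: poll a URL until it answers with a success status.
# Arguments: URL, number of attempts (default 10), delay in seconds (default 2).
wait_for_http() {
  url=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -s hides progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Usage, assuming the port-forward above is running on this machine:
# wait_for_http http://localhost:9090/-/ready
```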

Kubernetes Endpoints Health

To check whether Prometheus is receiving data from Kubernetes, go to Status → Target Health.

There you can see whether each Kubernetes endpoint is connected and whether it is healthy.

  • Target Discovery: Prometheus automatically discovers targets (services or applications) to monitor using service discovery mechanisms like Kubernetes service discovery.

  • Metrics Collection: Prometheus scrapes metrics from targets using a defined scraping interval.

  • Endpoint: The endpoint on a target that exposes metrics for Prometheus to scrape.
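
The same target health information is available from the Prometheus HTTP API at /api/v1/targets. As a rough sketch, the hypothetical helper below counts targets per health value using only grep, sed, sort, and awk, so it does not need a JSON tool installed. It assumes each target object carries a "health" field, as the API response does.

```shell
# Hypothetical helper: read the JSON from Prometheus's /api/v1/targets
# endpoint on stdin and print one "<health> <count>" line per health value.
count_targets_by_health() {
  grep -Eo '"health"[[:space:]]*:[[:space:]]*"[a-z]+"' |
    sed -E 's/.*"([a-z]+)"$/\1/' |
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage, assuming Prometheus is reachable on localhost:9090:
# curl -s http://localhost:9090/api/v1/targets | count_targets_by_health
```

Any line that starts with down points at a target worth inspecting on the Target Health page.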

To see the raw metrics that the Prometheus server exposes about itself, open http://public-ip:9090/metrics

PromQL

  • Prometheus Query Language (PromQL): A powerful query language used to analyze and visualize time-series data stored in Prometheus.

  • Data Storage: Prometheus stores collected metrics in a time-series database, allowing for historical analysis and trend identification.

  • Querying Data: PromQL queries are used to retrieve and analyze data from the Prometheus database.
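
PromQL queries can also be run outside the web UI through the Prometheus HTTP API. Here is a minimal sketch using curl against the /api/v1/query endpoint. The helper name prom_instant_query and the PROM_URL variable are assumptions made for this example.

```shell
# Hypothetical helper: run an instant PromQL query through the HTTP API.
# PROM_URL is an assumption; it defaults to the local port-forward address.
prom_instant_query() {
  # -G sends the data as URL query parameters; --data-urlencode escapes
  # characters such as braces and quotes in the PromQL expression.
  curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Usage: the built-in "up" metric is 1 for every target scraped successfully.
# prom_instant_query 'up'
```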

Example PromQL Queries

1: sum (rate (container_cpu_usage_seconds_total{namespace="default"}[1m])) / sum (machine_cpu_cores) * 100

  • Container CPU Usage: rate(container_cpu_usage_seconds_total{namespace="default"}[1m])

    • rate(): Calculates the rate of change of a metric over a specified time interval.

    • container_cpu_usage_seconds_total: Counter of the cumulative CPU time a container has consumed, in seconds.

    • namespace="default": Filters the metric to only include containers in the default namespace.

    • [1m]: Specifies a 1-minute time interval for the rate calculation.

Numeric Representation:

Graphical Representation
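
To make the arithmetic behind this query concrete, here is a small worked example with made-up numbers. It is a simplification: the real rate() function also handles counter resets and extrapolation, and the helper names are hypothetical.

```shell
# Simplified per-second rate of a counter: (last - first) / window_seconds.
counter_rate() {
  awk -v first="$1" -v last="$2" -v secs="$3" \
    'BEGIN { printf "%.2f\n", (last - first) / secs }'
}

# Share of the cluster's CPU cores in use, as a percentage.
cpu_percent() {
  awk -v used="$1" -v cores="$2" \
    'BEGIN { printf "%.2f\n", used / cores * 100 }'
}

# Hypothetical numbers: a counter that grows from 120 to 150 CPU-seconds
# within 60 seconds means 0.5 cores were busy on average.
counter_rate 120 150 60   # prints 0.50
# 0.5 busy cores on a cluster with 4 cores in total:
cpu_percent 0.5 4         # prints 12.50
```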

2: sum(rate(container_network_receive_bytes_total{namespace="default"}[5m])) by (pod)

  • Network Traffic: sum(rate(container_network_receive_bytes_total{namespace="default"}[5m])) by (pod)

    • sum(): Aggregates the values of a metric across all matching time series.

    • container_network_receive_bytes_total: Counter of the total number of bytes received by a container.

    • namespace="default": Filters the metric to only include containers in the default namespace.

    • [5m]: Specifies a 5-minute time interval for the rate calculation.

    • by (pod): Groups the sum by pod, so the result shows one receive rate per pod.
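
If you want to reuse this query for other namespaces or time windows, one option is to generate it from a small template. The helper name network_rx_query is hypothetical, and its defaults reproduce query 2 as written.

```shell
# Hypothetical helper: build the per-pod receive-rate query for a namespace
# (default: default) and a rate window (default: 5m).
network_rx_query() {
  ns=${1:-default}
  window=${2:-5m}
  printf 'sum(rate(container_network_receive_bytes_total{namespace="%s"}[%s])) by (pod)\n' "$ns" "$window"
}

# Example: the same query for the monitoring namespace over 1 minute.
network_rx_query monitoring 1m
```

The printed expression can be pasted into the Prometheus query box as is.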

Grafana for Dashboarding

Grafana is a powerful open-source tool used for visualizing and monitoring data from various sources. It allows users to create interactive dashboards, set alerts, and explore data trends. This guide will cover the basics of Grafana, including user management, data source connections, and dashboard creation.

Grafana was already installed by the Helm chart. Confirm that its service exists with kubectl get svc -n monitoring.

Port-forward the Grafana Service in monitoring namespace:

kubectl port-forward -n monitoring svc/kind-prometheus-grafana 3000:80 --address 0.0.0.0 &

Access Grafana at http://public-ip:3000

The initial username is admin and the password is prom-operator. To verify the password, run the command below:

kubectl get secret -n monitoring kind-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 -d && echo

User Management

Creating Users: To create a new user, navigate to the Administration section of Grafana. Click on the Users tab and then click the New User button.

  • User Details: Provide the following information for the new user:

    • Name: The user's name.

    • Email: The user's email address.

    • Username: The user's username.

    • Password: The user's password.

  • User Roles: Grafana offers different roles for users, each with specific permissions:

    • Viewer: Can only view dashboards.

    • Editor: Can view and edit dashboards.

    • Admin: Has full access to Grafana, including user management and data source configuration.

  • Sharing Access: You can share access to your dashboards with other users by changing their roles.

  • Change the new user's role to Viewer for this demo.

Log in to Grafana again as the newly created user.
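
User creation can also be scripted. The sketch below assumes Grafana's admin HTTP API (POST /api/admin/users) is reachable with basic auth through the port-forward above. The helper name and all user values are placeholders for illustration.

```shell
# Hypothetical helper: build the JSON body for Grafana's
# POST /api/admin/users endpoint. The values must not contain quotes or
# backslashes, because no JSON escaping is done here.
grafana_user_payload() {
  printf '{"name":"%s","email":"%s","login":"%s","password":"%s"}\n' "$1" "$2" "$3" "$4"
}

# Usage (all values are placeholders; "-d @-" makes curl read the body from stdin):
# grafana_user_payload 'Demo User' 'demo@example.com' 'demo' 'change-me' |
#   curl -s -u admin:prom-operator -H 'Content-Type: application/json' \
#     -X POST -d @- http://localhost:3000/api/admin/users
```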

Data Source Connections

Data Sources: Grafana connects to various data sources to retrieve data for visualization.

  • Supported Data Sources: Grafana supports a wide range of data sources, including:

    • AWS: Amazon Web Services, for example through the Amazon CloudWatch data source.

    • Prometheus: An open-source monitoring system.

  • Pre-configured Data Sources: If you have deployed Grafana using Helm, you might already have pre-configured data sources like Prometheus and Alertmanager.

    Check by going to Connections → Data Sources.

Dashboard Creation

  • Building a Dashboard: To create a new dashboard, click on the Build a Dashboard button.

  • Adding Visualizations: Click on the Add Visualization button to add visualizations to your dashboard.

  • Selecting Data Sources: Choose the data source you want to use for your visualization.

  • Querying Data: Use the query builder to retrieve data from your chosen data source.

    • Choose a metric using the Metrics explorer.

    • Under Label filters, set namespace = kube-system, then click Run queries.

Now let’s customize the panel. Click Time series on the right-hand side, choose any visualization style, and set the time range to the last 5 minutes.

Explore more data by changing the filters:

Metrics: container_network_receive_bytes_total

Labels: namespace = monitoring → run query

Importing Dashboards

  • Finding Dashboards: Search the web for "Grafana dashboards" to find pre-built dashboards.

  • Search for Kubernetes dashboards.

  • Copy the ID of the dashboard you want to import.

  • Paste the copied ID into Grafana's dashboard import page and click Load.

  • Starting Data Source: Ensure the data source for the dashboard is running.

  • Importing the Dashboard: Click Import to import the dashboard.

Congratulations! Our dashboard is now ready.

That's it! You've successfully set up Prometheus and Grafana using Helm. Start monitoring your Kubernetes cluster efficiently and make data-driven decisions. If you found this guide helpful, share it with your network or leave a comment below!