What are the key Kubernetes metrics that you have to monitor?

We have already looked at the best Kubernetes monitoring tools. With the increasing adoption of containers and microservices in enterprises, monitoring utilities have to handle more services and server instances than ever before. Kubernetes environments vary from deployment to deployment, but they generally have a handful of key components, resources, and potential errors in common. Currently, the Kubernetes ecosystem provides two add-ons for aggregating and reporting monitoring data from your cluster: (1) Metrics Server and (2) kube-state-metrics.

Metrics Server is a cluster add-on that collects resource usage data from each node and provides aggregated metrics through the Metrics API. Metrics Server makes resource metrics such as CPU and memory available for users to query, as well as for the Kubernetes Horizontal Pod Autoscaler to use for auto-scaling workloads.
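The Metrics API reports usage as Kubernetes quantity strings (for example, CPU in nanocores such as `156340412n` and memory such as `2097152Ki`), so a monitoring script usually has to convert them before comparing against thresholds. Below is a minimal Python sketch of that conversion. The helper names `parse_quantity` and `usage_summary` and the sample values are made up for illustration, and only the most common suffixes are handled.

```python
from decimal import Decimal

# Suffixes commonly seen in Kubernetes quantities (not the full set).
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity string such as '250m' or '128Mi' to a plain number."""
    # Check two-letter binary suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix, factor in _BINARY.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    for suffix, factor in _DECIMAL.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    return Decimal(text)


def usage_summary(item: dict) -> tuple:
    """Return (name, cpu_cores, memory_bytes) for one node metrics item."""
    usage = item["usage"]
    return (
        item["metadata"]["name"],
        parse_quantity(usage["cpu"]),
        parse_quantity(usage["memory"]),
    )


# Hypothetical item shaped like an entry returned by the Metrics API for nodes.
sample_item = {
    "metadata": {"name": "node-a"},
    "usage": {"cpu": "156340412n", "memory": "2097152Ki"},
}

name, cores, mem_bytes = usage_summary(sample_item)
print(f"{name}: {cores} cores, {mem_bytes} bytes")
```

In a real cluster the items would come from the Metrics API (for instance via `kubectl top` or a Kubernetes client library) rather than from a hard-coded dictionary.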

In addition to monitoring the CPU and memory usage of cluster nodes and pods, you will also need a way to collect metrics that track the high-level status of the cluster and its constituent objects. The Kubernetes API server exposes data about the count, health, and availability of pods, nodes, and other Kubernetes objects. By installing the kube-state-metrics add-on in your cluster, you can consume these metrics to detect and resolve issues with cluster infrastructure, resource constraints, or pod scheduling.

The kube-state-metrics service provides cluster information that Metrics Server does not. Metrics Server exposes statistics about the resource utilization of Kubernetes objects, whereas kube-state-metrics listens to the Kubernetes API and generates metrics about the state of Kubernetes objects: node status, node capacity (CPU and memory), number of desired/available/unavailable/updated replicas per Deployment, pod status (e.g., waiting, running, ready), and so on.
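To make the desired versus available replica comparison concrete, here is a small Python sketch that reads kube-state-metrics output in the Prometheus text format and lists Deployments with fewer available replicas than desired. The sample text and the function names are invented for illustration. The parser is deliberately simplistic: it assumes every sample line has labels, and it ignores escaped quotes and timestamps.

```python
import re

# Hypothetical scrape output; the values are made up for this example.
SAMPLE = """\
# TYPE kube_deployment_spec_replicas gauge
kube_deployment_spec_replicas{namespace="default",deployment="web"} 3
kube_deployment_spec_replicas{namespace="default",deployment="api"} 2
# TYPE kube_deployment_status_replicas_available gauge
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 3
kube_deployment_status_replicas_available{namespace="default",deployment="api"} 1
"""

LINE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels, value) for each labelled sample line."""
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group(2)))
            yield match.group(1), labels, float(match.group(3))


def under_replicated(text):
    """Return (namespace, deployment) pairs where available < desired."""
    desired, available = {}, {}
    for name, labels, value in parse_samples(text):
        key = (labels.get("namespace"), labels.get("deployment"))
        if name == "kube_deployment_spec_replicas":
            desired[key] = value
        elif name == "kube_deployment_status_replicas_available":
            available[key] = value
    return sorted(k for k, want in desired.items() if available.get(k, 0) < want)


print(under_replicated(SAMPLE))
```

In practice a Prometheus server would scrape these metrics and the same comparison would be written as an alerting rule. The script only shows the logic behind such a rule.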

In this post, we look at the key metrics and alerts required to monitor your Kubernetes cluster.

At a high level, these are the key areas to monitor:

Cluster state metrics
Resource metrics
Control plane metrics
Kubernetes events

Key Kubernetes metrics to monitor

What to monitor?	Metrics to monitor	Alert criteria
Cluster state	Monitor aggregated resource usage across all nodes in your cluster: node status; desired pods; current pods; available pods; unavailable pods.	Node status; desired vs. current pods; available and unavailable pods.
Node resources	For each node, monitor: memory requests; memory limits; allocatable memory; memory utilization; CPU requests; CPU limits; allocatable CPU; CPU utilization; disk utilization.	Alert if a node’s CPU or memory usage crosses a desired threshold. Compare: memory limits per pod vs. memory utilization per pod; memory utilization; memory requests per node vs. allocatable memory per node; disk utilization; CPU requests per node vs. allocatable CPU per node; CPU limits per pod vs. CPU utilization per pod; CPU utilization.
Missing pods	Health and availability of your pod deployments: available pods; unavailable pods.	Alert if the number of available pods for a deployment falls below the number of pods you specified when you created the deployment.
Pods that are not running	If a pod isn’t running or even scheduled, there could be an issue with either the pod or the cluster, or with your entire Kubernetes deployment. Monitor pod status.	Alert when a pod stays in the “Failed”, “Pending”, or “Unknown” status for the period of time you specify.
Container restarts	Container restarts can happen when you hit a memory limit in your containers (e.g., out-of-memory kills). There could also be an issue with the container itself or its host.	Kubernetes restarts containers automatically, but an alert gives you an immediate notification, so you can analyze the cause later and set proper limits.
Container resource usage	Monitor container resource usage to catch containers that hit resource limits or show spikes in resource consumption.	Alert when container CPU or memory usage crosses thresholds based on the configured limits.
Storage volumes	Monitor volume usage to ensure your application has enough disk space and pods don’t run out of space. Adjust either the amount of data generated by the application or the size of the volume according to usage.	Alert if available bytes or capacity cross your thresholds. Identify persistent volumes and apply a different alert threshold or notification for them, since they likely hold important application data.
Control Plane – Etcd	Monitor etcd for: leader existence and change rate; committed, applied, pending, and failed proposals; gRPC performance.	Alert on any pending or failed proposals, or when these metrics reach inappropriate thresholds.
Control Plane – API Server	Monitor the API server for: rate and number of HTTP requests; rate and number of apiserver requests.	Alert if the rate or number of HTTP requests crosses a desired threshold.
Control Plane – Scheduler	Monitor the scheduler for: rate, number, and latency of HTTP requests; scheduling latency; scheduling attempts by result; end-to-end scheduling latency (sum of scheduling).	Alert if the rate or number of HTTP requests crosses a desired threshold.
Control Plane – Controller Manager	Monitor the controller manager for: work queue depth; number of retries handled by the work queue.	Alert if requests to the work queue exceed a maximum threshold.
Kubernetes events	Collecting events from Kubernetes and from the container engine (such as Docker) lets you see how pod creation, destruction, starting, or stopping affects the performance of your infrastructure.	Alert on any failure or exception.
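
As an illustration of the node resource criteria in the table, the following Python sketch compares utilization and requests against allocatable capacity and reports anything above a threshold. The `NodeSnapshot` type, the `node_alerts` function, the 80% and 90% thresholds, and the sample numbers are all assumptions chosen for this example, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class NodeSnapshot:
    """Hypothetical per-node values gathered from your monitoring stack."""
    name: str
    allocatable_cpu: float     # cores
    cpu_requests: float        # sum of pod CPU requests, in cores
    cpu_usage: float           # cores currently in use
    allocatable_memory: float  # bytes
    memory_requests: float     # sum of pod memory requests, in bytes
    memory_usage: float        # bytes currently in use


def node_alerts(node, utilization_threshold=0.8, request_threshold=0.9):
    """Return alert messages for ratios that exceed the given thresholds."""
    checks = [
        ("CPU utilization", node.cpu_usage, node.allocatable_cpu, utilization_threshold),
        ("CPU requests", node.cpu_requests, node.allocatable_cpu, request_threshold),
        ("memory utilization", node.memory_usage, node.allocatable_memory, utilization_threshold),
        ("memory requests", node.memory_requests, node.allocatable_memory, request_threshold),
    ]
    alerts = []
    for label, value, capacity, threshold in checks:
        if capacity <= 0:
            continue  # avoid dividing by zero on incomplete data
        ratio = value / capacity
        if ratio > threshold:
            alerts.append(f"{node.name}: {label} at {ratio:.0%} of allocatable")
    return alerts


GIB = 2**30
node = NodeSnapshot(
    name="node-a",
    allocatable_cpu=4.0, cpu_requests=3.8, cpu_usage=2.0,
    allocatable_memory=16 * GIB, memory_requests=8 * GIB, memory_usage=15 * GIB,
)
for message in node_alerts(node):
    print(message)
```

With these sample numbers, CPU requests and memory utilization are flagged while CPU utilization and memory requests are not. In a production setup the same comparisons would typically live in your monitoring tool's alerting rules rather than in a script.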

I hope I have covered the key metrics and alerts required to monitor your Kubernetes cluster. If I have missed any key metrics, do let me know.

Author

Karthik

Publisher Name

Upnxtblog

Tags: kubernetes, metrics

4 years ago
