Kubernetes

Supported Versions

  • 1.15.x
  • 1.14.x
  • 1.13.x
  • 1.12.x
  • 1.11.x
  • 1.10.x
  • 1.9.x

Supported Managed Kubernetes

  • Amazon Elastic Container Service for Kubernetes (EKS): 1.10.x
  • Azure Kubernetes Service (AKS): 1.10.x
  • Google Kubernetes Engine (GKE): 1.10.x, 1.11.x, 1.12.x
  • Pivotal Container Service (PKS): 1.5 and above [Technical preview]

Supported Service Meshes

  • Istio: 1.2.x, 1.3.x

Deprecated

Installing the Instana Agent in Kubernetes

The Agent Setup for Kubernetes describes how to install the Instana agent into your cluster.

The installation of Instana agents on PKS is fully automated by the Instana Microservices Application Monitoring for Pivotal Platform tile.

Accessing Kubernetes Information

Once the agent has been deployed to your cluster, the Kubernetes sensor will report detailed data about the cluster and the resources deployed into it.

Instana automatically discovers and monitors Kubernetes:

  • Clusters
  • Nodes
  • Namespaces
  • Deployments
  • Services
  • Pods

Kubernetes information is easily accessible and deeply integrated in all aspects of your application.

Kubernetes Menu Item

First and foremost, Kubernetes is a top-level item in the menu. This gives you direct access to your Kubernetes clusters and namespaces.

From Application Perspectives

Kubernetes information is also accessible from within all your application perspectives or services. If a service is running on a Kubernetes cluster, the respective context information is shown in the “Infrastructure” tab:

[Screenshot: AP Infra Tab]

For containers, the pod and namespace are shown and directly linked. For hosts, the cluster and node are shown and directly linked.

From Infrastructure

In the Infrastructure map, the sidebar shows Kubernetes information for the host or container you have selected.

[Screenshot: AP Infra Tab]

You can use Dynamic Focus to filter the data, for example to search for a specific deployment in a cluster. Additionally, the keywords entity.kubernetes.cluster.distribution and entity.kubernetes.cluster.managedBy let you search for a Kubernetes cluster by distribution and by management layer. Supported values for entity.kubernetes.cluster.distribution are gke, eks, openshift, and kubernetes. Supported values for entity.kubernetes.cluster.managedBy are rancher and none.
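The two keywords and their supported values can be captured in a small helper that builds filter terms. This is a minimal sketch, not part of Instana: the function names are hypothetical, and the `keyword:value` form and the `AND` operator assume a Lucene-style query syntax, so check the Dynamic Focus documentation before relying on them.

```python
# Supported values exactly as listed in the paragraph above.
SUPPORTED_VALUES = {
    "entity.kubernetes.cluster.distribution": {"gke", "eks", "openshift", "kubernetes"},
    "entity.kubernetes.cluster.managedBy": {"rancher", "none"},
}


def cluster_filter(keyword: str, value: str) -> str:
    """Build one filter term, rejecting keywords or values the docs do not list.

    Assumption: terms are written as 'keyword:value' (Lucene-style).
    """
    allowed = SUPPORTED_VALUES.get(keyword)
    if allowed is None:
        raise ValueError(f"unknown keyword: {keyword}")
    if value not in allowed:
        raise ValueError(f"unsupported value {value!r} for {keyword}")
    return f"{keyword}:{value}"


def combine(*terms: str) -> str:
    """Join terms so that all of them must match (assumed 'AND' operator)."""
    return " AND ".join(terms)


query = combine(
    cluster_filter("entity.kubernetes.cluster.distribution", "gke"),
    cluster_filter("entity.kubernetes.cluster.managedBy", "none"),
)
print(query)
```

Validating the value up front catches typos such as "aks", which is not among the supported distribution values listed above.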

Kubernetes Dashboards

Kubernetes dashboards present all the information needed for a given Kubernetes entity. The context is always accessible via the context path at the top. The following screenshot shows a namespace named “robot-shop” in a cluster called “will-k8s-cluster”.

[Screenshot: Dashboard overview]

The different dashboards are always structured in the same way:

  • Summary shows the most relevant information for a given entity. It starts with a status line that shows the current status and related information such as age. The next section shows CPU, Memory, and Pod information, which tells you how many resources are consumed, including by the pods. The sections below, such as “Top Deployments” and “Top Pods” in this screenshot, show potential hotspots that you might want to look at.

  • Details shows detailed information such as “labels”, “annotations”, and the “spec”.

  • Events shows all relevant Kubernetes events and links them to the respective dashboards.

  • Related Entities like “Deployments”, “K8s Services” and “Pods” are shown in the following tabs. What is shown depends on the entity you have selected.

CPU and Memory Usage

For Kubernetes pods, deployments, services, namespaces, and nodes, you can see an aggregated view of the current CPU and memory usage compared to the CPU and memory limits and requests set for these resources.

If available, the usage information is calculated from data gathered from the container runtime that is executing the containers that make up the given resource.
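The comparison of usage against requests and limits can be sketched as follows. This is an illustrative model only, not Instana's implementation: the `Container` type, its field names, and the choice to report no usage when any container lacks runtime data are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Container:
    cpu_usage: Optional[float]  # cores; None when the container runtime reports no data
    cpu_request: float          # cores requested in the container spec
    cpu_limit: float            # cores allowed by the container spec


def cpu_summary(containers: List[Container]) -> Dict[str, Optional[float]]:
    """Aggregate CPU usage, requests, and limits over the containers of one resource."""
    requests = sum(c.cpu_request for c in containers)
    limits = sum(c.cpu_limit for c in containers)
    usages = [c.cpu_usage for c in containers]
    # Assumption: usage is only reported when every container has runtime data.
    usage = None if any(u is None for u in usages) else sum(usages)

    def ratio(total: float) -> Optional[float]:
        return usage / total if usage is not None and total > 0 else None

    return {
        "usage": usage,
        "requests": requests,
        "limits": limits,
        "usage_vs_requests": ratio(requests),
        "usage_vs_limits": ratio(limits),
    }


pod = [Container(0.25, 0.5, 1.0), Container(0.25, 0.5, 1.0)]
print(cpu_summary(pod))
```

The same aggregation applies to memory by swapping the fields. A deployment, service, namespace, or node would pass the containers of all its pods.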

Analyze Kubernetes Calls

Analyze calls and traces gives you powerful tools to slice and dice every single call in your Kubernetes cluster. If you click the “Analyze Calls” button on a Kubernetes dashboard, the appropriate filter and grouping are already set. In this case, we see all calls in the “robot-shop” namespace grouped by pod:

[Screenshot: Dashboard overview]

Linking Kubernetes Services and Logical Services

Single Kubernetes service to multiple logical services

Multiple logical services can be related to a single Kubernetes service when the service mapping rules match and calls are generated on that Kubernetes service. For example, a Kubernetes service with the label selector "service=my-service" may contain pods that carry the additional labels "env=dev" and "env=staging". Combined with a custom service mapping configuration in Instana that uses the tags kubernetes.container.name and kubernetes.pod.label (key: env), this results in multiple logical services linked to that single Kubernetes service. These logical services are displayed on the Kubernetes Service dashboard.
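The example above can be modeled in a few lines. This is a rough sketch of the idea, not of how Instana resolves service mapping internally: the pod data, the container name "shop", and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Pod = Dict[str, object]


def matches_selector(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """A pod belongs to a Kubernetes service when all selector labels match."""
    return all(labels.get(key) == value for key, value in selector.items())


def logical_service_keys(
    pods: List[Pod], selector: Dict[str, str], label_key: str = "env"
) -> Set[Tuple[str, Optional[str]]]:
    """Collect the distinct (container name, pod label value) pairs behind one service.

    Each distinct pair stands for one logical service under a mapping based on
    kubernetes.container.name and kubernetes.pod.label (key: env).
    """
    keys = set()
    for pod in pods:
        labels = pod["labels"]
        if matches_selector(labels, selector):
            keys.add((pod["container"], labels.get(label_key)))
    return keys


# Hypothetical pods: two are selected by "service=my-service", one is not.
pods = [
    {"container": "shop", "labels": {"service": "my-service", "env": "dev"}},
    {"container": "shop", "labels": {"service": "my-service", "env": "staging"}},
    {"container": "shop", "labels": {"service": "other", "env": "dev"}},
]

keys = logical_service_keys(pods, {"service": "my-service"})
print(sorted(keys))
```

Here one Kubernetes service selects two pods whose env labels differ, so the mapping yields two logical services for it.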

Single logical service to multiple Kubernetes services

Multiple Kubernetes services can be related to a single logical service when those Kubernetes services are destroyed and recreated over time. For example, if the Kubernetes service shop-service-a, which generated calls, is replaced over time by shop-service-b, which also generated calls, both services are displayed on the logical service dashboard when the selected time period overlaps the time in which the calls were generated.

Sensor Data Collection

Instana collects information about the Kubernetes Cluster, Nodes, Namespaces, Deployments, K8s Services, and Pods.

  • Cluster

    • KPIs
      • Node Count
      • Pods Allocation (Allocated Pods / Pods Capacity ratio)
      • CPU Requests Allocation (CPU Requests / CPU Capacity ratio)
      • CPU Limits Allocation (CPU Limits / CPU Capacity ratio)
      • Memory Requests Allocation (Memory Requests / Memory Capacity ratio)
      • Memory Limits Allocation (Memory Limits / Memory Capacity ratio)
    • CPU Resources
      • CPU Requests (aggregated cpu requests of all running containers)
      • CPU Limits (aggregated cpu limits of all running containers)
      • CPU Capacity (aggregated cpu capacity of all nodes)
    • Memory Resources
      • Memory Requests (aggregated memory requests of all running containers)
      • Memory Limits (aggregated memory limits of all running containers)
      • Memory Capacity (aggregated memory capacity of all nodes)
    • Pods (aggregated on whole cluster)
      • Running Pods
      • Pending Pods
      • Allocated Pods
      • Pods Capacity
    • Replicas (aggregated from all deployments)
      • Available Replicas
      • Desired Replicas
    • Node list with KPIs
    • Deployment list with KPIs
    • Component Statuses
  • Node

    • KPIs
      • Pods Allocation (Allocated Pods / Pods Capacity ratio)
      • CPU Requests Allocation (CPU Requests / CPU Capacity ratio)
      • CPU Limits Allocation (CPU Limits / CPU Capacity ratio)
      • Memory Requests Allocation (Memory Requests / Memory Capacity ratio)
      • Memory Limits Allocation (Memory Limits / Memory Capacity ratio)
    • CPU Resources
      • CPU Requests (aggregated cpu requests of all running containers on this node)
      • CPU Limits (aggregated cpu limits of all running containers on this node)
      • CPU Capacity
    • Memory Resources
      • Memory Requests (aggregated memory requests of all running containers on this node)
      • Memory Limits (aggregated memory limits of all running containers on this node)
      • Memory Capacity
    • Pods Allocation
      • Allocated Pods (running pods on this node)
      • Pods Capacity
    • Conditions
    • Labels
    • Pods list
  • Namespace

    • KPIs
      • CPU Requests Allocation (CPU Requests / CPU Capacity ratio)
      • CPU Limits Allocation (CPU Limits / CPU Capacity ratio)
      • Memory Requests Allocation (Memory Requests / Memory Capacity ratio)
      • Memory Limits Allocation (Memory Limits / Memory Capacity ratio)
      • Pods Allocation (Allocated Pods / Pods Capacity ratio)
    • Status
    • Deployments list
    • Deployment configs list
  • ResourceQuota

    • Hard & Used
      • CPU Requests
      • CPU Limits
      • Memory Requests
      • Memory Limits
      • Pods
  • Deployment

    • Conditions
    • Labels
    • CPU Resources
      • CPU Requests (aggregated cpu requests of all running containers of this deployment)
      • CPU Limits (aggregated cpu limits of all running containers of this deployment)
    • Memory Resources
      • Memory Requests (aggregated memory requests of all running containers of this deployment)
      • Memory Limits (aggregated memory limits of all running containers of this deployment)
    • Pods
      • Available vs Desired Pods
      • Pending vs Unscheduled vs Unready Pods
    • Pending phase duration (in most cases can be interpreted as rollout duration)
  • K8s Service

    • Type
    • Location
    • Cluster IP & External IP
    • CPU Requests, Limits
    • Memory Requests, Limits
    • Endpoints List
    • Ports List
  • Pod

    • KPIs
      • Phase
      • Restarts (aggregated on all containers of this pod)
      • CPU Requests (aggregated on all containers of this pod)
      • CPU Limits (aggregated on all containers of this pod)
      • Memory Requests (aggregated on all containers of this pod)
      • Memory Limits (aggregated on all containers of this pod)
    • Conditions
    • Labels
    • Container list (State, Restarts)
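
The allocation KPIs listed above are plain ratios of aggregated values. The following sketch shows how the cluster-level CPU Requests Allocation could be derived from per-node data. The node dictionaries and field names are hypothetical, and the sensor's actual data model is not described in this document.

```python
from typing import Dict, List, Optional


def allocation_ratio(requested: float, capacity: float) -> Optional[float]:
    """Return requested / capacity, or None when no capacity is known."""
    if capacity <= 0:
        return None
    return requested / capacity


# Hypothetical cluster: per-node CPU capacity and the CPU requests (in cores)
# of the containers running on each node.
nodes: List[Dict[str, object]] = [
    {"name": "node-a", "cpu_capacity": 4.0, "cpu_requests": [0.5, 1.0]},
    {"name": "node-b", "cpu_capacity": 4.0, "cpu_requests": [1.5]},
]

# Node level: requests of the containers on that node versus the node's capacity.
node_allocation = {
    n["name"]: allocation_ratio(sum(n["cpu_requests"]), n["cpu_capacity"])
    for n in nodes
}

# Cluster level: requests of all running containers versus the capacity of all nodes.
cluster_requests = sum(sum(n["cpu_requests"]) for n in nodes)
cluster_capacity = sum(n["cpu_capacity"] for n in nodes)
cluster_allocation = allocation_ratio(cluster_requests, cluster_capacity)

print(node_allocation, cluster_allocation)
```

The limits, memory, and pod allocation KPIs follow the same pattern with a different numerator and denominator.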

Health Rules

Built-in

Several built-in health rules trigger an issue for Kubernetes entities:

  • Cluster

    • Kubernetes reports that a master component (api-server, scheduler, or controller manager) is unhealthy. Note that, due to a bug in Kubernetes, component health is not always reported reliably. Instana tries to filter out these unreliable reports so that they do not cause an alert and only show up on the Cluster detail page.
  • Node

    • Requested CPU is approaching max capacity (requested CPU / CPU capacity ratio is greater than 80%).
    • Requested Memory is approaching max capacity (requested memory / memory capacity ratio is greater than 80%).
    • Allocated pods are approaching maximum capacity (allocated pods / pods capacity ratio is greater than 80%). For a node, pods in the phases ‘Running’ and ‘Unknown’ are counted as allocated. See Kubernetes docs for details on node capacity.
    • Node reports a condition other than Ready for more than one minute. For a node, this covers all conditions besides the Ready condition. See Kubernetes docs for details on all node conditions.
  • Namespace

    • Requested CPU is approaching max capacity (requested CPU / CPU capacity ratio is greater than 80%).
    • Requested Memory is approaching max capacity (requested memory / memory capacity ratio is greater than 80%).
    • Allocated pods are approaching maximum capacity (allocated pods / pods capacity ratio is greater than 80%). For a namespace, pods in the phases ‘Pending’, ‘Running’, and ‘Unknown’ are counted as allocated. The namespace capacity values are based on ResourceQuotas, which can be set per namespace. See Kubernetes docs for details.
  • Deployment

    • Available replicas less than desired replicas.
  • Pod

    • A pod is not ready for more than one minute, and the reason is not that it has completed (PodCondition=Ready, Status=False, Reason != PodCompleted). See Kubernetes docs for details on all pod conditions.
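
The thresholds and phase rules in the list above can be expressed as simple predicates. This is a hedged sketch of the rule logic as described, not Instana's rule engine: all function and constant names are hypothetical, and the inputs are simplified to plain values.

```python
from typing import Iterable, Set

# Phases counted as "allocated", as stated in the rules above.
NODE_ALLOCATED_PHASES = {"Running", "Unknown"}
NAMESPACE_ALLOCATED_PHASES = {"Pending", "Running", "Unknown"}

CAPACITY_THRESHOLD = 0.8   # "greater than 80%"
NOT_READY_SECONDS = 60     # "more than one minute"


def allocated_pods(phases: Iterable[str], counted: Set[str]) -> int:
    """Count the pods whose phase is treated as allocated."""
    return sum(1 for phase in phases if phase in counted)


def approaching_capacity(used: float, capacity: float) -> bool:
    """True when used / capacity is strictly greater than the 80% threshold."""
    return capacity > 0 and used / capacity > CAPACITY_THRESHOLD


def deployment_degraded(available: int, desired: int) -> bool:
    """True when fewer replicas are available than desired."""
    return available < desired


def pod_not_ready(ready_status: str, reason: str, seconds_in_state: float) -> bool:
    """True for Ready=False longer than a minute, unless the pod completed."""
    return (
        ready_status == "False"
        and reason != "PodCompleted"
        and seconds_in_state > NOT_READY_SECONDS
    )


phases = ["Running"] * 8 + ["Pending", "Succeeded"]
on_node = allocated_pods(phases, NODE_ALLOCATED_PHASES)
in_namespace = allocated_pods(phases, NAMESPACE_ALLOCATED_PHASES)
print(on_node, approaching_capacity(on_node, 10))
print(in_namespace, approaching_capacity(in_namespace, 10))
```

With a capacity of 10 pods, the same set of pods stays at exactly 80% for the node, which does not trigger the rule, but reaches 90% for the namespace because the pending pod is counted there.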

Custom

In addition to the built-in rules, you can create custom rules on metrics of a cluster, namespace, deployment, and pod. For example, if the threshold for node capacity warnings is too high, you can disable the built-in warnings and create a custom rule with a lower threshold. See Events & Incidents configuration for details.

Service Meshes

Istio

The default installation should work out of the box with Instana. If, however, you deploy Istio with a default deny policy (mode: REGISTRY_ONLY), you need to enable Instana’s service mesh bypass to work effectively with this configuration. Enable it with the following agent configuration:

serviceMesh:
  enableServiceMeshBypass: true

Notes

Using a GKE-provided containerd node image

Instana does not currently support monitoring GKE-provided containerd-based node images (cos_containerd or ubuntu_containerd).