Anatomy of Kubeshark

Distributed packet capture with minimal footprint, built for large scale production clusters.

Kubeshark supports two main deployment use-cases:

  1. On-demand lightweight traffic investigation using a CLI, by anyone with kubectl access.
  2. Long living deployment, using a helm chart, in support of multiple use-cases (e.g. collaborative debugging, network monitoring, telemetry and forensics).

Kubeshark has no prerequisites such as a specific CNI, a service mesh or code changes. It doesn’t use a proxy or a sidecar and requires no architecture changes to function. With the CLI option, you can start investigating your K8s traffic within a few minutes.

Kubeshark consists of four software components that work together harmoniously:

CLI

The CLI is a binary distribution of the Kubeshark client, written in Go. It is an optional component that offers a lightweight, on-demand way to use Kubeshark without leaving any permanent footprint. It communicates directly with the Kubernetes API to deploy the right containers at the right place at the right time.

Here are a few examples of how you can use the Kubeshark CLI to start capturing traffic in your K8s cluster:

kubeshark tap

kubeshark tap                                       - tap all pods in all namespaces
kubeshark tap -n sock-shop "(catalo*|front-end*)"   - tap only pods that match the regex in a certain namespace
kubeshark tap --proxy-host 0.0.0.0                  - make the dashboard port accessible from outside localhost

For more options on how to use the tap command, refer to the tap command section.
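
The pod argument in the second example is a regular expression, not a shell glob, so `*` repeats the preceding character instead of matching arbitrary text. The sketch below illustrates the difference using Python's `re` module. It assumes unanchored matching against pod names, which this page does not state, and the pod names are made up for illustration.

```python
import re

# Pattern copied from the tap example above.
POD_REGEX = re.compile(r"(catalo*|front-end*)")

# Hypothetical pod names, for illustration only.
pods = [
    "catalogue-6d8f7c9b5-x2k4p",
    "catalogue-db-0",
    "front-end-5c7b9d6f4-abcde",
    "carts-7f9c8d6b5-qwert",
    "orders-0",
]


def matching_pods(pattern, names):
    """Return the names in which the pattern matches anywhere (unanchored search)."""
    return [name for name in names if pattern.search(name)]


selected = matching_pods(POD_REGEX, pods)
print(selected)
```

Because `o*` also matches zero `o` characters, a pod whose name merely contains `catal` (for example a hypothetical `catalyst-0`) would be selected too under this assumption. A tighter pattern such as `^(catalogue|front-end)` avoids that.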

Additional kubeshark commands

kubeshark proxy                                       - re-establish a connection to the dashboard
kubeshark clean                                       - clean all kubeshark resources

Source code: kubeshark/kubeshark

The Dashboard

Kubeshark’s dashboard is a React app that communicates with the Hub via WebSocket and displays the captured traffic in a scrolling feed.

Kubeshark UI

Source code: kubeshark/front

Pod name: kubeshark-front

NOTE: Read more in the dashboard section.

Hub

The Hub is a pod that acts as a gateway to the Workers. It hosts an HTTP server and serves these purposes:

  • Accepts WebSocket connections, each with an accompanying filter.
  • Establishes new WebSocket connections to the Workers.
  • Receives the dissected traffic from the Workers.
  • Streams the results back to the requester.
  • Configures Worker states through HTTP calls.
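
The gateway role listed above amounts to a fan-out and fan-in pattern. The following is a minimal in-process sketch of that pattern, not Kubeshark's implementation: the `Record`, `Worker` and `Hub` classes are hypothetical, plain iterators stand in for the WebSocket connections, and merging results by timestamp is an assumption.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class Record:
    """A dissected traffic item (hypothetical shape)."""
    timestamp: float
    node: str
    summary: str


Filter = Callable[[Record], bool]


class Worker:
    """Stand-in for a per-node Worker holding time-ordered records."""

    def __init__(self, node: str, records: Iterable[Record]):
        self.node = node
        self._records = sorted(records, key=lambda r: r.timestamp)

    def stream(self, flt: Filter) -> Iterator[Record]:
        # The filter is applied on the Worker side, so only matches reach the Hub.
        return (r for r in self._records if flt(r))


class Hub:
    """Stand-in for the gateway: one query in, one merged stream out."""

    def __init__(self, workers: List[Worker]):
        self.workers = workers

    def query(self, flt: Filter) -> Iterator[Record]:
        # Fan out the filter to every Worker, then merge the sorted streams.
        streams = [w.stream(flt) for w in self.workers]
        return heapq.merge(*streams, key=lambda r: r.timestamp)


hub = Hub([
    Worker("node-a", [
        Record(1.0, "node-a", "GET /catalogue"),
        Record(3.0, "node-a", "GET /health"),
    ]),
    Worker("node-b", [Record(2.0, "node-b", "POST /orders")]),
])

results = list(hub.query(lambda r: "health" not in r.summary))
for r in results:
    print(r.timestamp, r.node, r.summary)
```

The requester only ever talks to the `Hub` object, which mirrors the way the dashboard talks to the Hub rather than to individual Workers.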

Source code: kubeshark/hub

Pod name: kubeshark-hub

Worker

The Worker is deployed into your cluster as a DaemonSet to ensure that every node in your cluster is covered by Kubeshark.

The Worker contains the implementations of a network sniffer and a kernel tracer. It captures packets from all network interfaces, reassembles the TCP streams and, if they are dissectable, stores them as PCAP files. Workers transmit the collected traffic to the Hub via WebSocket connections.

Kubeshark stores raw packets and dissects them on demand upon filtering.
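
The "store raw, dissect on demand" idea can be sketched as follows. This is an illustrative toy, not the Worker's code: the `RawStore` class is hypothetical, it keeps payloads in memory instead of PCAP files, and its dissector only understands the request line of an HTTP/1.x request.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]


class RawStore:
    """Keeps raw payloads and parses them only when a query arrives."""

    def __init__(self) -> None:
        self._payloads: List[bytes] = []
        self.dissect_calls = 0  # counts how often the parser actually runs

    def capture(self, payload: bytes) -> None:
        # The capture path only appends bytes; no parsing happens here.
        self._payloads.append(payload)

    def _dissect(self, payload: bytes) -> Optional[Item]:
        """Toy dissector: parse an HTTP/1.x request line, or return None."""
        self.dissect_calls += 1
        line = payload.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
        parts = line.split(" ")
        if len(parts) != 3 or not parts[2].startswith("HTTP/"):
            return None  # not dissectable by this toy parser
        return {"method": parts[0], "path": parts[1], "version": parts[2]}

    def query(self, flt: Callable[[Item], bool]) -> List[Item]:
        # Dissection is deferred until a filter is supplied.
        results = []
        for payload in self._payloads:
            item = self._dissect(payload)
            if item is not None and flt(item):
                results.append(item)
        return results


store = RawStore()
store.capture(b"GET /catalogue HTTP/1.1\r\nHost: example\r\n\r\n")
store.capture(b"POST /orders HTTP/1.1\r\nHost: example\r\n\r\n")
store.capture(b"\x16\x03\x01\x02\x00")  # opaque bytes the toy parser rejects

calls_before_query = store.dissect_calls
hits = store.query(lambda item: item["method"] == "GET")
print(calls_before_query, hits)
```

The capture path stays cheap because parsing cost is paid only when someone asks a question, and only matching items are returned to the caller.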

The Worker can also be used on its own as a network sniffer on your computer, without requiring a Kubernetes cluster.

Source code: kubeshark/worker

Pod name: kubeshark-worker-daemon-set-<id>

Distributed PCAP-based Storage

Kubeshark uses distributed PCAP-based storage, where each Worker stores the captured L4 streams in the root file system of its node.
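
For readers unfamiliar with the format, the sketch below writes and reads a classic little-endian PCAP file using only Python's standard library. It illustrates the file format in general, not Kubeshark's storage layout: the file name, the directory and the packet bytes are placeholders chosen for the example.

```python
import struct
import tempfile
from pathlib import Path

PCAP_MAGIC = 0xA1B2C3D4      # classic pcap with microsecond timestamps
LINKTYPE_ETHERNET = 1


def write_pcap(path, packets, snaplen=65535):
    """Write (ts_sec, ts_usec, raw_bytes) tuples as a classic pcap file."""
    with open(path, "wb") as f:
        # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, linktype.
        f.write(struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0,
                            snaplen, LINKTYPE_ETHERNET))
        for ts_sec, ts_usec, data in packets:
            captured = data[:snaplen]
            # Record header: timestamp, captured length, original length.
            f.write(struct.pack("<IIII", ts_sec, ts_usec,
                                len(captured), len(data)))
            f.write(captured)


def read_pcap(path):
    """Read back (ts_sec, ts_usec, raw_bytes) tuples from a classic pcap file."""
    packets = []
    with open(path, "rb") as f:
        header = f.read(24)
        if len(header) < 24 or struct.unpack("<I", header[:4])[0] != PCAP_MAGIC:
            raise ValueError("not a little-endian classic pcap file")
        while True:
            rec = f.read(16)
            if len(rec) < 16:
                break
            ts_sec, ts_usec, incl_len, _orig_len = struct.unpack("<IIII", rec)
            packets.append((ts_sec, ts_usec, f.read(incl_len)))
    return packets


# Placeholder frames: 14 zero bytes where an Ethernet header would be.
original = [
    (1700000000, 0, b"\x00" * 14 + b"first"),
    (1700000000, 250000, b"\x00" * 14 + b"second"),
]

with tempfile.TemporaryDirectory() as node_dir:
    stream_file = Path(node_dir) / "stream-000001.pcap"  # hypothetical name
    write_pcap(stream_file, original)
    restored = read_pcap(stream_file)
    file_size = stream_file.stat().st_size

print(restored == original, file_size)
```

Files written this way carry a standard header, so common PCAP tools can open them without any Kubeshark-specific tooling.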

Low Network Overhead

To reduce network overhead, captured traffic stays on the nodes, and only the requested fraction of it is sent over the network.