Advanced Kubernetes Interview Questions (2025 Edition)
For senior Kubernetes roles, interviewers expect deep technical knowledge, architectural understanding, and the ability to solve complex problems. This guide covers advanced concepts that demonstrate expertise in Kubernetes internals, troubleshooting, and system design.
Question: How does the Kubernetes scheduler decide where to place Pods?
Expected Answer: Pod scheduling involves multiple components working together to place Pods on appropriate nodes based on various constraints and requirements.
Scheduling Process:
- Filtering (formerly called predicates): Hard requirements a node must meet (resource availability, node selectors)
- Scoring (formerly called priorities): Soft preferences used to rank the feasible nodes (resource distribution, affinity)
- Binding: Final placement decision and Pod-to-node binding
Key Factors:
- Resource Requests/Limits: CPU and memory requirements
- Node Selectors: Hard node affinity rules
- Node Affinity: Soft node preference rules
- Pod Affinity/Anti-affinity: Pod placement relative to other Pods
- Taints and Tolerations: Node isolation and Pod acceptance
- Resource Distribution: Spread Pods across nodes and zones (for example with topology spread constraints) for better utilization and availability
Example Configuration:
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  nodeSelector:
    disk: ssd
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-west-2a
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - database
        topologyKey: kubernetes.io/hostname
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "database"
    effect: "NoSchedule"
Example Response: “The scheduler uses a two-phase approach: filtering and scoring, historically called predicates and priorities. Filtering removes nodes that cannot run the Pod (insufficient resources, failed node selectors), and scoring ranks the remaining nodes on factors like resource distribution and affinity rules. The scheduler also considers taints and tolerations for node isolation, and can use custom schedulers for specialized workloads. For example, you might use pod anti-affinity so database replicas avoid being scheduled on the same node for high availability.”
Question: What is the difference between init containers and sidecar containers?
Expected Answer: Both patterns extend Pod functionality, but they serve different purposes and have different execution models.
Init Containers:
- Purpose: Setup and initialization tasks that must complete before the main container starts
- Execution: Run sequentially, all must succeed before main container starts
- Lifecycle: Run once during Pod startup, then terminate
- Use Cases: Database migrations, configuration downloads, dependency checks
- Restart Policy: Restarted on failure according to the Pod's restartPolicy (with restartPolicy: Never, a failed init container fails the whole Pod)
Sidecar Containers:
- Purpose: Support and enhance the main application container
- Execution: Run alongside the main container throughout its lifecycle
- Lifecycle: Start with main container and run until Pod termination
- Use Cases: Logging, monitoring, caching, security proxies
- Restart Policy: Governed by the Pod's restartPolicy, the same as the main container
Init Container Example:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  initContainers:
  - name: init-db
    image: postgres:13
    command: ['sh', '-c', 'until pg_isready -h db-service; do echo waiting for database; sleep 2; done;']
  - name: init-config
    image: busybox
    command: ['sh', '-c', 'wget -O /config/app.conf https://config-server/app.conf']
    volumeMounts:
    - name: config-volume
      mountPath: /config
  containers:
  - name: app
    image: my-app:latest
    volumeMounts:
    - name: config-volume
      mountPath: /app/config
  volumes:
  - name: config-volume
    emptyDir: {}
Sidecar Example:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    ports:
    - containerPort: 8080
  - name: sidecar
    image: fluentd:latest
    command: ['fluentd', '-c', '/fluentd/etc/fluent.conf']
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  volumes:
  - name: log-volume
    emptyDir: {}
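On Kubernetes 1.29 and newer, where the SidecarContainers feature gate is on by default, a sidecar can instead be declared natively: an init container with restartPolicy: Always starts before the main container, keeps running alongside it, and is terminated after it. A sketch of the same Fluentd sidecar in that form (container names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  initContainers:
  - name: log-shipper
    image: fluentd:latest
    restartPolicy: Always   # marks this init container as a native sidecar
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  containers:
  - name: app
    image: my-app:latest
  volumes:
  - name: log-volume
    emptyDir: {}
```

Native sidecars also interact correctly with Job completion, which the plain multi-container pattern does not.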
Example Response: “Init containers run before your main application starts and are perfect for setup tasks like database migrations or downloading configuration. They run sequentially and all must succeed. Sidecar containers run alongside your main application throughout its lifecycle, handling cross-cutting concerns like logging, monitoring, or caching. For example, you might use an init container to wait for a database to be ready, then use a sidecar container for log aggregation that runs for the entire Pod lifecycle.”
Question: How do network policies secure Pod-to-Pod traffic?
Expected Answer: Network policies control Pod-to-Pod communication within the cluster, providing fine-grained network security controls.
Network Policy Components:
- Pod Selectors: Define which Pods the policy applies to
- Ingress Rules: Control incoming traffic to Pods
- Egress Rules: Control outgoing traffic from Pods
- Policy Types: Ingress, Egress, or both
- Default Behavior: A Pod not selected by any NetworkPolicy accepts all traffic; adding a default-deny policy reverses this
Implementation Requirements:
- CNI Plugin: Must support NetworkPolicy (e.g., Calico, Cilium, Weave Net)
- Policy Controller: Enforces the policies at the network level
- Namespace Isolation: Can be applied per namespace
Example Network Policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    - namespaceSelector:
        matchLabels:
          name: monitoring
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
Default Deny Policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
Example Response: “Network policies provide microsegmentation at the Pod level. You can control which Pods can communicate with each other based on labels and namespaces. For example, you might allow only frontend Pods to talk to API Pods, and only API Pods to talk to database Pods. The key is that you need a CNI plugin that supports NetworkPolicy, and you should start with a default-deny policy and explicitly allow required traffic. This follows the principle of least privilege.”
Question: What are Pod Disruption Budgets and how do they protect availability?
Expected Answer: Pod Disruption Budgets (PDBs) ensure application availability during voluntary disruptions like node maintenance, cluster upgrades, or scaling operations.
PDB Concepts:
- Voluntary Disruptions: Planned operations that can cause Pod termination
- Involuntary Disruptions: Unplanned events like node failures
- Min Available: Minimum number of Pods that must be available
- Max Unavailable: Maximum number of Pods that can be unavailable
- Budget Enforcement: Prevents operations that would violate the budget
PDB Configuration:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
Alternative Configuration:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
Use Cases:
- High Availability: Ensure minimum number of replicas during maintenance
- Rolling Updates: Prevent too many Pods from being unavailable
- Node Draining: Control Pod eviction during node maintenance
- Cluster Scaling: Manage Pod termination during scale-down operations
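In practice a PDB shows up when draining a node. The commands below are illustrative (node and resource names are placeholders); kubectl drain goes through the eviction API, so it waits and retries rather than violate the budget:

```shell
# Inspect the budget; the ALLOWED DISRUPTIONS column shows how many
# Pods the eviction API may evict right now without violating it.
kubectl get pdb my-app-pdb

# Drain evicts Pods via the eviction API and retries any evictions
# that the PDB currently blocks.
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
```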
Example Response: “PDBs protect your applications during planned operations like node maintenance or cluster upgrades. For example, if you have 5 replicas of your application, you might set a PDB to ensure at least 3 are always available. This prevents operations that would leave you with fewer than 3 replicas. PDBs work with the eviction API, so when you drain a node or perform rolling updates, Kubernetes respects the budget and won’t terminate Pods if it would violate the PDB.”
Question: How does persistent storage work in Kubernetes (PVs, PVCs, StorageClasses)?
Expected Answer: Kubernetes provides a flexible storage abstraction that supports various storage backends and access patterns.
Storage Architecture:
- PersistentVolume (PV): Cluster-wide storage resource
- PersistentVolumeClaim (PVC): User’s request for storage
- StorageClass: Dynamic provisioning configuration
- Volume Plugin: Interface to storage backend
Volume Types:
1. Persistent Volumes:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  hostPath:
    path: /mnt/data
2. Storage Classes:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com   # the in-tree kubernetes.io/aws-ebs provisioner has been removed
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
3. PVC Example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
Volume Access Modes:
- ReadWriteOnce (RWO): Single node read/write
- ReadOnlyMany (ROX): Multiple nodes read-only
- ReadWriteMany (RWX): Multiple nodes read/write
- ReadWriteOncePod (RWOP): Single Pod read/write (stable since v1.29)
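A Pod consumes a claim by name. A minimal sketch that mounts the my-pvc claim defined above (image and mount path are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    volumeMounts:
    - name: data
      mountPath: /var/lib/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc   # binds to the PVC, which binds to a PV
```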
Example Response: “Kubernetes abstracts storage through PVs and PVCs. PVs represent actual storage resources in the cluster, while PVCs are requests for storage by users. StorageClasses enable dynamic provisioning: when you create a PVC, Kubernetes automatically creates a PV that matches your requirements. Access modes determine how the volume can be mounted: RWO for single-node databases, RWX for shared file systems. The key is choosing the right storage class and access mode for your workload.”
Question: What are Custom Resource Definitions (CRDs) and when would you use them?
Expected Answer: CRDs extend the Kubernetes API to support custom resources, enabling domain-specific abstractions and operators.
CRD Components:
- Custom Resource: The new resource type you’re defining
- Custom Controller: Logic that manages the custom resource
- API Server: Handles CRUD operations for custom resources
- Validation: Schema validation for custom resources
CRD Example:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              databaseName:
                type: string
              replicas:
                type: integer
                minimum: 1
                maximum: 5
              storage:
                type: object
                properties:
                  size:
                    type: string
                    pattern: '^[0-9]+Gi$'
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
    shortNames:
    - db
Custom Resource Instance:
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-database
spec:
  databaseName: myapp
  replicas: 3
  storage:
    size: 100Gi
Use Cases:
- Operators: Automate complex application lifecycle management
- Domain-Specific Abstractions: Create resources that match your business domain
- Integration: Bridge Kubernetes with external systems
- Automation: Encode operational knowledge in custom controllers
Example Response: “CRDs let you extend Kubernetes with your own resource types. For example, instead of managing database deployments manually, you could create a Database CRD that represents a database instance. A custom controller would watch for Database resources and automatically create the necessary StatefulSets, Services, and PersistentVolumeClaims. This encapsulates operational knowledge and provides a higher-level abstraction that’s specific to your domain.”
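The heart of such a controller is a reconcile step that maps the custom resource to the built-in resources it implies. A minimal, hypothetical sketch in Python, pure logic with no API calls: the Database shape matches the CRD above, while everything else (labels, the postgres image choice) is an assumption for illustration.

```python
def desired_statefulset(database: dict) -> dict:
    """Build the StatefulSet manifest a Database custom resource implies."""
    name = database["metadata"]["name"]
    spec = database["spec"]
    # The label lets the controller find the resources it manages later.
    labels = {"example.com/database": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": spec["replicas"],
            "serviceName": name,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "db",
                        # Assumed engine; a real CRD would expose this as a field.
                        "image": "postgres:16",
                        "env": [{"name": "POSTGRES_DB",
                                 "value": spec["databaseName"]}],
                    }],
                },
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": spec["storage"]["size"]}},
                },
            }],
        },
    }

# The Database instance from the example above:
cr = {
    "apiVersion": "example.com/v1",
    "kind": "Database",
    "metadata": {"name": "my-database"},
    "spec": {"databaseName": "myapp", "replicas": 3,
             "storage": {"size": "100Gi"}},
}
sts = desired_statefulset(cr)
print(sts["spec"]["replicas"])  # 3
```

A real controller would diff this desired state against what exists in the cluster and create or patch accordingly; frameworks like Kopf or Operator SDK wrap exactly this loop.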
Question: How do security contexts and Pod Security Standards secure workloads?
Expected Answer: Security contexts control the security settings for Pods and containers, including user/group IDs, capabilities, and privilege levels.
Security Context Levels:
- Pod Security Context: Applies to all containers in the Pod
- Container Security Context: Applies to specific containers
- Pod Security Standards: Privileged, baseline, and restricted profiles enforced per namespace by the Pod Security Admission controller
Security Context Example:
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    supplementalGroups: [1000, 2000]
  containers:
  - name: app
    image: my-app:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
      seccompProfile:
        type: RuntimeDefault
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
Pod Security Standards (applied via Namespace labels; the pod-security.kubernetes.io labels belong on the Namespace, not on individual Pods):
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Pods created in this namespace must then satisfy the restricted profile: run as non-root, disallow privilege escalation, drop all capabilities, and set a RuntimeDefault (or Localhost) seccomp profile, as in the secure-pod example above.
Security Implications:
- Privilege Escalation: Controls whether containers can gain additional privileges
- Root Access: Running as non-root reduces attack surface
- Capabilities: Fine-grained permission control
- Read-only Filesystem: Prevents runtime modifications
- Seccomp Profiles: System call filtering
Example Response: “Security contexts control how containers run in terms of user identity, capabilities, and privilege levels. Running as non-root and with read-only filesystems significantly reduces the attack surface. Pod Security Standards provide cluster-wide policies that enforce security best practices. For example, the restricted policy requires non-root execution, drops all capabilities, and uses read-only filesystems. This follows the principle of least privilege.”
Question: How do resource quotas and limit ranges control resource consumption?
Expected Answer: Resource quotas and limit ranges provide cluster and namespace-level resource management to prevent resource exhaustion and ensure fair resource distribution.
Resource Quotas:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: production
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "10"
    services: "5"
    persistentvolumeclaims: "10"
Limit Ranges:
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
  namespace: production
spec:
  limits:
  - type: Container
    default:
      memory: 512Mi
      cpu: 500m
    defaultRequest:
      memory: 256Mi
      cpu: 250m
    max:
      memory: 1Gi
      cpu: 1000m
    min:
      memory: 128Mi
      cpu: 100m
  - type: Pod
    max:
      memory: 2Gi
      cpu: 2000m
Quota Types:
- Compute Resources: CPU and memory requests/limits
- Storage Resources: Persistent volume claims
- Object Counts: Number of resources (Pods, Services, etc.)
- Extended Resources: Custom resources like GPUs
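Extended-resource quotas use the same mechanism. A sketch assuming the NVIDIA device plugin is installed and advertises nvidia.com/gpu on the nodes:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota
  namespace: production
spec:
  hard:
    requests.nvidia.com/gpu: "4"   # at most 4 GPUs requested in this namespace
```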
Example Response: “Resource quotas set hard limits on resource consumption within a namespace, preventing any single namespace from consuming all cluster resources. Limit ranges provide defaults and constraints for resource requests and limits, ensuring Pods have reasonable resource specifications. For example, you might set a quota limiting a namespace to 4 CPU cores and 8GB memory, with limit ranges ensuring each container requests at least 100m CPU and 128Mi memory.”
Question: What are admission controllers and what are they used for?
Expected Answer: Admission controllers intercept requests to the Kubernetes API server and can modify or reject requests based on policies and rules.
Admission Controller Types:
- Validating: Check requests and can reject them
- Mutating: Modify requests before persistence
- Webhooks: HTTP callbacks to external services (validating or mutating) for custom logic
Common Admission Controllers:
- NodeRestriction: Limits node modifications
- ResourceQuota: Enforces resource quotas
- LimitRanger: Applies limit ranges
- PodSecurityPolicy: Enforced security policies (deprecated and removed in v1.25 in favor of Pod Security Admission)
- ValidatingAdmissionWebhook: Custom validation logic
Webhook Example:
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy.example.com
webhooks:
- name: pod-policy.example.com
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
    scope: "Namespaced"
  clientConfig:
    service:
      namespace: default
      name: pod-policy-webhook
      path: "/validate"
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 5
Use Cases:
- Security Enforcement: Validate security policies
- Resource Management: Enforce resource limits and quotas
- Compliance: Ensure regulatory compliance
- Custom Business Logic: Implement domain-specific rules
Example Response: “Admission controllers act as gatekeepers for the Kubernetes API. They can validate requests before they’re persisted, mutate requests to add defaults or labels, or reject requests that violate policies. For example, a validating webhook might check that all Pods have security contexts set, or a mutating webhook might automatically add labels based on namespace. This provides a powerful way to enforce policies and implement custom business logic.”
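The decision logic behind such a validating webhook is small. A hypothetical sketch in Python showing only the AdmissionReview handling; the TLS-serving HTTP server that the clientConfig above would point at is omitted, and the policy (every container must set a securityContext) is just the example from the response:

```python
def review(admission_review: dict) -> dict:
    """Reject Pods whose containers lack a securityContext."""
    request = admission_review["request"]
    pod = request["object"]
    # Collect container names that have no securityContext at all.
    missing = [
        c["name"] for c in pod["spec"].get("containers", [])
        if "securityContext" not in c
    ]
    allowed = not missing
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {
            "message": f"containers missing securityContext: {', '.join(missing)}"
        }
    # The API server expects the response wrapped in an AdmissionReview.
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview",
            "response": response}

# A Pod whose container has no securityContext is rejected:
ar = {"request": {"uid": "123", "object": {
    "spec": {"containers": [{"name": "app", "image": "my-app:latest"}]}}}}
print(review(ar)["response"]["allowed"])  # False
```

A production webhook would add schema defense (Pods with initContainers, ephemeral containers) and return the uid exactly as received; frameworks and policy engines like Kyverno or Gatekeeper package this pattern.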
Question: How do you isolate tenants in a multi-tenant Kubernetes cluster?
Expected Answer: Kubernetes provides multiple layers of isolation for multi-tenant environments, from namespace separation to cluster-level isolation.
Namespace Isolation:
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  labels:
    tenant: tenant-a
    environment: production
Resource Quotas per Tenant:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "20"
Network Policies for Isolation:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
RBAC for Tenant Isolation:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: tenant-a
  name: tenant-admin
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "create", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-admin-binding
  namespace: tenant-a
subjects:
- kind: User
  name: tenant-a-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-admin
  apiGroup: rbac.authorization.k8s.io
Multi-tenancy Models:
- Namespace-based: Single cluster, multiple namespaces
- Cluster-based: Separate clusters per tenant
- Virtual Clusters: Kubernetes-in-Kubernetes approach
Example Response: “Multi-tenancy in Kubernetes typically uses namespaces as the primary isolation boundary. Each tenant gets their own namespace with resource quotas, network policies, and RBAC controls. Network policies prevent cross-tenant communication, while resource quotas ensure fair resource distribution. For stronger isolation, you might use separate clusters or virtual clusters. The key is balancing isolation requirements with operational efficiency.”
Be prepared to discuss:
- Kubernetes internals and architecture
- Performance optimization and troubleshooting
- Security best practices and compliance
- Scalability and high availability patterns
Expect questions about:
- Complex troubleshooting scenarios
- Performance bottlenecks and solutions
- Security incidents and responses
- Large-scale cluster management
Demonstrate understanding of:
- System design principles
- Trade-offs between different approaches
- Scalability considerations
- Operational complexity management
Senior roles often require:
- Team leadership experience
- Mentoring junior engineers
- Process improvement
- Strategic thinking
Advanced Kubernetes interviews test not just technical knowledge, but also architectural thinking, problem-solving abilities, and operational experience. Success requires deep understanding of Kubernetes internals, practical experience with complex scenarios, and the ability to design and implement robust solutions.
For candidates: Focus on demonstrating practical experience, architectural thinking, and the ability to solve complex problems. Be prepared to discuss real-world scenarios and trade-offs.
For interviewers: Look for candidates who can think architecturally, discuss trade-offs, and demonstrate deep technical understanding rather than just memorized knowledge.
Remember, advanced Kubernetes expertise comes from experience with complex, production environments. Focus on demonstrating practical problem-solving skills and the ability to design robust, scalable solutions.
For more information about advanced Kubernetes concepts and best practices, visit the official Kubernetes documentation and the Kubernetes.io advanced tutorials.