Troubleshooting Kubernetes services and networking for the CKA exam requires a systematic approach, tracing the packet flow from the source Pod to the destination. Issues generally fall into four categories: Pod Networking, Service Discovery (DNS), Service Configuration, and Network Policies.
1. **Pod Networking (CNI)**: First, ensure Pods are `Running` and have IP addresses (`kubectl get pods -o wide`). If Pods cannot communicate via IP, check the CNI plugin (e.g., Calico, Flannel, Weave). Verify the CNI pods are running (usually in the `kube-system` namespace, though some installs use their own, such as `calico-system`) and inspect their logs for errors.
2. **Service Discovery (DNS)**: If IP communication works but Service names fail, check CoreDNS. Verify the CoreDNS pods are running. Launch a temporary busybox pod to test resolution: `kubectl run test --image=busybox:1.28 --rm -it --restart=Never -- nslookup <service-name>`. If this fails, check `/etc/resolv.conf` inside the Pod: its `nameserver` entry should be the ClusterIP of the cluster DNS Service.
3. **Service Configuration & Endpoints**: If DNS resolves but the connection times out, check the Service definition. A common error is a mismatch between the Service `selector` and the Pod `labels`. Run `kubectl get endpoints <service-name>` to verify that the Service targets actual Pod IPs. If the endpoints list is empty, either the selector matches no Pods or the matching Pods are not Ready. Also, ensure `kube-proxy` is running on the nodes, as it manages the iptables/IPVS rules that route Service traffic.
4. **Network Policies**: If the configuration is correct but traffic is dropped, check for `NetworkPolicy` objects. By default, all traffic is allowed, but once a policy selects a Pod, only the Ingress or Egress traffic that some policy explicitly allows can reach or leave it.
Essential tools for this process include `nslookup`, `curl`, `nc` (netcat), `ip a`, and `kubectl describe`.
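The component checks in the list above can be sketched as a few shell helpers. This is a minimal sketch, assuming a kubeadm-style cluster in which CoreDNS carries the label `k8s-app=kube-dns` and kube-proxy carries `k8s-app=kube-proxy`; the function names are hypothetical, and CNI plugins use their own labels, so check those separately.

```shell
# Sketch: health checks for cluster-level networking components.
# Assumption: kubeadm-style labels (k8s-app=kube-dns, k8s-app=kube-proxy).

# List the pods behind a kube-system component, with node placement.
component_pods() {
  local label="$1"
  kubectl get pods -n kube-system -l "k8s-app=${label}" -o wide
}

# Show the most recent log lines from every pod of that component.
component_logs() {
  local label="$1"
  kubectl logs -n kube-system -l "k8s-app=${label}" --tail=20
}

# Print the DNS configuration a given pod actually received.
pod_resolv_conf() {
  local pod="$1" namespace="${2:-default}"
  kubectl exec -n "$namespace" "$pod" -- cat /etc/resolv.conf
}
```

For example, `component_pods kube-dns` followed by `component_logs kube-dns` covers the CoreDNS check, and `component_pods kube-proxy` shows whether every node has a running kube-proxy pod.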
Troubleshooting Services and Networking
Why it is Important: In the context of the CKA exam and real-world operations, networking is the backbone of the cluster. If services cannot discover each other or if external traffic cannot reach the application, the system is fundamentally broken. Troubleshooting networking is a high-weight domain in the CKA exam because it proves you understand the underlying architecture of how Kubernetes connects components.
What it is: Troubleshooting Services and Networking is the systematic process of diagnosing connectivity failures. This encompasses issues with CoreDNS (name resolution), Services (ClusterIP, NodePort), Ingress resources, Network Policies (firewalls), and the CNI Plugin (pod-to-pod communication).
How it works: Kubernetes networking relies on a chain of dependencies. When a request fails, verify each link in the chain:
1. **DNS**: Translates the Service name to a ClusterIP.
2. **Service Definition**: Uses selectors to find the backing Pods.
3. **Endpoints**: The list of ready Pod IPs that match the Service selector, maintained by the control plane.
4. **kube-proxy/CNI**: kube-proxy programs the routing rules (iptables/IPVS) that forward traffic from the Service IP to a Pod IP, while the CNI plugin provides the underlying Pod-to-Pod connectivity.
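To make the selector-to-endpoints link of this chain concrete, the sketch below reads a Service's selector, lists the Pods that carry those labels, and then prints the Endpoints object for comparison. It assumes a recent kubectl that prints the selector map as JSON through `-o jsonpath`; the helper names `selector_to_labels` and `service_chain` are hypothetical.

```shell
# Convert the JSON map printed by jsonpath, e.g. {"app":"web","tier":"fe"},
# into the comma-separated form that `kubectl -l` expects: app=web,tier=fe.
selector_to_labels() {
  printf '%s' "$1" | tr -d '{}" ' | sed 's/:/=/g'
}

# Walk Service -> selector -> matching Pods -> Endpoints.
service_chain() {
  local svc="$1" namespace="${2:-default}" raw labels
  raw="$(kubectl get svc "$svc" -n "$namespace" -o jsonpath='{.spec.selector}')"
  labels="$(selector_to_labels "$raw")"
  if [ -z "$labels" ]; then
    echo "Service ${svc} has no selector; its endpoints are managed manually"
    return 0
  fi
  echo "selector: ${labels}"
  # Pods the Service is supposed to target.
  kubectl get pods -n "$namespace" -l "$labels" -o wide --show-labels
  # Pod IPs the Service actually routes to.
  kubectl get endpoints "$svc" -n "$namespace"
}
```

If the Pod listing is empty, the selector matches nothing. If Pods are listed but their IPs are missing from the Endpoints output, the Pods are probably not Ready.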
How to Answer Questions: When faced with a broken application in the exam, follow this diagnostic workflow:
1. **Check the Pods**: Ensure the application Pods are actually `Running` and Ready.
2. **Check the Service**: Run `kubectl get svc` to verify the IP and ports.
3. **Check the Endpoints (crucial)**: Run `kubectl get ep <service-name>`. If the ENDPOINTS column is empty or shows `<none>`, the Service selector most likely does not match the Pod labels (or the matching Pods are not Ready). This is the most common exam error.
4. **Test DNS**: Deploy a busybox pod to test internal resolution: `kubectl run test --image=busybox:1.28 --rm -it --restart=Never -- nslookup <service-name>`.
5. **Check Network Policies**: If everything looks correct but traffic is blocked, check for a `NetworkPolicy` preventing access.
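The workflow above could be strung together roughly as follows. This is a sketch rather than a definitive script: the function names are hypothetical, the pod name `dns-test` is arbitrary, and the `-it` flags assume it is run from an interactive terminal.

```shell
# Succeed (exit 0) when the Service has no ready endpoint addresses.
has_no_endpoints() {
  local svc="$1" namespace="${2:-default}" ips
  ips="$(kubectl get endpoints "$svc" -n "$namespace" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  [ -z "$ips" ]
}

# Resolve <service>.<namespace> from a throwaway busybox pod.
dns_test() {
  local svc="$1" namespace="${2:-default}"
  kubectl run dns-test --image=busybox:1.28 --rm -it --restart=Never -- \
    nslookup "${svc}.${namespace}"
}

# Run the checks in order and stop at the first clear finding.
diagnose_service() {
  local svc="$1" namespace="${2:-default}"
  kubectl get pods -n "$namespace" -o wide          # 1. Pods running?
  kubectl get svc "$svc" -n "$namespace"            # 2. Service IP and ports
  if has_no_endpoints "$svc" "$namespace"; then     # 3. Endpoints populated?
    echo "no endpoints: compare the Service selector with the Pod labels"
    return 1
  fi
  dns_test "$svc" "$namespace"                      # 4. DNS resolution
  kubectl get networkpolicy -n "$namespace"         # 5. Policies in play
}
```

Calling `diagnose_service web prod` would run the five checks for a Service named `web` in the `prod` namespace; both names are placeholders.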
Exam Tips: Answering Questions on Troubleshooting Services and Networking:
1. **Verify targetPort**: A common configuration error is a mismatch between the Service `port` (what the Service exposes) and the `targetPort` (what the container listens on).
2. **Use Netshoot or Busybox**: Be comfortable spinning up a temporary pod to run `curl` or `nslookup` from inside the cluster network.
3. **Check kube-proxy**: If Services exist but behave strangely, check whether the kube-proxy pod is running on the affected node.
4. **CNI Issues**: If Pods are stuck in `ContainerCreating` with network errors, check the logs of the CNI plugin (e.g., Calico, Flannel, Weave), which usually runs in the `kube-system` namespace.
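For the targetPort tip, a sketch like the one below puts the Service ports next to the ports the containers declare, and then probes a Pod directly to bypass the Service. The function names are hypothetical, and the probe assumes the application speaks HTTP. Note that `containerPort` is only declarative, so a process can listen on a port that is not listed there.

```shell
# Print each Service port and the targetPort it forwards to.
service_ports() {
  local svc="$1" namespace="${2:-default}"
  kubectl get svc "$svc" -n "$namespace" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'
}

# Print the ports each container in the Pod declares.
container_ports() {
  local pod="$1" namespace="${2:-default}"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.ports[*].containerPort}{"\n"}{end}'
}

# Probe a Pod IP and port directly from a throwaway pod.
# Assumption: the application answers HTTP; -T 2 is a 2-second timeout.
probe_pod_port() {
  local ip="$1" port="$2"
  kubectl run probe --image=busybox:1.28 --rm -it --restart=Never -- \
    wget -qO- -T 2 "http://${ip}:${port}"
}
```

If the direct probe succeeds on the port the container listens on but requests through the Service fail, the `targetPort` is the first thing to compare.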