Kubernetes Cluster Architecture
After reading this post, you will be able to:
- Identify and understand the responsibilities of the various components in a Kubernetes cluster.
We will begin by looking at the architecture at a high level, then drill down into each individual component while explaining their responsibilities and how they should be configured.
Let's get right into it.
What is the purpose of using Kubernetes?
The purpose of Kubernetes is to host your applications in the form of containers in an automated fashion so that you can easily deploy as many instances of your application as required and easily enable communication between different services within your application.
For the sake of simplicity, I am going to use an analogy of ships to understand the architecture of Kubernetes.
We have two kinds of ships in this example:
- Cargo Ships: do the actual work of carrying containers across the sea.
- Control Ships: are responsible for monitoring and managing the cargo ships.
The Kubernetes cluster consists of a set of nodes, which may be physical or virtual, on-premises or in the cloud, that host applications in the form of containers. These correspond to the cargo ships in this analogy.
The worker nodes in the cluster are ships that can load containers. But somebody needs to load the containers on the ships, among other tasks such as:
- Plan how to load.
- Identify the right ships.
- Store information about the ships.
- Monitor and track the location of containers on the ships.
- Manage the whole loading process etc.
This is done by the control ship, which hosts different offices and departments, monitoring equipment, communication equipment, cranes for moving containers between ships, etc. The control ships correspond to the master node in the Kubernetes cluster, which is responsible for managing the cluster. Tasks done by the control ships include:
- Storing information regarding the different nodes.
- Planning which containers go where.
- Monitoring the nodes as well as the containers on them.
The Master node does all of these tasks with the help of a set of components known as the control plane components.
Control Plane Components
We will look at each of these components now.
You could imagine that many containers are being loaded and unloaded from the ships on a daily basis. As a result, we must keep track of information about the different ships, such as:
- What container is on which ship.
- At what time was each container loaded.
All of this is stored in etcd, a highly available database that stores information in a key-value format.
We will look more into what an etcd cluster actually is, what data is stored in it, and how it stores the data in another post.
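To make the key-value idea concrete, here is a minimal sketch of a toy in-memory store that records which container is on which ship and when it was loaded. This is not etcd or its client API: the `KeyValueStore` class, the `/containers/<ship>/<container>` key layout, and the timestamps are all assumptions made up for illustration.

```python
class KeyValueStore:
    """Toy in-memory stand-in for a key-value store (hypothetical, not etcd)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store or overwrite the value under the given key.
        self._data[key] = value

    def get(self, key):
        # Return the stored value, or None if the key is unknown.
        return self._data.get(key)

    def list_prefix(self, prefix):
        # Return every (key, value) pair whose key starts with the prefix,
        # sorted by key.
        matches = [(k, v) for k, v in self._data.items() if k.startswith(prefix)]
        return sorted(matches, key=lambda pair: pair[0])


store = KeyValueStore()

# Assumed key layout: /containers/<ship>/<container>
store.put("/containers/ship-1/web", {"loaded_at": "2024-01-01T09:00:00Z"})
store.put("/containers/ship-1/db", {"loaded_at": "2024-01-01T09:05:00Z"})
store.put("/containers/ship-2/cache", {"loaded_at": "2024-01-01T10:30:00Z"})

# "What container is on which ship?" becomes a prefix lookup.
on_ship_1 = [key for key, _ in store.list_prefix("/containers/ship-1/")]
print(on_ship_1)  # ['/containers/ship-1/db', '/containers/ship-1/web']

# "At what time was each container loaded?" becomes a single key lookup.
print(store.get("/containers/ship-2/cache")["loaded_at"])  # 2024-01-01T10:30:00Z
```

Organizing keys as paths is what lets a single prefix query answer "everything on this ship" without scanning unrelated entries by hand.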
When ships arrive at the control ship, you load containers on them using cranes, which identify the ship each container should be placed on based on the ship's size, its capacity, the number of containers already on it, and any other conditions such as:
- The destination of the ship.
- The type of containers it is allowed to carry.
These are known as schedulers. In a Kubernetes cluster, a scheduler identifies the right node to place a container on, based on:
- The container's resource requirements.
- The worker nodes' capacity.
- Any other policy or constraints such as taints and tolerations or node affinity rules that are on them.
We will look at these in much more detail with examples in the next section. We have a whole section dedicated to scheduling alone.
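The decision described above can be sketched as a two-step process: filter out nodes that cannot take the workload, then pick the best of what remains. The sketch below is a simplification under stated assumptions, not the real kube-scheduler: the `Node` and `Pod` classes, the `fits` and `schedule` functions, and the "most free resources wins" scoring rule are hypothetical, and taints are reduced to plain strings. Kubernetes schedules pods (groups of one or more containers), so the workload is called `Pod` here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu: float          # CPU cores still available
    free_memory: int         # MiB still available
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu: float               # CPU cores requested
    memory: int              # MiB requested
    tolerations: set = field(default_factory=set)


def fits(pod, node):
    # Filter step: enough free resources, and every taint on the node is tolerated.
    has_room = pod.cpu <= node.free_cpu and pod.memory <= node.free_memory
    tolerated = node.taints <= pod.tolerations
    return has_room and tolerated


def schedule(pod, nodes):
    # Score step (assumed rule): prefer the node with the most CPU, then the
    # most memory, left over after placing the pod.
    candidates = [n for n in nodes if fits(pod, n)]
    if not candidates:
        return None  # nothing fits; the pod stays unscheduled
    best = max(candidates, key=lambda n: (n.free_cpu - pod.cpu, n.free_memory - pod.memory))
    return best.name


nodes = [
    Node("node-a", free_cpu=1.0, free_memory=2048),
    Node("node-b", free_cpu=4.0, free_memory=8192, taints={"gpu"}),
    Node("node-c", free_cpu=2.0, free_memory=4096),
]

print(schedule(Pod("web", cpu=1.5, memory=1024), nodes))                          # node-c
print(schedule(Pod("trainer", cpu=1.5, memory=1024, tolerations={"gpu"}), nodes))  # node-b
print(schedule(Pod("huge", cpu=8.0, memory=1024), nodes))                         # None
```

The first pod skips node-a (too little CPU) and node-b (untolerated taint), so only node-c is left. The second pod tolerates the taint, which makes the roomier node-b eligible.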
There are different offices in the master ship that are assigned to special tasks or departments. For example, the operations team takes care of ship handling, traffic control, etc. They deal with issues related to damages, the routes the different ships take, etc. The cargo team takes care of containers: when containers are damaged or destroyed, they make sure new ones are made available. There is also a service office that takes care of the I.T. and communications between different ships.
Similarly, in Kubernetes, we have controllers available that take care of different areas.
The node controller takes care of nodes. It is responsible for onboarding new nodes to the cluster and for handling situations where nodes become unavailable or get destroyed. The replication controller ensures that the desired number of containers is running at all times in your replication group.
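A controller like the replication controller can be thought of as a loop that compares the desired state with the actual state and acts on the difference. Here is a minimal sketch of one pass of such a loop. The `reconcile` function and its action tuples are hypothetical names for illustration, not the real controller code, and the choice of which surplus replicas to delete (the last ones in sorted order) is an assumption.

```python
def reconcile(desired_replicas, running):
    """Return the actions needed to move the running set toward the desired count."""
    running = sorted(running)
    missing = desired_replicas - len(running)
    if missing > 0:
        # Too few replicas: ask for one "create" per missing replica.
        return [("create", None)] * missing
    # Enough or too many replicas: delete anything beyond the desired count.
    return [("delete", name) for name in running[desired_replicas:]]


# One replica is running but three are wanted: create two more.
print(reconcile(3, ["web-a"]))                    # [('create', None), ('create', None)]

# Three replicas are running but two are wanted: delete the surplus one.
print(reconcile(2, ["web-a", "web-b", "web-c"]))  # [('delete', 'web-c')]

# Desired and actual already match: nothing to do.
print(reconcile(2, ["web-a", "web-b"]))           # []
```

Because the function only looks at the current difference, running it again after a container is destroyed produces a fresh "create" action, which is how a damaged container gets replaced.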
Communication in the Clusters
So we have seen different components: the different offices, the different ships, the data store, and the cranes.
But how do these communicate with each other?
How does one office reach the other office and who manages them all at a high level?
The kube-apiserver is the primary management component of Kubernetes. It is responsible for orchestrating all operations within the cluster.
It exposes the Kubernetes API, which is used by external users to perform management operations on the cluster, by the various controllers to monitor the state of the cluster and make changes as required, and by the worker nodes to communicate with the server.
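One way to picture this hub role is as a single object that every other component reads from, writes to, and watches for changes. The sketch below is a toy model under that assumption. The `ApiServer` class and its `apply`, `get`, and `watch` methods are hypothetical and are not the real Kubernetes API or any client library; only the `ADDED` and `MODIFIED` event names mirror the event types the real API uses for watches.

```python
class ApiServer:
    """Toy hub: stores objects and notifies watchers on every change."""

    def __init__(self):
        self._objects = {}
        self._watchers = []

    def watch(self, callback):
        # Controllers, the scheduler, and node agents would register here.
        self._watchers.append(callback)

    def apply(self, kind, name, spec):
        # Create or update an object, then tell every watcher about it.
        key = (kind, name)
        event = "MODIFIED" if key in self._objects else "ADDED"
        self._objects[key] = spec
        for callback in self._watchers:
            callback(event, kind, name, spec)

    def get(self, kind, name):
        return self._objects.get((kind, name))


seen = []
api = ApiServer()
api.watch(lambda event, kind, name, spec: seen.append((event, kind, name)))

# An external user creates a pod; a scheduler-like component later assigns a node.
api.apply("Pod", "web", {"image": "nginx"})
api.apply("Pod", "web", {"image": "nginx", "node": "node-c"})

print(seen)                   # [('ADDED', 'Pod', 'web'), ('MODIFIED', 'Pod', 'web')]
print(api.get("Pod", "web"))  # {'image': 'nginx', 'node': 'node-c'}
```

Notice that the components never talk to each other directly in this model. They only talk to the hub, which is the point of having one central API.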
What About Containers?
Now, we are working with containers here. Containers are everywhere, so we need everything to be container compatible. Our applications are in the form of containers, and the different components that form the entire management system on the master nodes could be hosted in the form of containers too. The DNS service and the networking solution can also be deployed in the form of containers. So we need some software that can run containers, and that's the container runtime engine. A popular one is Docker.
So we need Docker or a supported equivalent installed on all the nodes in the cluster, including the master node if you wish to host the control plane components as containers.
Can it Only Be Docker?
No, it doesn’t always have to be Docker. Kubernetes supports other container runtimes as well, such as containerd or rkt (Rocket).
Moving On To the Cargo Ships (Nodes)
Every cargo ship has a captain who is responsible for:
- Managing all activities on these ships.
- Liaising with the master ship, starting with letting the master ship know that they are interested in joining the group.
- Receiving information about the containers to be loaded on the ship.
- Loading the appropriate containers as required.
- Sending reports back to the master about the status of this ship and the status of the containers on the ship etc.
Now, the captain of the ship is the kubelet in Kubernetes. A kubelet is an agent that runs on each node in a cluster. It listens for instructions from the kube-apiserver and deploys or destroys containers on the node as required.
The kube-apiserver periodically receives status reports from the kubelet, which it uses to monitor the state of the nodes and the containers on them.
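The captain's job can be sketched as two small steps: work out which containers to start or stop on this node, then produce a status report. The functions below are hypothetical illustrations, not kubelet code: `sync_node`, `status_report`, and the shape of the report dictionary are assumptions.

```python
def sync_node(assigned, running):
    """Compare what this node was told to run with what is actually running."""
    assigned, running = set(assigned), set(running)
    to_start = sorted(assigned - running)   # assigned but not yet running
    to_stop = sorted(running - assigned)    # running but no longer assigned
    return to_start, to_stop


def status_report(node_name, running):
    # Assumed report shape: node name, a ready flag, and the running containers.
    return {"node": node_name, "ready": True, "containers": sorted(running)}


assigned = ["web", "db"]
running = {"db", "old-job"}

to_start, to_stop = sync_node(assigned, running)
print(to_start, to_stop)  # ['web'] ['old-job']

# Pretend the container runtime carried out those actions.
running = (running | set(to_start)) - set(to_stop)
print(status_report("node-c", running))
# {'node': 'node-c', 'ready': True, 'containers': ['db', 'web']}
```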
The kubelet is the captain that manages containers on the ship. But the applications running on the worker nodes need to be able to communicate with each other. For example, you might have a web server running in a container on one node and a database server running in a container on another node. How would the web server reach the database server on the other node?
Communication Between Worker Nodes
Communication between worker nodes is enabled by another component that runs on each worker node, known as the kube-proxy service. The kube-proxy service ensures that the necessary rules are in place on the worker nodes to allow the containers running on them to reach each other.
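To give a feel for what such rules do, here is a minimal sketch of a lookup table that maps a service name to the addresses of the containers behind it and hands out one address per request. It is a toy model only: the real kube-proxy programs network rules on the node rather than running Python, and the `ServiceProxy` class, the round-robin choice, and the IP addresses are all assumptions for illustration.

```python
import itertools


class ServiceProxy:
    """Toy routing table: service name -> rotating list of backend addresses."""

    def __init__(self):
        self._backends = {}
        self._cursors = {}

    def set_endpoints(self, service, endpoints):
        # Record the backends for a service and reset the rotation.
        self._backends[service] = list(endpoints)
        self._cursors[service] = itertools.cycle(self._backends[service])

    def route(self, service):
        # Pick the next backend for this service (round-robin is an assumption).
        if not self._backends.get(service):
            raise LookupError(f"no endpoints for service {service!r}")
        return next(self._cursors[service])


proxy = ServiceProxy()
# Hypothetical addresses of two database containers on different nodes.
proxy.set_endpoints("database", ["10.0.1.5:5432", "10.0.2.7:5432"])

# The web server only needs the service name, not the node or container address.
picks = [proxy.route("database") for _ in range(3)]
print(picks)  # ['10.0.1.5:5432', '10.0.2.7:5432', '10.0.1.5:5432']
```

This answers the web server question from earlier: it asks for "database", and the table on its own node decides which container on which node receives the traffic.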
In this post, we have covered a LOT of topics and Kubernetes components. Let's quickly summarize what we have learned so far:
- The Kubernetes cluster has master and worker nodes.
- On the master node, the etcd server stores information about the cluster.
- The kube-scheduler is responsible for scheduling applications or containers on nodes.
- A variety of controllers take care of different functions, such as node control and replication.
- The kube-apiserver is responsible for orchestrating all operations within the cluster.
- On each worker node, the kubelet listens for instructions from the kube-apiserver and manages containers.
- Finally, the kube-proxy helps enable communication between services within the cluster.
So that's a high-level overview of the various components. We will drill down into each of these in the upcoming posts.