Introduction
The controller-runtime package has become a fundamental tool for most Kubernetes controllers, simplifying the creation of controllers that manage resources within a Kubernetes environment efficiently. Users tend to prefer it over client-go.
The increased adoption of projects like Kubebuilder and Operator SDK has made it easier to create Kubernetes Operator projects. Thanks to these projects, users only need to implement a minimal amount of code to get started with Kubernetes controllers.
As a developer working on Kubernetes projects, I inevitably touch code that uses controller-runtime.
Whenever I dive into its code base, I learn something new about the underlying mechanisms of Kubernetes.
Through this blog series, I aim to share what I have learned about controller-runtime, consolidating my notes spread across various notebooks.
This article dives specifically into the role of the controller-runtime Manager.
What are Controllers and Operators?
controller-runtime has emerged as the go-to package for building Kubernetes controllers.
However, it is essential to understand what these controllers - or Kubernetes Operators - are.
In Kubernetes, controllers observe resources, such as Deployments, in a control loop to ensure the cluster resources conform to the desired state specified in the resource specification (e.g., YAML files) 1.
On the other hand, according to Red Hat, a Kubernetes Operator is an application-specific controller 2. For instance, the Prometheus Operator manages the lifecycle of a Prometheus instance in the cluster, including managing configurations and updating Kubernetes resources, such as ConfigMaps.
Roughly speaking, the two are quite similar: both provide a control loop to ensure the current state meets the desired state.
The Architecture of Controllers
Since controllers are in charge of meeting the desired state of the resources in Kubernetes, they somehow need to be informed about the changes on the resources and perform certain operations if needed. For this, controllers follow a special architecture to
- observe the resources,
- notify about any events (add, update, delete) on those resources,
- keep a local cache to decrease the load on the API server,
- keep a work queue to pick up events,
- run workers that reconcile the resources picked up from the work queue.
This architecture is clearly pictured in the client-go documentation:
reference: client-go documentation
Most end users typically do not need to interact with the sections outlined in blue in the architecture; controller-runtime manages these elements for them. The following sections explain these components in simple terms.
Simply put, controllers use
- a cache, to avoid sending every read request to the API server,
- a workqueue, which holds the keys of the objects that need to be reconciled,
- workers, which reconcile the items picked from the workqueue.
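The interplay of these three pieces can be sketched with the standard library alone. This is a simplified model, not the client-go implementation: the `Queue` type, the `Drain` helper, and the string-valued cache are illustrative assumptions, and the real workqueue additionally tracks in-flight items and supports rate limiting.

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a simplified stand-in for a workqueue: it stores object keys
// (namespace/name) and drops a key that is already waiting.
type Queue struct {
	mu      sync.Mutex
	pending map[string]bool
	keys    []string
}

func NewQueue() *Queue { return &Queue{pending: map[string]bool{}} }

// Add enqueues key unless it is already pending.
func (q *Queue) Add(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.pending[key] {
		return
	}
	q.pending[key] = true
	q.keys = append(q.keys, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.keys) == 0 {
		return "", false
	}
	key, q.keys = q.keys[0], q.keys[1:]
	delete(q.pending, key)
	return key, true
}

// Drain starts the given number of workers, lets them reconcile keys until
// the queue is empty, and returns how many reconciliations ran.
func Drain(q *Queue, cache map[string]string, workers int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				key, ok := q.Get()
				if !ok {
					return
				}
				_ = cache[key] // read the object from the local cache, not the API server
				mu.Lock()
				count++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return count
}

func main() {
	cache := map[string]string{"default/web": "Deployment", "default/db": "Deployment"}
	q := NewQueue()
	q.Add("default/web")
	q.Add("default/web") // duplicate event for the same key is coalesced
	q.Add("default/db")
	fmt.Println("reconciled:", Drain(q, cache, 2)) // prints "reconciled: 2"
}
```

Queuing keys rather than whole objects is what allows several events for the same object to collapse into a single reconciliation.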
Informer
Informers watch the Kubernetes API server to detect changes in the resources we are interested in. An informer keeps a local, in-memory cache (implementing the Store interface) of the objects observed through the Kubernetes API. Controllers and operators then use this cache for all read requests (GET and LIST) to avoid putting load on the Kubernetes API server. Moreover, informers invoke controllers by sending them objects through registered event handlers.
Informers leverage certain components like Reflector, Queue and Indexer, as shown in the above diagram.
Reflector
According to the godocs:
Reflector watches a specified resource and causes all changes to be reflected in the given store.
The store is actually a cache, with two options: a simple one and a FIFO. The Reflector pushes objects to the Delta FIFO queue.
By monitoring the server (Kubernetes API Server), the Reflector maintains a local cache of the resources. Upon any event occurring on the watched resource, implying a new operation on the Kubernetes resource, the Reflector updates the cache (Delta FIFO queue, as illustrated in the diagram). Subsequently, the Informer reads objects from this Delta FIFO queue, indexes them for future retrievals, and dispatches the object to the controller.
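The flow described above (watch event, Delta FIFO queue, store update, handler dispatch) can be modelled in a few lines. This is a toy sketch using only the standard library; `Informer`, `Delta`, `Reflect`, and `Process` are hypothetical names for illustration and do not mirror the client-go types or signatures.

```go
package main

import "fmt"

// EventType mirrors the kinds of changes an informer reports.
type EventType string

const (
	Added   EventType = "ADDED"
	Updated EventType = "UPDATED"
	Deleted EventType = "DELETED"
)

// Delta is one change to one object, identified by its namespace/name key.
type Delta struct {
	Type   EventType
	Key    string
	Object string // stand-in for the full Kubernetes object
}

// Informer is a toy model: a delta queue, a local store, and event handlers.
type Informer struct {
	deltas   []Delta           // stand-in for the Delta FIFO queue
	store    map[string]string // local cache keyed by namespace/name
	handlers []func(Delta)
}

func NewInformer() *Informer {
	return &Informer{store: map[string]string{}}
}

// AddEventHandler registers a callback, e.g. one that enqueues the key.
func (i *Informer) AddEventHandler(h func(Delta)) {
	i.handlers = append(i.handlers, h)
}

// Reflect plays the Reflector's role: it records a watch event in the queue.
func (i *Informer) Reflect(d Delta) {
	i.deltas = append(i.deltas, d)
}

// Process drains the queue in FIFO order, updates the store, and then
// notifies every registered handler.
func (i *Informer) Process() {
	for _, d := range i.deltas {
		if d.Type == Deleted {
			delete(i.store, d.Key)
		} else {
			i.store[d.Key] = d.Object
		}
		for _, h := range i.handlers {
			h(d)
		}
	}
	i.deltas = nil
}

// Get serves a read from the local cache instead of the API server.
func (i *Informer) Get(key string) (string, bool) {
	obj, ok := i.store[key]
	return obj, ok
}

func main() {
	inf := NewInformer()
	inf.AddEventHandler(func(d Delta) { fmt.Println("event:", d.Type, d.Key) })

	inf.Reflect(Delta{Type: Added, Key: "default/web", Object: "replicas=1"})
	inf.Reflect(Delta{Type: Updated, Key: "default/web", Object: "replicas=3"})
	inf.Process()

	obj, _ := inf.Get("default/web")
	fmt.Println("cached:", obj) // prints "cached: replicas=3"
}
```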
Indexer
The Indexer saves objects into a thread-safe Store and indexes them. This approach facilitates efficient querying of objects from the cache.
Custom indexers can be created for specific needs. For example, a custom indexer can be defined to retrieve all objects based on certain fields, such as annotations.
For more details about how Kubernetes indexing works, see Kubernetes Client-Side Indexing.
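To make the idea of a custom index concrete, here is a minimal standard-library sketch that indexes objects by an annotation. The `Indexer` type, the `byTeam` index function, and the `team` annotation are assumptions made for this example; client-go's own Indexer interface has a different shape and is thread-safe, which this model is not.

```go
package main

import (
	"fmt"
	"sort"
)

// Object is a stand-in for a Kubernetes object with annotations.
type Object struct {
	Name        string
	Annotations map[string]string
}

// IndexFunc returns the values under which an object should be indexed.
type IndexFunc func(Object) []string

// Indexer stores objects and maintains secondary indices over them.
type Indexer struct {
	objects map[string]Object
	funcs   map[string]IndexFunc
	indices map[string]map[string]map[string]bool // index name -> value -> object names
}

func NewIndexer() *Indexer {
	return &Indexer{
		objects: map[string]Object{},
		funcs:   map[string]IndexFunc{},
		indices: map[string]map[string]map[string]bool{},
	}
}

// AddIndex registers a custom index; call it before adding objects.
func (ix *Indexer) AddIndex(name string, fn IndexFunc) {
	ix.funcs[name] = fn
	ix.indices[name] = map[string]map[string]bool{}
}

// Add stores or replaces an object and refreshes its index entries.
func (ix *Indexer) Add(obj Object) {
	if old, ok := ix.objects[obj.Name]; ok {
		for name, fn := range ix.funcs {
			for _, v := range fn(old) {
				delete(ix.indices[name][v], old.Name)
			}
		}
	}
	ix.objects[obj.Name] = obj
	for name, fn := range ix.funcs {
		for _, v := range fn(obj) {
			if ix.indices[name][v] == nil {
				ix.indices[name][v] = map[string]bool{}
			}
			ix.indices[name][v][obj.Name] = true
		}
	}
}

// ByIndex returns the sorted names of the objects indexed under value.
func (ix *Indexer) ByIndex(name, value string) []string {
	var names []string
	for n := range ix.indices[name][value] {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// byTeam indexes an object by its (hypothetical) "team" annotation.
func byTeam(o Object) []string {
	if t, ok := o.Annotations["team"]; ok {
		return []string{t}
	}
	return nil
}

func main() {
	ix := NewIndexer()
	ix.AddIndex("by-team", byTeam)
	ix.Add(Object{Name: "api", Annotations: map[string]string{"team": "payments"}})
	ix.Add(Object{Name: "worker", Annotations: map[string]string{"team": "payments"}})
	ix.Add(Object{Name: "web", Annotations: map[string]string{"team": "frontend"}})
	fmt.Println(ix.ByIndex("by-team", "payments")) // prints "[api worker]"
}
```

The lookup touches only the objects stored under the requested value, which is what makes indexed queries cheaper than scanning the whole cache.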
Manager
According to the godocs:
manager is required to create controllers and provides shared dependencies such as clients, caches, schemes, etc. Controllers must be started by calling Manager.Start.
The Manager serves as a crucial component for controllers by managing their operations. To put it simply, the manager oversees one or more controllers that watch the resources (e.g., Pods) of interest.
Each operator requires a Manager to operate, as the Manager controls the controllers, webhooks, metric servers, logs, leader elections, caches, and other components.
For all dependencies managed by the Manager, please refer to the Manager interface.
Controller Dependencies
As the godocs mention, the Manager provides shared dependencies such as clients, caches, and schemes. These dependencies are shared among the controllers managed by the Manager. If you have registered two controllers with the Manager, these controllers will share common resources.
Reconciliation, or the reconcile loop, involves the operators or controllers executing the business logic for the watched resources. For example, a Deployment controller might create a specific number of Pods as specified in the Deployment spec.
The Client package exposes functionalities to communicate with the Kubernetes API 3. Controllers, registered with a specific Manager, utilize the same client to interact with the Kubernetes API. The main operations of the client include reading and writing.
Read operations are mostly served from the cache rather than by querying the Kubernetes API directly, which reduces the load on the API server. In contrast, write operations communicate directly with the Kubernetes API. This behavior can be changed so that read requests also go to the Kubernetes API, but doing so is generally not recommended unless there is a compelling reason.
The cache is also shared across controllers, ensuring optimal performance. Consider a scenario where there are n controllers observing multiple resources in a cluster. If a separate cache is maintained for each controller, n caches will attempt to synchronize with the API server, increasing the load on it. Instead, controller-runtime utilizes a shared cache, built on the SharedIndexInformer created by NewSharedIndexInformer, for all controllers registered within a manager.
In the diagram above, two controllers maintain separate caches, and both send ListAndWatch requests to the API server.
However, controller-runtime utilizes a shared cache, reducing the need for multiple ListAndWatch operations.
reference: controller-runtime/pkg/cache/internal/informers.go
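The saving can be illustrated with a small model that counts ListAndWatch requests. Everything here is hypothetical scaffolding for the example (`APIServer`, `SharedCache`, `InformerFor`); it only mimics the idea of handing out one informer per resource kind and is not how controller-runtime structures its code.

```go
package main

import (
	"fmt"
	"sync"
)

// APIServer is a fake that only counts ListAndWatch requests.
type APIServer struct {
	mu    sync.Mutex
	calls int
}

// ListAndWatch records one request and returns an empty local store.
func (s *APIServer) ListAndWatch(kind string) map[string]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.calls++
	return map[string]string{}
}

// Calls reports how many ListAndWatch requests were received.
func (s *APIServer) Calls() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.calls
}

// SharedCache keeps one informer (modelled as a store) per resource kind.
type SharedCache struct {
	mu        sync.Mutex
	api       *APIServer
	informers map[string]map[string]string
}

func NewSharedCache(api *APIServer) *SharedCache {
	return &SharedCache{api: api, informers: map[string]map[string]string{}}
}

// InformerFor returns the existing informer for kind, or starts one on first use.
func (c *SharedCache) InformerFor(kind string) map[string]string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if inf, ok := c.informers[kind]; ok {
		return inf
	}
	inf := c.api.ListAndWatch(kind)
	c.informers[kind] = inf
	return inf
}

func main() {
	// Two controllers, each with a private cache: two ListAndWatch requests.
	separate := &APIServer{}
	separate.ListAndWatch("Pod")
	separate.ListAndWatch("Pod")

	// Two controllers behind one shared cache: a single request.
	shared := &APIServer{}
	cache := NewSharedCache(shared)
	cache.InformerFor("Pod")
	cache.InformerFor("Pod")

	fmt.Println("separate caches:", separate.Calls(), "shared cache:", shared.Calls())
	// prints "separate caches: 2 shared cache: 1"
}
```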
Code
Whether you use Kubebuilder, Operator SDK, or controller-runtime directly, operators need a Manager to function.
The NewManager function from controller-runtime creates a new manager.
Under the hood, NewManager calls New from the manager package.
For a simple setup, we can create a new manager as follows
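A minimal sketch of such a setup, assuming a recent controller-runtime release and a kubeconfig (or in-cluster configuration) that `ctrl.GetConfigOrDie()` can find; the empty `manager.Options{}` relies entirely on defaults:

```go
package main

import (
	"log"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {
	// GetConfigOrDie loads the REST config for the cluster and exits the
	// process if none can be found.
	cfg := ctrl.GetConfigOrDie()

	// An empty Options struct makes the Manager use its defaults.
	mgr, err := ctrl.NewManager(cfg, manager.Options{})
	if err != nil {
		log.Fatalf("unable to create manager: %v", err)
	}

	_ = mgr // controllers would be registered with mgr before starting it
}
```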
Though this code piece is sufficient to create a Manager, the crucial part involves configuring the Manager using manager.Options{}.
Manager Options
manager.Options{} configures various dependencies, such as webhooks, clients, or leader election, under the hood.
Scheme
As mentioned in the godocs:
Scheme is the scheme used to resolve runtime.Objects to GroupVersionKinds / Resources.
So, the scheme lets you register the Go types of your objects under a GroupVersionKind (GVK). If you are building operators, you will recognize the following code block in your operator:
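A sketch of the typical registration pattern, assuming a recent controller-runtime release. Only the built-in Kubernetes types are registered here; the commented-out `myv1` line marks where a hypothetical custom API package would add its own types.

```go
package main

import (
	"log"

	"k8s.io/apimachinery/pkg/runtime"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

// scheme maps Go types to their GroupVersionKinds.
var scheme = runtime.NewScheme()

func init() {
	// Register the built-in Kubernetes types (Pods, Deployments, ...).
	utilruntime.Must(clientgoscheme.AddToScheme(scheme))

	// A custom API package would register its types the same way, e.g.:
	// utilruntime.Must(myv1.AddToScheme(scheme)) // myv1 is hypothetical
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), manager.Options{
		Scheme: scheme,
	})
	if err != nil {
		log.Fatalf("unable to create manager: %v", err)
	}
	_ = mgr
}
```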
The scheme is responsible for registering the Go type declaration of your Kubernetes object into a GVK.
This is significant because the RESTMapper then translates the GVK to a GroupVersionResource (GVR), establishing a distinct HTTP path for your Kubernetes resource. Consequently, the Kubernetes client knows the relevant endpoint for your resource.
Cache
I have mentioned the cache a lot, and for good reason: it is one of the most crucial pieces of operators and controllers, and one whose effect you can see directly.
As mentioned in the Controller Dependencies section, controller-runtime initializes a SharedIndexInformer (via NewSharedIndexInformer) for our controllers under the hood. To configure the cache, use cache.Options{}. Several configurations are possible, but be careful while configuring your cache, since it affects the performance and resource consumption of your operator.
I specifically want to emphasize SyncPeriod and DefaultNamespaces.
SyncPeriod triggers reconciliation again for every object in the cache once the duration passes. By default, this is configured as 10 hours or so, with some jitter across all controllers. Since running a reconciliation over all objects is quite expensive, be careful while adjusting this configuration.
DefaultNamespaces configures caching objects in specified namespaces. For instance, to watch objects in the prod namespace:
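A configuration sketch, assuming controller-runtime v0.16 or later, where `cache.Options` exposes `DefaultNamespaces` as a map of namespace name to `cache.Config`. The 12-hour `SyncPeriod` is an arbitrary value chosen for illustration, not a recommendation.

```go
package main

import (
	"log"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {
	// Illustrative value only; the default is roughly 10 hours with jitter.
	syncPeriod := 12 * time.Hour

	opts := manager.Options{
		Cache: cache.Options{
			// Re-trigger reconciliation for every cached object after this period.
			SyncPeriod: &syncPeriod,
			// Only cache (and therefore watch) objects in the "prod" namespace.
			DefaultNamespaces: map[string]cache.Config{
				"prod": {},
			},
		},
	}

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), opts)
	if err != nil {
		log.Fatalf("unable to create manager: %v", err)
	}
	_ = mgr
}
```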
Controller
The Controller field in manager.Options{} configures essential options for controllers registered to this Manager. These options are set using controller.Options{}.
Notably, the MaxConcurrentReconciles attribute within this configuration governs the number of concurrent reconciles allowed. As detailed in the Architecture of Controllers section, controllers run workers to execute reconciliation tasks. These workers operate as goroutines. By default, a controller uses only one goroutine, but this can be adjusted using the MaxConcurrentReconciles attribute.
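The effect of this setting can be modelled with a bounded pool of goroutines. The sketch below uses only the standard library; `RunWorkers` is a hypothetical helper, not a controller-runtime function, and it simply reports the highest number of reconciles that were in flight at once.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// RunWorkers reconciles keys with at most maxConcurrent goroutines and
// returns the peak number of reconciles observed running at the same time.
func RunWorkers(keys []string, maxConcurrent int, reconcile func(string)) int {
	if maxConcurrent < 1 {
		maxConcurrent = 1 // mirror the default of a single worker
	}

	queue := make(chan string)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)

	for i := 0; i < maxConcurrent; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range queue {
				mu.Lock()
				running++
				if running > peak {
					peak = running
				}
				mu.Unlock()

				reconcile(key)

				mu.Lock()
				running--
				mu.Unlock()
			}
		}()
	}

	for _, k := range keys {
		queue <- k
	}
	close(queue)
	wg.Wait()
	return peak
}

func main() {
	keys := []string{"default/a", "default/b", "default/c", "default/d"}
	peak := RunWorkers(keys, 2, func(string) {
		time.Sleep(10 * time.Millisecond) // simulate reconciliation work
	})
	fmt.Println("peak concurrent reconciles:", peak)
}
```

Raising the limit increases throughput only if the reconcile logic is safe to run for different objects in parallel.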
After configuring the Manager’s options, the NewManager function generates the controllerManager structure, which implements the Runnable interface.
During the creation of the controllerManager structure, controller-runtime initializes the Cluster to handle all necessary operations to interact with your cluster, including managing clients and caches. All the settings provided within manager.Options{} are transferred to cluster.New() to create the cluster.
This process calls the private function newCache(restConfig *rest.Config, opts Options) newCacheFunc, initiating the NewInformer, which uses the type SharedIndexInformer as referenced in the Controller Dependencies section.
The next step involves registering controllers to the Manager.
I will cover the controller registration process in detail in a future post, to avoid making this one excessively long.
Starting Manager
Once the manager starts, all required runnables in the manager will start, in the following order:
- internal HTTP servers (health probes, metrics, and profiling, if enabled),
- webhooks,
- cache,
- controllers,
- leader election.
For reference, check the Start(context.Context) method of the controllerManager struct.
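The ordered startup can be pictured with a toy model that groups start functions and runs the groups in the sequence listed above. `ToyManager` and the group names are invented for this sketch; the real Manager runs its runnables as long-lived goroutines rather than as functions that return immediately.

```go
package main

import "fmt"

// startOrder follows the sequence listed in this article.
var startOrder = []string{"http-servers", "webhooks", "cache", "controllers", "leader-election"}

// ToyManager groups start functions by the phase in which they should run.
type ToyManager struct {
	groups map[string][]func() error
}

func NewToyManager() *ToyManager {
	return &ToyManager{groups: map[string][]func() error{}}
}

// Add registers a start function under one of the groups in startOrder.
func (m *ToyManager) Add(group string, start func() error) error {
	for _, g := range startOrder {
		if g == group {
			m.groups[group] = append(m.groups[group], start)
			return nil
		}
	}
	return fmt.Errorf("unknown group %q", group)
}

// Start runs every group in startOrder and stops at the first error.
func (m *ToyManager) Start() error {
	for _, g := range startOrder {
		for _, start := range m.groups[g] {
			if err := start(); err != nil {
				return fmt.Errorf("starting %s: %w", g, err)
			}
		}
	}
	return nil
}

func main() {
	m := NewToyManager()
	// Registration order does not matter; Start enforces the sequence.
	for _, g := range []string{"controllers", "cache", "webhooks", "http-servers"} {
		name := g
		if err := m.Add(name, func() error {
			fmt.Println("started:", name)
			return nil
		}); err != nil {
			fmt.Println("add failed:", err)
		}
	}
	if err := m.Start(); err != nil {
		fmt.Println("start failed:", err)
	}
}
```

Starting the cache before the controllers matters because reconcilers read from that cache as soon as they begin processing keys.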
Feel free to suggest improvements on GitHub or through my Twitter