What’s the point of Operators and CRDs? A seasonal reflection

Author: Geoff Macartney | Posted on: December 18, 2023



It is that time of year when our thoughts naturally turn to pondering the things in life that truly matter. Like Kubernetes Operators, for instance. What’s the point of Operators and CRDs? Searching around for a definition, you find many websites that tell you something along the lines that an operator is a way of extending the API of Kubernetes, or a method of packaging and deploying an application. But so what? Why is this a good thing? How would I explain the value of Operators to someone myself?

I was pondering such questions a couple of nights ago, sitting by the fire with a fine Bordeaux, when a chill fell upon me. A dark figure arose at my side, indistinct and unearthly. It was the Ghost of Kubernetes Past. At once it whirled me away to a time long ago.

Or not so long ago. It can feel like Kubernetes has been around forever, but it is still less than a decade since it was open-sourced as a framework for managing (containerised) software. It provided a toolkit to solve a set of common problems that enterprises faced as they began to adopt a distributed, containerised, “microservices” style approach to software delivery. Application operators and developers learned a whole new vocabulary of Pods, Deployments, Ingresses, ConfigMaps, ClusterRoles and a host of other concepts, and embraced the hot new tech with enthusiasm. Kubernetes’ popularity grew exponentially, and it rapidly became the preferred platform for deploying applications (in the broadest sense) composed of many interworking parts requiring flexible scaling and high availability.

But along with all this new technology came a whole new set of challenges and problems. For one thing, Kubernetes itself was hard enough to manage and to keep up to date. The eventual arrival of managed Kubernetes solutions from the public cloud providers went some way to mitigating that. However, designing applications to run on Kubernetes required development teams to overcome new challenges in dealing with Kubernetes itself, as if they didn’t already have enough to deal with. Too often, each team building for Kubernetes was spending a significant fraction of its effort on Kubernetes networking, security, and scaling. Anything requiring stateful operation was going to be difficult, as Kubernetes is principally designed for scaling stateless applications. And above all, how did you go about integrating Kubernetes with other, non-Kubernetes, applications? Integrating your application running on Kubernetes with software like databases or storage management systems, or anything “real world” (however loosely defined), required the development of layers of additional management software, and every team was solving similar problems, each in their own way.

Motivated by such concerns, a few years after the release of Kubernetes, CoreOS introduced the concept of Kubernetes Operators, to help deal with this friction between Kubernetes and the real world. The notion of an Operator essentially formalises a way of bringing something that lives outside Kubernetes under control by software running inside Kubernetes. By representing real-world resources as Kubernetes objects (“custom resources”), defined by a Custom Resource Definition (CRD), Operators enable the integration of those resources into an application in the same way as the familiar Kubernetes resources, for example, by controlling access to them using Kubernetes RBAC (role-based access control). Additionally, Operators can orchestrate resources that do run on Kubernetes to work together as a higher-level abstraction that encapsulates a range of concerns in one resource, say by adding business logic and policy management to existing resources.
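To make the idea of a custom resource concrete, here is a minimal sketch in plain Python, with no Kubernetes client involved. The `Database` kind, the `example.com` group and the field names are all hypothetical, and the toy `validate` function only stands in for the schema validation that the API server performs against a real CRD.

```python
# Toy model of a CRD and a custom resource. All names here (Database,
# example.com, engine, storageGB) are hypothetical, for illustration only.

# A CRD tells Kubernetes which new kind exists and what its spec may contain.
DATABASE_CRD = {
    "group": "example.com",
    "version": "v1",
    "kind": "Database",
    "spec_schema": {"engine": str, "storageGB": int},
}

# A custom resource is an instance of that kind, shaped like any other
# Kubernetes object: apiVersion, kind, metadata, spec.
my_db = {
    "apiVersion": "example.com/v1",
    "kind": "Database",
    "metadata": {"name": "orders-db", "namespace": "team-a"},
    "spec": {"engine": "postgres", "storageGB": 50},
}


def validate(resource, crd):
    """Return a list of problems; an empty list means the resource is valid."""
    problems = []
    expected_api_version = f"{crd['group']}/{crd['version']}"
    if resource.get("apiVersion") != expected_api_version:
        problems.append(f"apiVersion must be {expected_api_version}")
    if resource.get("kind") != crd["kind"]:
        problems.append(f"kind must be {crd['kind']}")
    spec = resource.get("spec", {})
    for field, field_type in crd["spec_schema"].items():
        if field not in spec:
            problems.append(f"spec.{field} is required")
        elif not isinstance(spec[field], field_type):
            problems.append(f"spec.{field} must be {field_type.__name__}")
    return problems


if __name__ == "__main__":
    print(validate(my_db, DATABASE_CRD))  # no problems expected
```

Once a resource has this familiar shape, the usual Kubernetes machinery (namespaces, RBAC, watches) can be applied to it just as it is to built-in kinds.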

At first, writing an Operator was challenging enough, requiring a decent understanding of the various code mechanisms that Kubernetes used, but this became easier with the release of the Operator Framework after another few years. This automated most of the boilerplate, so that in the simplest case you only need to implement one function, the reconcile function, to make your Operator work.
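As a rough illustration of that division of labour, the sketch below uses plain Python rather than the Operator SDK itself. The `run_controller` harness is a hypothetical stand-in for the boilerplate a framework supplies (queueing and retrying), and `reconcile` is the one function the Operator author writes; the resource names are made up.

```python
from collections import deque


def run_controller(reconcile, keys, max_attempts=3):
    """Hypothetical stand-in for framework boilerplate: feed each resource key
    to `reconcile`, and requeue the key if reconciliation raises an error."""
    queue = deque((key, 1) for key in keys)
    outcomes = {}
    while queue:
        key, attempt = queue.popleft()
        try:
            reconcile(key)
            outcomes[key] = f"ok after {attempt} attempt(s)"
        except Exception as err:
            if attempt < max_attempts:
                queue.append((key, attempt + 1))  # try again later
            else:
                outcomes[key] = f"gave up: {err}"
    return outcomes


# --- The part an Operator author writes ---------------------------------
calls = {}


def reconcile(key):
    """Toy reconcile: fails on the first call for 'flaky-db', then succeeds."""
    calls[key] = calls.get(key, 0) + 1
    if key == "flaky-db" and calls[key] == 1:
        raise RuntimeError("external system not ready")
    # Real logic would compare desired and actual state here and act on it.


if __name__ == "__main__":
    print(run_controller(reconcile, ["orders-db", "flaky-db"]))
```

The harness knows nothing about databases, and the reconcile function knows nothing about queues or retries; that separation is what makes the framework reusable.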

The vision passed, and the Spirit fled. I had my answer - the point of an Operator is to extend the control plane of Kubernetes to cover anything; to let you bring non-Kubernetes things seamlessly into Kubernetes, and mix the lot together as a higher-level entity. But there was to be no respite. The very next night, upon the hour of One, appeared a jolly Giant, clad in a T-shirt and jeans, who bore a glowing smartphone. “I am the Ghost of Kubernetes Present”, said the Spirit. And on the phone I saw developers all over the world today, writing Operators. Should you join them?



Should I write an Operator?

There are lots of teams nowadays writing Operators for all sorts of things. I’m sure the list on Operator Hub doesn’t even begin to scratch the surface.

If you’re developing on Kubernetes and thinking about whether you should write an Operator, you may have held back from looking into it for a variety of reasons. Maybe it seems too complex, or too much effort. Maybe you’ve been getting along fine without one and doubt there’s a cost-benefit to be had from the investment of time that developing an Operator would take. Maybe you feel your use case is not a good fit for an Operator.

I’ve put these reasons in what I feel is an increasing order of merit. As for effort, these days that should not be a concern. The Operator SDK has greatly simplified the development process, compared to the custom controllers that had to be written in the past. There are lots of resources available to help you get going. The Getting Started pages for the Operator SDK are useful, but read them along with those for the Kubebuilder book. There are plenty of other tutorials out there. Be sure to read the blog post from our colleague Chris O’Quinn, 10 + 1 Things I wish I knew about operators before I wrote one.

You will make your own cost-benefit analysis; but one of the benefits of developing your own Custom Resources is precisely that you can bring a new level of management to resources, which ultimately translates into business value by reducing the management and maintenance costs of your software, avoiding problems stemming from heterogeneous approaches to deployment, and increasing the velocity of software delivery.

The “fit” of your use case is perhaps the most significant reason to be cautious about adopting the use of an Operator. Not every problem will naturally be addressed by writing an Operator. When are CRDs and Operators appropriate? They are designed to work not merely by extending the APIs, but by extending what you might call the philosophy of Kubernetes to other things. Specifically, they are appropriate in a case where the following sorts of statements apply.

  • Your application supports a declarative specification of resources. If you can write down a static specification that describes what a resource you want to manage should look like, then you can use an Operator, invoked by a control loop in Kubernetes, to make the “real world” conform to that specification.
  • You want to capture domain-specific knowledge in code to orchestrate resources within and outside Kubernetes in a coordinated way.
  • You want to encapsulate and control a range of concerns on behalf of your clients or tenants to offer them self-service solutions. Your Operator can take care of matters like RBAC, policy management, network access, out-of-the-box metrics, audit logs, cost management, scalability and HA, constraining authorised and authenticated access to external resources on SaaS and infrastructure platforms - Cloud providers, Salesforce, SAP, NetApp and so on.
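The first statement above is the heart of the pattern, so here is a minimal sketch of it in plain Python. The `FakeStorage` class is a hypothetical stand-in for an external system, and the volume names and sizes are made up; a real Operator would call the external system's API and would be driven by the Kubernetes control loop rather than called by hand.

```python
class FakeStorage:
    """Hypothetical external system: volume name -> size in GB."""

    def __init__(self):
        self.volumes = {}


def reconcile(desired, storage):
    """Make `storage` match the declared `desired` volumes.

    Returns the list of actions taken, so an empty list means the
    real world already conforms to the specification.
    """
    actions = []
    for name, size in desired.items():
        if name not in storage.volumes:
            storage.volumes[name] = size
            actions.append(f"create {name} ({size}GB)")
        elif storage.volumes[name] != size:
            storage.volumes[name] = size
            actions.append(f"resize {name} to {size}GB")
    for name in list(storage.volumes):
        if name not in desired:
            del storage.volumes[name]
            actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    storage = FakeStorage()
    storage.volumes = {"logs": 10, "scratch": 5}  # drifted actual state
    spec = {"logs": 20, "data": 100}              # declared desired state

    print(reconcile(spec, storage))  # first pass converges the state
    print(reconcile(spec, storage))  # second pass finds nothing to do
```

Because the function compares state rather than replaying commands, running it repeatedly is safe, which is what lets a control loop call it on every change or on a timer.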

Some examples of Operators from my own experience:

  • A custom Ingress resource, hiding from client teams the complexities of defining their own Ingress rules, and providing sophisticated network access using a modern ingress solution based on Contour.
  • A storage Operator, integrating NetApp block and filesystem storage seamlessly into a cloud services platform managed with Kubernetes.
  • Operators supporting a multi-tenant developer platform, providing custom resources like a Tenant to provision and manage the resources tenants require in a consistent and secure manner.

The screen dimmed; the second Ghost withdrew.



Where next?

I await with equal eagerness and fear the Ghost of Kubernetes Future. What will it bring? It seems to me that as we have gained comfortable familiarity with Kubernetes we are increasingly extending the scope of our control and orchestration capabilities in new and varied ways. Today we are all supposed to be DevOps engineers - but too often that means not “you build it, you run it”, but rather that everyone has to be an expert both in development and in platform orchestration.

I look forward to simplifications built on things like Operators and API server aggregation, or on tools like Kratix and Crossplane which enable the “operator loop” without having to write actual operators. The platforms we build will become at once increasingly capable, and increasingly simple to use, allowing developers to free their minds from the ops concerns that they must deal with today, and concentrate on writing code that creates business value. CECG’s Developer Platform is perhaps a harbinger of this approach, providing a near-turnkey solution that clients can use to rapidly get up and running on a modern development platform.

I do not know what the future holds. Perhaps the Ghost will show me, perhaps not. But I can tell you this: it’s going to be exciting.



Epilogue

There are many fine points of detail when it comes to CRDs. How do you do a Server Side Apply on a custom resource? What about watching them with informers? Another important topic when it comes to CRDs is version management of the API for the resources, which needs careful thought. But time fails me; I hear a ghostly step in the hall. All that will have to be a story for another day.