
ArgoCD is a popular Kubernetes-based declarative deployment tool, and for good reason! It has a polished Web and CLI-based UI, both supported by a general ArgoCD API and a flexible RBAC model. It’s designed to get you going quickly, leveraging familiar tooling such as Helm or Kustomize to manage the Kubernetes manifests you deploy to your clusters.
flowchart LR
  subgraph NS_ARGOCD["<code>Namespace</code> **argocd**"]
    ArgoCD[ArgoCD]
    ARGO_APP1["<code>Application</code><br>**app-1**"]
    ARGO_APP2["<code>Application</code><br>**app-2**"]
  end
  subgraph NS_APP1["<code>Namespace</code><br>**app-1**"]
    APP1[k8s<br>Manifests]
  end
  subgraph NS_APP2["<code>Namespace</code><br>**app-2**"]
    APP2[k8s<br>Manifests]
  end
  ArgoCD --reconciles--> ARGO_APP1
  ArgoCD --reconciles--> ARGO_APP2
  ARGO_APP1 --defines<br>via helm--> APP1
  ARGO_APP2 --defines<br>via kustomize--> APP2
  ArgoCD -.manages.-> APP1
  ArgoCD -.manages.-> APP2
A typical ArgoCD deployment, reconciling multiple Application objects and the resources they define (via Helm and Kustomize) in different namespaces.
In our experience across multiple clients, we’ve seen some pitfalls that are easily fallen into when introducing ArgoCD, especially when it comes to multi-tenanted platforms. At CECG, whenever we build a platform, introduce a technology, or add a new capability, we have a set of Core Principles that guide our design and implementation, including:
- Tenant Autonomy – Tenants should not be blocked by approvals from Platform Operators unless strictly required. They should have as much autonomy as possible within their owned areas.
- Tenant Isolation – Tenants should never be able to impact other tenants by exercising autonomy within their owned areas.
- Automated Progressive Continuous Delivery – All releases, by tenants and platform operators, should progressively roll out to validate and gain confidence before being deployed in front of end-users.
With these principles in mind, we can discuss the lessons we’ve learned when leveraging ArgoCD and how we navigated the pitfalls.
Tenant Autonomy and Isolation
Often, we see ArgoCD deployed and configured in a way that actively prevents tenants from managing Application and ApplicationSet objects themselves, instead requiring the coordinated use of shared namespaces, typically via an approval-based process. This creates a bottleneck dependency on a central team and breaks tenants’ autonomy. It also limits how far the operation of a core platform can scale, where a lean core platform team should ideally be able to support hundreds of teams and thousands of applications.
flowchart LR
  subgraph TENANTS["Tenant Teams"]
    T1["Team 1"]
    T2["Team 2"]
    TX["..."]
    TN["Team N"]
  end
  R["Shared Namespace<br>Owner"]
  T1 --<code>Application</code><br>Pull Requests--> R
  T2 --<code>Application</code><br>Pull Requests--> R
  TX --<code>Application</code><br>Pull Requests--> R
  TN --<code>Application</code><br>Pull Requests--> R
  R --Responsible for<br>merges into--> REPO["Application Repository"]
A model where shared usage of a namespace is coordinated by a single team creates a bottleneck, limiting how many tenants can be supported.
Due to the way ArgoCD’s RBAC model works, there is a knock-on security issue: all Application objects in a namespace are freely able to acquire the same permissions as any other Application in that namespace. These permissions include which namespaces resources can be managed in and which types of Kubernetes resource can be managed. This breaks tenant isolation: two tenants placing Applications in the same namespace could acquire each other’s permissions, unless whoever approves changes tracks exactly which permissions each tenant should hold on every change.
In our multi-tenanted Kubernetes-based Core Platforms, we typically map a Tenancy to a set of owned Namespaces, and tenants are given full control over namespace-scoped Kubernetes objects within those namespaces. Later in this article, we’ll capitalize on this setup to avoid the aforementioned isolation and autonomy issues.
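As a minimal sketch of what that ownership can look like in Kubernetes RBAC (the tenant group name is a hypothetical placeholder), each owned namespace gets a RoleBinding granting the built-in admin ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-admin
  namespace: a1                  # repeated for each namespace the tenant owns
subjects:
  - kind: Group
    name: tenant-a               # hypothetical tenant group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: admin                    # built-in role granting full namespace-scoped control
  apiGroup: rbac.authorization.k8s.io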
Continuous Deployment vs Delivery
The keen-eyed amongst you may notice that I did not introduce ArgoCD as a GitOps or Continuous Delivery tool specifically, opting to class it as a Kubernetes-based declarative Deployment tool. The reasons for this are:
- ArgoCD, by itself, gives you continuous reconciliation of a single thing you want to ensure is deployed to some cluster, not a full end-to-end delivery with validation and promotions.
- ArgoCD can be leveraged entirely without any Git repos being involved.
This is not to diminish ArgoCD in any way; it is very capable within its intended functionality, and the usage of Git is simply one mode of operation it supports amongst many useful options. To achieve end-to-end Continuous Delivery with automated, progressive rollout, we need additional orchestration. This orchestration signals when ArgoCD should update a given deployment in a given cluster in the context of a rollout across multiple deployments towards its end users.
A common approach we see is introducing ArgoCD and having automated pipelines finish by updating a single Application version after a successful merge, which then immediately rolls out to all clusters (either via a GitOps repo, or via an API apply). The takeaway should be that ArgoCD alone will not give you Continuous Delivery, but it does provide powerful deployment functionality as part of a Continuous Delivery end-to-end.
Now that we’ve outlined the principles we use to approach Platform Engineering and identified some of the pitfalls we’ve seen come up, we can leverage the tools ArgoCD has available to solve them!
Recommendations for Configuring ArgoCD
This brings us on to the final pieces of the picture:
- Leveraging AppProjects for multi-tenancy with isolation and autonomy
- Configuring ArgoCD to watch Application and ApplicationSet objects in all namespaces
- Locking down the default project
Argo Projects to Provide Isolation and Autonomy
In ArgoCD, when you create an Application object, it is always associated with an ArgoCD Project (an AppProject object). If you specify no project (or an invalid one), your application will be assigned to the default project. ArgoCD projects allow you to specify a number of interesting constraints on their associated applications, including:
- A list of namespaces the Application objects must reside in
- A list of namespaces and clusters the Application objects can manage resources in
- An allow/blocklist of namespace-scoped resources that Application objects can manage
- An allowlist of cluster-scoped resources that Application objects can manage
Given we typically capture a tenant as owning a set of namespaces (and a namespace is never actively owned by more than one tenant), we can directly map this into the creation of one AppProject per tenant which:
- Allows Application (and ApplicationSet) objects to reside in any namespace owned by the tenant
- Allows Application objects to manage Kubernetes resources in any namespace owned by the tenant
- Allows the tenant team to manage only permitted namespace- and cluster-scoped Kubernetes objects
The image below shows an example of what this might look like with two tenants (“Tenant A” and “Tenant B”) each with two namespaces (“A1”, “A2”, “B1”, “B2”):
flowchart TD
  subgraph ARGOCD[ArgoCD Namespace]
    direction LR
    PROJECT_A[Tenant A<br>AppProject]
    PROJECT_B[Tenant B<br>AppProject]
  end
  A[Tenant A]
  subgraph NS_A_1[Namespace A1]
    APP_A1[Application]
  end
  subgraph NS_A_2[Namespace A2]
    APP_A2[Application]
  end
  A -- owns --> NS_A_1
  A -- owns --> NS_A_2
  APP_A1 -- belongs to --> PROJECT_A
  APP_A2 -- belongs to --> PROJECT_A
  B[Tenant B]
  subgraph NS_B_1[Namespace B1]
    APP_B1[Application]
  end
  subgraph NS_B_2[Namespace B2]
    APP_B2[Application]
  end
  B -- owns --> NS_B_1
  B -- owns --> NS_B_2
  APP_B1 -- belongs to --> PROJECT_B
  APP_B2 -- belongs to --> PROJECT_B
Multiple Applications referencing the respective AppProject for their tenancy.
In this scenario:
- The Tenant A AppProject lists namespaces A1 and A2 as the permitted namespaces for Applications to reside in, and for Kubernetes manifests to be managed in
- The Tenant B AppProject lists namespaces B1 and B2 as the permitted namespaces for Applications to reside in, and for Kubernetes manifests to be managed in
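As a concrete sketch, the Tenant A AppProject could look like the following (namespace and server values are illustrative, and sourceRepos would typically be tightened to the tenant’s own repositories):
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: tenant-a
  namespace: argocd
spec:
  # Namespaces in which the tenant's Application/ApplicationSet objects may reside
  sourceNamespaces:
    - a1
    - a2
  # Namespaces (and clusters) in which those Applications may manage resources
  destinations:
    - server: https://kubernetes.default.svc
      namespace: a1
    - server: https://kubernetes.default.svc
      namespace: a2
  # Sources the tenant may deploy from
  sourceRepos:
    - "*"
  # No cluster-scoped resources permitted unless explicitly listed
  clusterResourceWhitelist: []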
Assuming Tenants A and B can only create Kubernetes objects in their respective namespaces (enforced by Kubernetes RBAC), any Application object created in those namespaces must reference an AppProject which accepts applications from that namespace and only tries to manage resources in the tenant’s namespaces.
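For illustration, a valid Application for Tenant A might look like this (the application name, chart source, and version are hypothetical):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-a1
  namespace: a1              # resides in a tenant-owned namespace
spec:
  project: tenant-a          # references the tenant's AppProject
  source:
    repoURL: https://github.com/acme/app-a1   # hypothetical chart repository
    targetRevision: 1.2.3
    path: chart
  destination:
    server: https://kubernetes.default.svc
    namespace: a1            # manages resources only in a tenant-owned namespace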
If an Application in namespace A1 attempts to reference the Tenant B AppProject, ArgoCD will see that the Tenant B project only accepts Application objects residing in the B1 or B2 namespaces, and will report an error; it will not even attempt a deployment, as doing so would violate the permitted namespace configuration.
Each tenant’s AppProject must be kept up to date with the dynamic list of namespaces they own. We will tackle this problem later in the A new problem: keeping Tenant namespaces up to date section.
Watching Applications in all Namespaces
Configuring ArgoCD to watch all namespaces depends on how you install it. If you’re using the Helm chart, you can pass in some extra configuration via the chart values:
configs:
  params:
    # -- Configure the namespaces the AppSet and App controllers watch
    applicationsetcontroller.namespaces: "*"
    application.namespaces: "*"
applicationSet:
  # -- Create cluster role bindings allowing AppSet controller to read from all namespaces
  allowAnyNamespace: true
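Assuming these values are saved as values.yaml and you are using the community argo-cd chart from the Argo Helm repository, the install/upgrade might look like:
# Add the Argo Helm repository and apply the values above
helm repo add argo https://argoproj.github.io/argo-helm
helm upgrade --install argocd argo/argo-cd \
  --namespace argocd --create-namespace \
  --values values.yaml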
Locking down the default project
Earlier, we mentioned that all Applications must reference an AppProject, and that if the reference is missing or invalid, the default project will be used. This default project is created by ArgoCD on first startup if one does not already exist. The default project is configured to:
- Allow Applications to reside in any namespace
- Allow Applications to manage resources in any namespace in any cluster
- Allow Applications to manage all namespace-scoped resources
- Allow Applications to manage all cluster-scoped resources
This is a problem, as it allows anybody to immediately escape the confines of their tenant-based AppProject and use ArgoCD to manage any resource it has permission to. Thankfully, you can override this default project, and ArgoCD will respect the changes because it only creates the default project when one is missing. We update the default project to the following:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default
  namespace: argocd
spec:
  clusterResourceWhitelist: []
  destinations: []
  sourceRepos: []
  sourceNamespaces: []
This sets the list of source and destination namespaces to an empty list (i.e., no namespaces permitted). Now, any Application which ends up referencing the default project will be flagged as violating the permitted namespaces, and ArgoCD will refuse to even try to synchronize its resources.
ArgoCD as part of a Continuous Delivery flow
Now that we have a way to configure ArgoCD for multi-tenant operation, with autonomy and isolation guarantees, we can think about how to build an end-to-end Continuous Delivery flow incorporating ArgoCD as the deployment tool. For this, we assume the following are in place:
- Each Tenant has their own set of automated pipelines (e.g., Jenkins, GitHub Actions, Tekton)
- These pipelines have access to the namespaces owned by the Tenant and can authenticate and authorize to manage them
- These pipelines have access to an OCI registry to be able to publish container images and Helm charts
- All Kubernetes clusters have access to pull container images and Helm charts from the OCI registry
With these in place, we can imagine a simple Continuous Delivery flow like the following:
- PR raised, tested, approved, and merged
- Automated pipeline publishes a new version of the container images and Helm charts (e.g., v2.0.0)
- On demand, or at a regular interval, the most recent version $v is deployed to **dev** as an ArgoCD Application
- When the application is **synchronized** and healthy, launch a test suite against stubbed dependencies
- Wait for all tests to pass, optionally wait to see if any alerts are fired
- If everything is OK: immediately, or after a configured delay, promote version $v to integration as an ArgoCD Application
- When the application is synchronized and healthy, launch a test suite against real dependencies
- Wait for all tests to pass, optionally wait to see if any alerts are fired
- If everything is OK: immediately, or after a configured delay, promote version $v to canary as an ArgoCD Application
- When the application is synchronized and healthy, wait to see if any alerts are fired
- If everything is OK: immediately, or after a configured delay, promote version $v to production as an ArgoCD Application
- When the application is synchronized and healthy, $v is successfully rolled out
In CECG we call this a **Path to Production** (P2P), and we use similar approaches to manage both Infrastructure and Workloads. You’ll notice that the way ArgoCD is integrated into this end-to-end follows a repeated, identical pattern:
- Deploy version $v to some environment $e by updating its Application
- Wait for the Application to be **synchronized** and healthy
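As a minimal sketch of that repeated pattern in a pipeline (the application and namespace names are illustrative, and we assume kubectl v1.23+ for jsonpath-based waits):
#!/usr/bin/env bash
set -euo pipefail

# Promote version $2 to environment $1 by patching its Application,
# then wait for ArgoCD to report it Synced and Healthy.
promote() {
  local env="$1" version="$2"
  kubectl patch application "my-app-${env}" -n tenant-a --type=merge \
    -p "{\"spec\":{\"source\":{\"targetRevision\":\"${version}\"}}}"
  # A robust pipeline would also verify .status.sync.revision matches ${version}
  kubectl wait "application/my-app-${env}" -n tenant-a \
    --for=jsonpath='{.status.sync.status}'=Synced --timeout=5m
  kubectl wait "application/my-app-${env}" -n tenant-a \
    --for=jsonpath='{.status.health.status}'=Healthy --timeout=10m
}

promote dev v2.0.0
# ...run tests and check alerts, then:
promote integration v2.0.0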
Deploy a version to an environment
Deploying a specific version of your application in this scenario involves updating the spec.source.targetRevision property of an Application object to the new version you have just published. This triggers ArgoCD to pull the new version of your Helm chart and use it to render the Kubernetes manifests into your desired namespace(s), which also reference the new container image versions your pipeline has published.
The simplest way to trigger an update to your Application object is to have your pipeline interact directly with the Application object, either via kubectl or the argocd CLI, like this:
# Patch the desired version into an Application object
kubectl patch application <name> -n <namespace> --type='merge' -p '{"spec":{"source":{"targetRevision":"<new-version>"}}}'
# Update the entire Application object
kubectl apply -f application.yml
# Using the ArgoCD CLI
argocd app set <name> --app-namespace <namespace> --revision <new-version>
Any of these options will allow you to patch the targetRevision property directly, triggering a new deployment.
The alternative to directly managing these objects is to commit an update to your Application object to some GitOps repo which ArgoCD is configured to pull in your target clusters. This approach usually entails following what ArgoCD calls the “App of Apps” pattern, in which:
- All of your Application objects are stored in some repo like acme/gitops under a folder named something like apps/
- You create a “root” Application which automatically deploys all YAML files found under apps/ in the acme/gitops repo to your cluster (a sketch follows below)
- Any changes to the YAML files under the apps/ folder are pulled from the repo into the cluster by ArgoCD
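A sketch of such a “root” Application, reusing the hypothetical acme/gitops repo and apps/ folder from above:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: tenant-a             # an AppProject that permits this source and destination
  source:
    repoURL: https://github.com/acme/gitops
    targetRevision: main
    path: apps                  # folder containing the Application YAML files
    directory:
      recurse: true
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd           # where the child Application objects are created
  syncPolicy:
    automated:
      prune: true               # remove Applications whose files are deleted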
The main differences between directly managing Application objects and managing them via a Git repository are:
- Your pipelines need to be given additional permissions to commit to the GitOps repo
- Your pipelines need to factor in the extra time taken for a Git commit to be seen by ArgoCD and result in an update to the Application object on the cluster
For both approaches, the next step is to wait for an Application to be synchronized and healthy.
Application sync and health status
After an update is triggered to an Application object, we need to give ArgoCD time to do its deployments, and wait for any managed Kubernetes resources to reach a ready state. All of this information is contained in the status property of any Application object, particularly:
- The sync revision of an Application (.status.sync.revision) details the version of the Helm chart ArgoCD is trying to render
- The sync status of an Application (.status.sync.status) details the result of the attempt to use Helm to render Kubernetes resources, with a state of Synced, OutOfSync or Unknown
- The health status of an Application (.status.health.status) details the aggregated health of all Kubernetes resources managed by the current version of the application, with a state of Healthy, Degraded, Progressing, Suspended, Missing or Unknown
An Application only reports as Healthy once all of its managed resources (e.g., Deployment, Ingress, Pod) are healthy. Extensible mechanisms exist where you can configure ArgoCD with additional rules to assess the health of custom Kubernetes types; the ArgoCD docs provide guidance.
Exactly how you implement interrogation of these properties will depend on your chosen technologies/languages, but this gives an overview of how you can assess sync and health status as part of pipeline execution.
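For example, a pipeline step could read all three properties in one kubectl call (the application and namespace names are illustrative):
# Read the rendered revision, sync status, and health status of an Application
kubectl get application my-app -n tenant-a \
  -o jsonpath='{.status.sync.revision} {.status.sync.status} {.status.health.status}'
# Illustrative output: 2.0.0 Synced Healthy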
All of this together gives you a Path to Production where a version is progressively rolled out across environments and validated through different testing phases to gain confidence before it releases in front of end-users.
A new problem: keeping Tenant namespaces up to date
One issue introduced by the approach outlined in this post is that we now have a collection of AppProject objects that need to be kept up to date as tenant namespaces come and go. How frequently this needs to happen depends a lot on how much autonomy you’ve given tenants within your platform. For a variety of reasons, we have clients at both ends of the spectrum, from “every namespace must be manually approved” to “tenants can create as many namespaces as they want without approval.”
Where namespaces can come and go quite quickly, we identified that it would be useful to have a constant, reconciliation-based approach to identifying tenants and the namespaces they own, and using that to keep the AppProject objects up to date. To this end, we created an operator named argocd-tenant-project-manager which does the following:
- Scans for Namespaces that define a tenancy, based on label/annotation-based selectors (e.g., cecg.io/is-tenant: true)
- Extracts the name of the tenant from the matched namespace (using the name of the namespace, or the value of a label/annotation)
- Scans for Namespaces that belong to a tenancy, based on label/annotation-based selectors (e.g., cecg.io/tenant: tenant-A)
- Ensures an AppProject with the same name as the tenant exists, and is configured with all of the namespaces owned by that tenant
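For illustration, a tenancy-defining namespace and one of its owned namespaces might be labeled as follows (namespace names are hypothetical; the label keys are the example selectors above):
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  labels:
    cecg.io/is-tenant: "true"   # marks this namespace as defining a tenancy
---
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a-dev
  labels:
    cecg.io/tenant: tenant-a    # marks this namespace as owned by tenant-a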
The controller was built using Kubebuilder and allows us to quickly leverage ArgoCD at our clients in line with our principles of tenant autonomy and isolation. Look out for a post with us open-sourcing the operator soon!
Summary
Alright, let’s wrap this up! First, we dove into using ArgoCD in a multi-tenanted platform. We learned that out of the box, ArgoCD’s default settings can be at odds with our principles of tenant autonomy and isolation.
To fix this, we looked at using ArgoCD Projects to create boundaries. Each tenant gets their own AppProject, which defines what namespaces they can deploy to and what resources they can manage. This keeps everyone in their own lane. We also made sure ArgoCD watches for applications in all namespaces and locked down the default project to prevent any sneaky escapes.
We also covered that while ArgoCD is great for deployments, it’s not a full Continuous Delivery solution on its own. We need to integrate it into our automated workflows, like Jenkins or GitHub Actions. That means triggering ArgoCD to deploy new versions to different environments (dev, integration, etc.) and waiting for it to sync and confirm everything’s healthy before moving on to the next stage.
Finally, we touched on keeping those AppProjects up to date. If you have tenants creating namespaces all the time, you’ll need a way to automatically reconcile and update the projects. That’s where our argocd-tenant-project-manager operator comes in, keeping everything in sync based on namespace labels and annotations.
In a nutshell, we’ve learned how to make ArgoCD play nice in a multi-tenanted setup, ensuring tenants have the freedom they need while staying nicely isolated. Our full CECG guidance on leveraging ArgoCD in your organization also covers topics beyond the scope of this post, including production installation and configuration, high availability, SSO integration and using ArgoCD’s custom RBAC.
If you’d like to know more about how we build Developer Platforms, Paths to Production, how we evaluate Key Technologies like ArgoCD, or how we could help you with anything Platform Engineering, reach out!