Platform Lifecycle Management with Flux & Concourse

Author: Thoiba Thoudam | Posted on: January 3, 2024


Platform Lifecycle Management defines and executes the process for deploying and promoting changes across multiple environments, from CI and Test to Dev, PreProd, and ultimately Production. A well-implemented platform lifecycle ensures that changes to platform services such as logging and monitoring infrastructure, ingress controllers, and cluster add-ons are deployed in a dependable, uniform, continuous, and automated fashion. In other words, it describes the orchestrated process by which changes to these platform services propagate through the stages of delivery.



GitOps and Delivery Management

To manage our platform resources smoothly, we need a tool that monitors Git and automatically synchronises the state of our clusters with the desired version of the code in the Git repository. We also need a pipeline that can trigger tests and checks (change verification) and update the tags of the Git repository (version updates) as part of continuous integration and deployment. Together, the two form a cohesive system that provides the efficiency, reliability, and level of automation required for the lifecycle management of our platform resources.



GitOps with FluxCD

GitOps is an automation approach that streamlines application deployment and infrastructure provisioning. In this methodology, Git, an open-source version control system, is the authoritative source for declarative infrastructure and application configuration. GitOps enables continuous deployment (CDep) and continuous delivery (CDel) for Kubernetes environments. GitOps tools also monitor the clusters they manage, detect configuration drift, and automatically reconcile any state that deviates from the Git repository.
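To make drift detection and reconciliation concrete, here is a minimal sketch of the idea in Python. It is not how Flux is implemented: the state model (a resource name mapped to a version string), the function names, and the sample resources are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified model: resource name -> version string.
State = Dict[str, str]


def detect_drift(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, resource) pairs needed to make `actual` match `desired`."""
    actions: List[Tuple[str, str]] = []
    for name, version in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != version:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            # Resources absent from Git are removed, comparable to pruning.
            actions.append(("delete", name))
    return actions


def reconcile(desired: State, actual: State) -> State:
    """Apply the detected actions and return the resulting cluster state."""
    new_state = dict(actual)
    for action, name in detect_drift(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = desired[name]
    return new_state


# Illustrative data only: Git is the desired state, the cluster is the actual state.
desired = {"ingress-controller": "v2", "logging": "v1"}
actual = {"ingress-controller": "v1", "legacy-agent": "v1"}

print(detect_drift(desired, actual))
print(reconcile(desired, actual) == desired)
```

The point of the sketch is that Git is only ever read, never written, by the reconciler: the cluster is always moved towards the repository, not the other way round.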

Flux is a popular GitOps tool that offers an extensive set of GitOps features. It has excellent support for multiple cloud providers and Git implementations, and built-in support for Helm repositories and Kubernetes manifests. Canary deployments are available through its companion project, Flagger. Flux also benefits from a thriving user community, which contributes to its versatility and widespread adoption.



Delivery Management with Concourse

We need a tool that lets us build and run declarative, automated workflows defined as code-based pipelines. These workflows should cover schema verification, linting, and triggering and reporting on a range of tests, from unit and integration tests to non-functional tests (NFTs) and cluster validation tests. They should also manage versioning and tagging in the infrastructure Git repository.

Concourse is a continuous integration and deployment tool built around pipelines. A pipeline consists of a set of jobs; each job has a build plan that specifies its input resources and the actions to run when those resources change. The pipeline configuration is written in YAML and stored in a Git repository. Concourse also provides a user-friendly web interface for visualising and tracking the progress of builds.
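The relationship between resources, jobs, and build plans is easier to see in a small example. The sketch below builds a one-job pipeline as a Python dictionary that follows the general shape of a Concourse pipeline (resources, jobs, and a plan of get and task steps) and prints it as JSON, which is also valid YAML. The resource name, repository URL, job name, container image, and the placeholder lint command are all hypothetical; this is not our actual pipeline.

```python
import json

REPO = "cluster-repo"  # hypothetical resource name

pipeline = {
    "resources": [
        {
            "name": REPO,
            "type": "git",
            # Placeholder URL, not a real repository.
            "source": {
                "uri": "https://example.com/platform/cluster-repo.git",
                "branch": "main",
            },
        }
    ],
    "jobs": [
        {
            "name": "verify-change",
            "plan": [
                # A new commit on the resource triggers the job.
                {"get": REPO, "trigger": True},
                {
                    "task": "lint-manifests",
                    "config": {
                        "platform": "linux",
                        "image_resource": {
                            "type": "registry-image",
                            "source": {"repository": "alpine"},
                        },
                        "inputs": [{"name": REPO}],
                        # Placeholder command; a real task would run a linter here.
                        "run": {"path": "sh", "args": ["-c", "echo 'lint here'"]},
                    },
                },
            ],
        }
    ],
}


def triggering_inputs(job: dict) -> list:
    """List the resources whose changes start the given job."""
    return [
        step["get"]
        for step in job["plan"]
        if "get" in step and step.get("trigger")
    ]


print(triggering_inputs(pipeline["jobs"][0]))
print(json.dumps(pipeline, indent=2))
```

Keeping the pipeline as plain data like this is what allows it to live in Git and be reviewed like any other change.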



Implementation

Flux Repository Structure

Flux supports several ways of structuring the Git repository, as described in its documentation. We follow a monorepo approach for organising our platform services, with one variation: separate repositories for clusters (environments) and infrastructure. In this approach, we store all Kubernetes manifests, charts, and dependencies for an app in the Infrastructure repo, and only environment-specific Kustomizations in the Cluster repo.



Cluster Repository

Infrastructure Repository
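As a rough illustration of the split between the two repositories, the sketch below generates the kind of objects a cluster repository might hold for one environment: a Flux GitRepository that pins the infrastructure repository to a tag, and a Flux Kustomization that applies one path from it. The repository URL, tag, object names, path, and intervals are all hypothetical, and pinning by tag is an assumption based on the version tagging described earlier rather than a copy of our manifests.

```python
import json


def git_repository(name: str, url: str, tag: str) -> dict:
    """Build a Flux GitRepository source pinned to a Git tag."""
    return {
        "apiVersion": "source.toolkit.fluxcd.io/v1",
        "kind": "GitRepository",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {"interval": "5m", "url": url, "ref": {"tag": tag}},
    }


def kustomization(name: str, source_name: str, path: str) -> dict:
    """Build a Flux Kustomization that applies one path from a source."""
    return {
        "apiVersion": "kustomize.toolkit.fluxcd.io/v1",
        "kind": "Kustomization",
        "metadata": {"name": name, "namespace": "flux-system"},
        "spec": {
            "interval": "10m",
            "sourceRef": {"kind": "GitRepository", "name": source_name},
            "path": path,
            "prune": True,  # remove cluster objects that disappear from Git
        },
    }


# Hypothetical values for a single environment.
source = git_repository(
    name="infrastructure",
    url="https://example.com/platform/infrastructure.git",
    tag="v1.4.0",
)
ingress = kustomization(
    name="ingress-controller",
    source_name=source["metadata"]["name"],
    path="./ingress-controller",
)

for manifest in (source, ingress):
    print(json.dumps(manifest, indent=2))
```

With a layout like this, promoting a release to an environment amounts to changing one tag in that environment's part of the cluster repository.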





Delivery Management with Concourse

We use trunk-based development with the main branch as our central codebase. Changes are introduced in feature branches and then merged into main. All branches other than main are short-lived and are deleted once their associated pull request is merged. New releases are first delivered to the CI environment through a pull request against the cluster repository. When this pull request is merged into main, the Concourse pipeline automatically opens another pull request for the next environment. This cycle repeats until the changes are deployed to production. The workflow depicted below illustrates this progression.



[Figure: pull request promotion workflow from the CI environment to Production]
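The promotion logic can be sketched in a few lines of Python. This is only an illustration of the idea, not the Concourse pipeline itself: the environment order is assumed from the list of environments in the introduction, and the function names, branch naming scheme, and pull request fields are hypothetical.

```python
from typing import Optional

# Assumed promotion order, taken from the environments named in the introduction.
ENVIRONMENTS = ["ci", "test", "dev", "preprod", "production"]


def next_environment(current: str) -> Optional[str]:
    """Return the environment that follows `current`, or None after production."""
    index = ENVIRONMENTS.index(current)
    if index + 1 < len(ENVIRONMENTS):
        return ENVIRONMENTS[index + 1]
    return None


def promotion_request(merged_env: str, version: str) -> Optional[dict]:
    """Describe the pull request to open after a merge for `merged_env`."""
    target = next_environment(merged_env)
    if target is None:
        return None  # production was the last step, nothing left to promote
    return {
        "base": "main",
        "head": f"promote/{target}/{version}",  # hypothetical branch naming
        "title": f"Promote {version} to {target}",
    }


print(promotion_request("ci", "v1.4.0"))
print(promotion_request("production", "v1.4.0"))
```

Each merge produces at most one new pull request, so a release moves one environment at a time and stops on its own once it reaches production.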



Automating the creation of pull requests for each environment reduces the manual work involved in releasing new platform features. We use a single Git version for the entire cluster/environment, which simplifies version management because individual platform services do not carry their own versions. The trade-off is that it becomes harder to manage the lifecycles of platform services independently, especially as requirements evolve.

Our current strategy for addressing this is to create a separate Flux Git repository for specific platform services, which removes some of these hurdles. In certain scenarios, however, managing versions per platform service might be the more flexible solution.



Conclusion

Together, FluxCD and Concourse provide a robust, reliable, and efficient approach to managing the lifecycle of our platform resources. By combining the GitOps capabilities of Flux with Concourse’s pipeline-as-code functionality, we get a highly automated process for deploying platform changes across environments. Concourse pipelines enforce validation and testing before versions are updated in each environment, while the Flux reconciliation process ensures that the state of our clusters matches our Git repository. This dual mechanism minimises manual intervention during platform changes and supports both efficiency and stability.

Challenges remain, particularly with the PR merge approach and the single-version model, but the system is designed to adapt over time. As more platform services are added and more developers contribute to the repository, we expect the setup to mature to meet the evolving needs of our platform.