Latest Blogs

Discover insights, tutorials, and thoughts from our community. Stay updated with the latest trends in platform engineering, DevOps, and software development.

Crossplane in the Trenches: Deletion Pitfalls and How to Prevent Them
Dilyan Kostov · 7 Min Read

Learn about Crossplane's deletion policies and how improper handling can lead to orphaned cloud assets. This guide covers Kubernetes Admission Protection, Crossplane Delete Policies, Usages for Dependency Ordering, and more to help you manage your infrastructure safely.

Crossplane, Kubernetes (+5 more)
August 12, 2025
How We Built Our Core Platform: A Frontend Technology Journey
Jescard Tamer · 25 Min Read

The journey from Backstage to Next.js in building CECG's Core Platform dashboard, exploring technology choices, performance optimization, and the challenges of creating a complex platform interface.

Frontend, Next.js (+5 more)
June 11, 2025
mise vs. nix-shell
CECG Engineering · 8 Min Read

A deep dive into mise and nix-shell, two modern tools for managing development environments. This post compares their philosophies, features, and practical applications to help you choose the right solution for ensuring consistency and reproducibility in your projects.

DevEnv, mise (+5 more)
May 9, 2025
Leveraging ArgoCD in Multi-tenanted Platforms
Derek Mortimer · 14 Min Read

Learn how to leverage ArgoCD in multi-tenanted platforms while maintaining tenant autonomy and isolation. This post covers best practices for configuring AppProjects, managing continuous delivery flows, and automating tenant namespace updates to avoid common pitfalls.

ArgoCD, Multi-tenancy (+5 more)
April 9, 2025
semver-utils: Streamlined semantic versioning from pipelines
Derek Mortimer · 6 Min Read

Introducing semver-utils, an open-source tool for streamlined semantic versioning from automated pipelines. This post covers how to fetch and create semantic version tags in Git repositories, manage multiple version sets with prefixes, and integrate with CI/CD workflows.

Semantic Versioning, Git (+5 more)
March 30, 2025
Multi-Node LLM Serving Using sig LWS and vLLM
Jingkai He · 9 Min Read

Learn how to serve large language models on multiple Kubernetes nodes using sig LWS and vLLM. This guide covers the challenges of multi-node inference, the architecture of LeaderWorkerSet, and practical tips for deployment, observability, and efficient model loading.

LLM, Kubernetes (+6 more)
March 24, 2025
Exploring AIOps
CECG Engineering · 4 Min Read

Explore how AIOps and Grafana Cloud are transforming IT operations from reactive to proactive. This post details our journey through forecasting CPU usage and enhancing incident investigation, sharing key lessons and future directions in intelligent IT systems.

AIOps, Grafana (+5 more)
March 19, 2025
Deploying Local LLMs for Sentiment Analysis in Platform Engineering
Senna Semakula · 4 Min Read

Explore the challenges and trade-offs of deploying local LLMs for sentiment analysis in a platform engineering context. This post covers resource constraints, model accuracy, observability, and the build vs. buy decision to help platform teams integrate AI-powered observability into their workflows.

LLM, Sentiment Analysis (+5 more)
March 19, 2025
Comparison: Kubeadmiral and Karmada
Derek Mortimer · 15 Min Read

A detailed comparison of Kubeadmiral and Karmada for multi-cluster Kubernetes management. This post explores their architectures, dynamic placement capabilities, and operational complexities to help you choose the right federation solution.

Kubernetes, Multi-cluster (+5 more)
March 17, 2025
Evaluating Large Scale Solutions for Multi Tenant Metrics System
Korhan Ozturk · 10 Min Read

A comprehensive evaluation of metrics solutions for multi-tenant Kubernetes platforms, comparing Prometheus + Thanos, Victoria Metrics, and Grafana Mimir to address scalability and resource efficiency challenges.

Metrics, Multi-Tenancy (+5 more)
November 4, 2024
Supporting private service access in GCP from a multi-tenanted kubernetes platform
Tiago Alves · 4 Min Read

Explore how to support private service access in GCP from a multi-tenanted Kubernetes platform, comparing IAM Auth & Connectivity with Private Service Access (PSA) to help you choose the right solution for your infrastructure.

GCP, Kubernetes (+5 more)
October 25, 2024
Serverless Exodus to GKE Autopilot
Jingkai He · 6 Min Read

Explore the journey of migrating a high-traffic ad decision server from Cloud Run to GKE Autopilot. This post details the performance challenges with serverless, the benefits of a VM-based solution, and why GKE Autopilot became the ideal middle ground for scalability, cost-efficiency, and manageability.

GKE Autopilot, Cloud Run (+5 more)
September 13, 2024
Automated Landing Zones in GCP Organizations
Derek Mortimer · 13 Min Read

Learn how to automate Landing Zones in GCP Organizations.

GCP, Landing Zones (+1 more)
August 12, 2024
How We Execute Greenfield Projects
Senna Semakula-Buuza · 6 Min Read

Learn our four-stage model for executing greenfield projects: discovery, planning, execution, and feedback. This post unveils our strategy for achieving high client satisfaction and making critical decisions efficiently.

Project Management, Greenfield (+5 more)
July 15, 2024
Deep Dive into Policy Controllers and their impact on Cluster Management
Andreas Ttofi · 18 Min Read

A comprehensive comparison of Kubernetes policy engines including OPA Gatekeeper, Kyverno, Kubewarden, and JsPolicy, exploring their architectures, strengths, and use cases for enforcing organizational standards.

Kubernetes, Policy Engines (+6 more)
July 3, 2024
Best-Practice Security, Automation & Operability, with mTLS
CECG Engineering · 10 Min Read

Discover how we designed a robust authentication approach which can flexibly handle a diverse range of communication protocols and which scales efficiently.

Security, mTLS (+6 more)
June 10, 2024
How onboarding at CECG is different
Ilia Chernov · 6 Min Read

A personal account of CECG's unique onboarding experience, featuring a comprehensive 1-3 month bootcamp that transforms software developers into platform engineers through hands-on IDP projects and mentorship.

Onboarding, Platform Engineering (+4 more)
May 20, 2024
Scaling an http stub for load testing
Sergei Sizov · 6 Min Read

Learn how to scale an HTTP stub for high-performance load testing using WireMock in Kubernetes. This post covers strategies for horizontal scaling, handling dynamic mappings with StatefulSets, and configuring a load generator for effective non-functional testing.

Load Testing, WireMock (+5 more)
April 25, 2024
Identity-based Authentication for a Developer Platform
Tomasz Bartosiewicz · 7 Min Read

Learn how we implemented identity-based authentication for a developer platform using Google Identity-Aware Proxy (IAP) on GKE. This post covers our technical approach, from ingress architecture to overcoming IAP limitations, to provide secure, seamless access to internal services.

Authentication, GCP (+6 more)
April 19, 2024
Building the Foundation: Our Take on Training
Savvas Michael · 6 Min Read

Imagine acquiring sought-after engineering skills that could significantly boost your expertise and confidence, in a matter of weeks.

Training, Platform Engineering (+6 more)
March 29, 2024
Navigating the Maze: Challenges in Seeking Support in Big Tech Companies
Andreas Ttofi · 8 Min Read

Explore the challenges of seeking support in big tech companies and the strategies to enhance the support experience. This post delves into the core issues faced by support teams and users of Internal Development Platforms (IDPs), highlighting solutions like comprehensive training, proactive support, and community-driven innovations.

Tech Support, Big Tech (+5 more)
March 7, 2024
Interfaces for Internal Developer Platforms
Ilia Chernov · 5 Min Read

Explore how Internal Developer Platforms (IDPs) streamline common development processes through interfaces like CLI tools, developer portals, and platform orchestrators. This post examines the pros and cons of each approach to help you optimize developer workflows.

IDP, Platform Engineering (+5 more)
March 6, 2024
Crossplane: the good, the bad and the ugly
Simon Aquino · 10 Min Read

A comprehensive review of Crossplane after one year of intensive professional use, exploring its strengths in infrastructure automation and Kubernetes integration, alongside its challenges and limitations.

Crossplane, Infrastructure as Code (+3 more)
February 27, 2024
Case Study: Seamless Cross-cloud Application Deployments
Derek Mortimer · 15 Min Read

Learn how CECG helped a client achieve seamless cross-cloud deployments between AWS and GCP, enabling teams to deploy workloads with just one line of YAML while building automated infrastructure pipelines.

Case Study, Multi-Cloud (+5 more)
February 14, 2024
Here’s how the 1000 users of the developer platform got introduced to autoscaling
Phivos Phivou · 5 Min Read

Learn how we successfully introduced 1000+ platform users to Horizontal Pod Autoscaling through an interactive knowledge platform with hands-on learning modules.

Platform Engineering, Kubernetes (+4 more)
February 6, 2024
AWS Landing Zone: The Art of Taking Off with a Low Code Solution
Senna Semakula-Buuza · 8 Min Read

Discover how to streamline landing zone creation from the ground up using a low-code approach, optimising efficiency and reducing development complexity.

AWS, Landing Zone (+5 more)
February 1, 2024
Embracing Innovation in CI/CD Pipelines: A Shift From Traditional Practices
Neofytos Zacharia · 12 Min Read

Explore emerging trends in CI/CD pipelines that challenge conventional processes, advocating for script-based approaches, local execution capabilities, and tools like Dagger for more dynamic and adaptable workflows.

CI/CD, DevOps (+4 more)
January 31, 2024
Case Study: Security for a large multinational media provider's Internal Developer Platform
Tiago Alves · 5 Min Read

The client is a large multinational that operates in different parts of the world with different products, requiring a flexible solution with configurable rules and integrations per region.

Case Study, Security (+6 more)
January 8, 2024
Platform Lifecycle Management with Flux & Concourse
Thoiba Thoudam · 4 Min Read

Explore how to implement robust Platform Lifecycle Management using FluxCD for GitOps and Concourse for delivery management. This post details an automated, reliable, and continuous approach to deploying platform services across multiple environments.

Platform Engineering, GitOps (+5 more)
January 3, 2024
Unravelling Kubernetes Networking: A Comparative Guide to Choosing the Best CNI
Andreas Ttofi · 12 Min Read

A comprehensive guide comparing different Container Network Interfaces (CNIs) in Kubernetes, including Cilium, Calico, Weave, and Flannel, with practical insights on CNI chaining and real-world applications.

Kubernetes, CNI (+4 more)
December 29, 2023
Integrating Kubernetes and Vault: The options
Tomasz Bartosiewicz · 11 Min Read

Explore four common mechanisms for integrating Kubernetes with HashiCorp Vault for secret management. This post compares the External Secrets Operator, Kubernetes Secrets Store CSI Driver, Vault Secrets Operator, and Vault Agent, weighing their pros and cons to help you choose the right solution for your platform.

Kubernetes, Vault (+5 more)
December 21, 2023
What’s the point of Operators and CRDs? A seasonal reflection
Geoff Macartney · 7 Min Read

Explore the purpose and value of Kubernetes Operators and CRDs through a seasonal reflection. This post explains how Operators extend the Kubernetes control plane to manage both internal and external resources, simplifying complex application deployments and integrations.

Kubernetes, Operators (+5 more)
December 18, 2023
Why we are building our own Developer Platform
Christopher Batey · 6 Min Read

CECG was founded by, and is made up of, the most senior software engineers who want to get things done quickly.

Platform Engineering, Developer Platform (+5 more)
December 5, 2023
Multi-Tenant Ingress for a GKE-based Developer Platform
Christopher Batey · 5 Min Read

Learn how to implement a multi-tenant ingress for a GKE-based developer platform, enabling developers to expose services to the internet seamlessly. This post details a tried-and-tested architecture using Gateway API, Cert Manager, and Traefik to automate DNS, TLS, and load balancing.

GKE, Kubernetes (+5 more)
December 1, 2023
Mastering the Google Cloud Professional DevOps Exam: The Influence of Platform Engineering Excellence
Neofytos Zacharia · 8 Min Read

A comprehensive journey through the Google Cloud Professional DevOps Exam, exploring how CECG's platform engineering training and real-world project experience provided the practical foundation needed for certification success.

Google Cloud, DevOps (+5 more)
November 21, 2023
10 + 1 Things I wish I knew about operators before I wrote one
Christopher O’Quinn · 8 Min Read

A guide to writing your first Kubernetes Operator, covering 11 essential things to know before you start. This post offers practical advice on using the Operator SDK, handling reconciliation loops, managing state, and testing in isolation to save you time and effort.

Kubernetes, Operators (+5 more)
November 17, 2023
Upgrading Kubernetes: 8 years of production
Matt Burgess · 8 Min Read

Learn from 8 years of experience in upgrading multi-tenanted production Kubernetes clusters. This post details the challenges of keeping clusters up to date, from managing API deprecations to aligning with vendor support schedules, and provides a recommended upgrade timeline to ensure a smooth, business-as-usual process.

Kubernetes, Upgrades (+5 more)
November 8, 2023
Continuous Load: Why and What
Robert Moss · 8 Min Read

Learn how Continuous Load helps monitor network health proactively by running 24/7 network load across infrastructure, enabling teams to find and fix problems before users notice them.

Monitoring, Infrastructure (+4 more)
October 23, 2023
Exploring Multi-tenancy in Kubernetes: Benefits, Approaches, and Considerations
Neofytos Zacharia · 11 Min Read

Explore the benefits and challenges of multi-tenancy in Kubernetes, with a detailed comparison of different models like multiple clusters, multiple control planes, and shared control planes. This post dives into frameworks such as Vcluster, Kamaji, HNC, and Capsule to help you choose the right approach for your organization.

Kubernetes, Multi-tenancy (+5 more)
September 25, 2023
Security for an MVP Internal Developer Platform at a Retail Bank
Tiago Alves · 7 Min Read

Learn how to implement security for an MVP Internal Developer Platform in a retail bank, covering secrets management, access control, vulnerability scanning, and network isolation. This post details a pragmatic approach to building a secure, scalable, and compliant platform from the ground up.

Security, MVP (+5 more)
June 27, 2023
How to monitor an MVP Kubernetes-based Developer Platform with SLOs
Jingkai He · 7 Min Read

Learn how to monitor an MVP Kubernetes-based developer platform using SLOs and SLIs. This post outlines a structured approach to defining measurable reliability targets for the control plane, data plane, networking, and load balancing to ensure platform stability and tenant satisfaction.

Kubernetes, SLO (+5 more)
June 13, 2023
Why do we use ADRs?
Tomasz Bartosiewicz · 8 Min Read

Learn why we use ADRs and how they can help your team.

Architecture, Documentation (+1 more)
June 7, 2023
We are a Google Cloud Partner!
CECG Engineering · 2 Min Read

CECG has been appointed as a Google Cloud Partner, reaffirming our team's expertise in enterprise cloud transformations and GCP products after achieving required certifications and completing Google's onboarding process.

Google Cloud, Partnership (+3 more)
March 1, 2021
Using kind to test our Kubernetes Cassandra Operator
Sebastien Bonnet · 15 Min Read

How would you test a Kubernetes operator? We figured we would never be truly confident unless we ran the tests against a Kubernetes cluster using kind.

Kubernetes, Testing (+6 more)
September 25, 2020