Discover insights, tutorials, and thoughts from our community. Stay updated with the latest trends in platform engineering, DevOps, and software development.

Learn how to serve large language models across multiple Kubernetes nodes using LeaderWorkerSet (LWS), a Kubernetes SIG project, and vLLM. This guide covers the challenges of multi-node inference, the architecture of LeaderWorkerSet, and practical tips for deployment, observability, and efficient model loading.

Explore the journey of migrating a high-traffic ad decision server from Cloud Run to GKE Autopilot. This post details the performance challenges with serverless, the benefits of a VM-based solution, and why GKE Autopilot became the ideal middle ground for scalability, cost-efficiency, and manageability.

Learn how to monitor an MVP Kubernetes-based developer platform using SLOs and SLIs. This post outlines a structured approach to defining measurable reliability targets for the control plane, data plane, networking, and load balancing to ensure platform stability and tenant satisfaction.