Episodes
-
Discover why standard Kubernetes StatefulSets might not be sufficient for your database workloads and how custom operators can provide better solutions for stateful applications.
Andrew Charlton, Staff Software Engineer at Timescale, explains how they replaced Kubernetes StatefulSets with a custom operator called Popper for their PostgreSQL Cloud Platform. He details the technical limitations they encountered with StatefulSets and how their custom approach provides more intelligent management of database clusters.
You will learn:
Why StatefulSets fall short for managing high-availability PostgreSQL clusters, particularly around pod ordering and volume management
How Timescale's instance matching approach solves complex reconciliation challenges when managing heterogeneous database workloads
The benefits of implementing discrete, idempotent actions rather than workflows in Kubernetes operators
Real-world examples of operations that became possible with their custom operator, including volume downsizing and availability zone consolidation
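The "discrete, idempotent actions" idea the episode covers can be sketched in a few lines of Go. This is a hypothetical illustration of the pattern, not Timescale's actual Popper code: each action checks observed state against desired state, performs at most one convergent step, and reports whether it changed anything, so re-running the whole loop is always safe.

```go
package main

import "fmt"

// Desired and observed state for a hypothetical database cluster.
// Field names are illustrative, not Popper's actual API.
type ClusterState struct {
	Replicas  int
	VolumeGiB int
}

// An Action performs one discrete, idempotent step toward the desired
// state and reports whether it changed anything.
type Action func(desired, observed *ClusterState) bool

func ensureReplicas(desired, observed *ClusterState) bool {
	if observed.Replicas == desired.Replicas {
		return false // already converged; safe to call again
	}
	observed.Replicas = desired.Replicas // stand-in for a real API call
	return true
}

func ensureVolumeSize(desired, observed *ClusterState) bool {
	if observed.VolumeGiB == desired.VolumeGiB {
		return false
	}
	observed.VolumeGiB = desired.VolumeGiB
	return true
}

// Reconcile runs every action until none reports a change, returning
// the number of steps taken. A second call performs zero steps.
func Reconcile(desired, observed *ClusterState, actions []Action) int {
	steps := 0
	for {
		changed := false
		for _, a := range actions {
			if a(desired, observed) {
				changed = true
				steps++
			}
		}
		if !changed {
			return steps
		}
	}
}

func main() {
	desired := &ClusterState{Replicas: 3, VolumeGiB: 100}
	observed := &ClusterState{Replicas: 1, VolumeGiB: 200}
	steps := Reconcile(desired, observed, []Action{ensureReplicas, ensureVolumeSize})
	fmt.Println(steps, *observed) // re-running Reconcile now performs zero steps
}
```

Because each action converges independently, there is no workflow state to persist or resume after a crash, which is the core advantage over workflow-style operators.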
Sponsor
This episode is brought to you by mirrord — run your local code as if it were in your Kubernetes cluster, without deploying first.
More info
Find all the links and info for this episode here: https://ku.bz/fhZ_pNXM3
Interested in sponsoring an episode? Learn more.
-
Curious about running AI models on Kubernetes without breaking the bank? This episode delivers practical insights from someone who's done it successfully at scale.
John McBride, VP of Infrastructure and AI Engineering at the Linux Foundation, shares how his team at OpenSauced built StarSearch, an AI feature that uses natural language processing to analyze GitHub contributions and provide insights through semantic queries. By using open-source models instead of commercial APIs, the team saved tens of thousands of dollars.
You will learn:
How to deploy vLLM on Kubernetes to serve open-source LLMs like Mistral and Llama, including configuration challenges with GPU drivers and DaemonSets
Why smaller models (7-14B parameters) can achieve 95% effectiveness for many tasks compared to larger commercial models, with proper prompt engineering
How running inference workloads on your own infrastructure with T4 GPUs can reduce costs from tens of thousands to just a couple thousand dollars monthly
Practical approaches to monitoring GPU workloads in production, including handling unpredictable failures and VRAM consumption issues
Sponsor
This episode is brought to you by StackGen! Don't let infrastructure block your teams. StackGen deterministically generates secure cloud infrastructure from any input: existing cloud environments, IaC, or application code.
More info
Find all the links and info for this episode here: https://ku.bz/wP6bTlrFs
Interested in sponsoring an episode? Learn more.
-
This episode examines how a default configuration in Cilium CNI led to silent packet drops in production after 8 months of stable operations.
Isala Piyarisi, Senior Software Engineer at WSO2, shares how his team discovered that Cilium's default Pod CIDR (10.0.0.0/8) was conflicting with their Azure Firewall subnet assignments, causing traffic disruptions in their staging environment.
You will learn:
How Cilium's default CIDR allocation can create routing conflicts with existing infrastructure
A methodical process for debugging network issues using packet tracing, routing table analysis, and firewall logs
The procedure for safely changing Pod CIDR ranges in production clusters
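The conflict described here is a plain CIDR overlap: Cilium's default cluster pool, 10.0.0.0/8, contains every other 10.x range. A minimal Go check (the 10.1.2.0/24 firewall subnet is an illustrative example, not WSO2's actual range) makes the failure mode concrete:

```go
package main

import (
	"fmt"
	"net"
)

// overlaps reports whether two CIDR ranges intersect. Two ranges
// intersect exactly when one contains the other's network address.
func overlaps(a, b string) bool {
	_, na, err := net.ParseCIDR(a)
	if err != nil {
		return false
	}
	_, nb, err := net.ParseCIDR(b)
	if err != nil {
		return false
	}
	return na.Contains(nb.IP) || nb.Contains(na.IP)
}

func main() {
	// Cilium's default cluster pool swallows any other 10.x subnet,
	// such as a hypothetical firewall subnet at 10.1.2.0/24.
	fmt.Println(overlaps("10.0.0.0/8", "10.1.2.0/24"))    // true
	fmt.Println(overlaps("172.20.0.0/16", "10.1.2.0/24")) // false
}
```

Running a check like this against every subnet your cloud provider hands out is a cheap way to catch the conflict before pods start black-holing traffic.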
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/kJjXQlmTw
Interested in sponsoring an episode? Learn more.
-
Managing microservices in Kubernetes at scale often leads to inconsistent deployments and maintenance overhead. This episode explores a practical solution that standardizes service deployments while maintaining team autonomy.
Calin Florescu discusses how a unified Helm chart approach can help platform teams support multiple development teams efficiently while maintaining consistent standards across services.
You will learn:
Why inconsistent Helm chart configurations across teams create maintenance challenges and slow down deployments
How to implement a unified Helm chart that balances standardization with flexibility through override functions
How to maintain quality through automated documentation and testing with tools like Helm Docs and Helm unittest
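Helm charts are Go templates under the hood, so the override pattern the episode describes can be sketched with Go's `text/template` plus a Sprig-style `default` function. This is an illustrative reimplementation of the idea, not the episode's actual chart: the unified chart supplies sane defaults, and a team's values file overrides only what it needs.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render executes a chart-like template against a team's values,
// with a "default" function mirroring the Sprig helper Helm charts
// use for override-with-fallback behavior.
func render(tmpl string, values map[string]any) (string, error) {
	funcs := template.FuncMap{
		// default returns fallback when the value is absent or empty.
		"default": func(fallback, value any) any {
			if value == nil || value == "" {
				return fallback
			}
			return value
		},
	}
	t, err := template.New("chart").Funcs(funcs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, values); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	tmpl := `replicas: {{ default 2 .replicas }}`
	out, _ := render(tmpl, map[string]any{}) // team overrides nothing
	fmt.Println(out)                         // falls back to the platform default
	out, _ = render(tmpl, map[string]any{"replicas": 5}) // team override wins
	fmt.Println(out)
}
```

The same shape scales up: the platform team owns the template and its defaults, while each service's values file stays a short list of deviations.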
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/mcPtH5395
Interested in sponsoring an episode? Learn more.
-
Learn how ByteDance manages computing resources at scale with custom Kubernetes scheduling solutions that handle millions of pods across thousands of nodes.
Yue Yin, Software Engineer at ByteDance, discusses their open-source Gödel scheduler and Katalyst resource management system. She explains how these tools address the challenges of managing online and offline workloads in large-scale Kubernetes deployments.
You will learn:
How Gödel's distributed architecture with dispatcher, scheduler, and binder components enables the scheduling of 5,000 pods per second
Why NUMA-aware scheduling and two-layer architecture are crucial for handling complex workloads at scale
How Katalyst provides node-level resource insights to enable efficient workload co-location and improve CPU utilization
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/lMpNng_33
Interested in sponsoring an episode? Learn more.
-
Platform Engineer Artem Lajko breaks down observability into three distinct layers and explains how tools like Prometheus, Grafana, and Falco serve different purposes. He also shares practical insights on implementing the right level of monitoring based on team requirements and capabilities.
You will learn:
How to implement the three-layer model (external, internal, and OS-level) and why each layer serves different stakeholders
How to choose and scale observability tools using a label-based approach (low, medium, high)
How to manage observability costs by collecting only relevant metrics and logs
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/9sGxhmm8s
Interested in sponsoring an episode? Learn more.
-
Stefan Roman shares his experience building Labs4Grabs, a platform that gives students root access to Kubernetes clusters. He discusses the journey from evaluating simple namespace-based isolation to implementing full VM-based isolation with KubeVirt.
You will learn:
Why namespace isolation isn't sufficient for untrusted users and the limitations of tools like vCluster when running privileged workloads.
How to use KubeVirt to achieve complete workload isolation and the trade-offs.
Practical approaches to implementing network security with NetworkPolicies and managing resource allocation across multiple student environments.
Follow Stefan's journey from simple to complex isolation strategies, focusing on the technical decisions and trade-offs he encountered.
Sponsor
This episode is sponsored by Kusari — gain complete visibility into your software components and secure your supply chain through comprehensive tracking and analysis.
More info
Find all the links and info for this episode here: https://ku.bz/Xz-TrmX2F
Interested in sponsoring an episode? Learn more.
-
Michael Levan explains how specialized teams and smart abstractions can lead to better outcomes. Drawing from cognitive science and his experience in platform engineering, Michael presents practical strategies for building effective engineering organizations.
You will learn:
Why specialized teams (or "silos") can improve productivity and why the real enemy is ego, not specialization.
How to use Internal Developer Platforms (IDPs) and abstractions to empower teams without requiring everyone to be a Kubernetes expert.
How to balance specialization and collaboration using platform engineering practices and smart abstractions.
Practical strategies for managing cognitive load in engineering teams and why not everyone needs to know YAML.
Sponsor
This episode is brought to you by Testkube — scale all of your tests with Kubernetes, integrate seamlessly with CI/CD and centralize test troubleshooting and reporting.
More info
Find all the links and info for this episode here: https://ku.bz/qlZPfM-zr
Interested in sponsoring an episode? Learn more.
-
Xe Iaso shares their journey in building a "compute as a faucet" home lab where infrastructure becomes invisible and tasks can be executed without manual intervention. The discussion covers everything from operating system selection to storage architecture and secure access patterns.
You will learn:
How to evaluate operating systems for your home lab — from Rocky Linux to Talos Linux, and why minimal, immutable operating systems are gaining traction.
How to implement a three-tier storage strategy combining Longhorn (replicated storage), NFS (bulk storage), and S3 (cloud storage) to handle different workload requirements.
How to secure your home lab with certificate-based authentication, WireGuard VPN, and proper DNS configuration while protecting your home IP address.
Sponsor
This episode is sponsored by Nutanix — innovate faster with a complete and open cloud-native stack for all your apps and data anywhere.
More info
Find all the links and info for this episode here: https://ku.bz/2kzj2MgfH
Interested in sponsoring an episode? Learn more.
-
If you're trying to make sense of when to use Kubernetes and when to avoid it, this episode offers a practical perspective based on real-world experience running production workloads.
Paul Butler, founder of Jamsocket, discusses how to identify necessary vs. unnecessary complexity in Kubernetes and explains how his team successfully runs production workloads by being selective about which features they use.
You will learn:
The three compelling reasons to use Kubernetes: managing multiple services across machines, defining infrastructure as code, and leveraging built-in redundancy.
Why to be cautious with features like CRDs, StatefulSets, and Helm, and how to evaluate whether you really need them.
How to stay on the "happy path" in Kubernetes by focusing on stable and simple resources like Deployments, Services, and ConfigMaps.
When to consider alternatives like Google Cloud Run for simpler deployments that don't need the full complexity of Kubernetes.
Sponsor
This episode is sponsored by Syntasso, the creators of Kratix, a framework for building composable internal developer platforms.
More info
Find all the links and info for this episode here: https://ku.bz/VB-0WYqtb
Interested in sponsoring an episode? Learn more.
-
This episode explores Admission Controllers and Webhooks with Gordon Myers, who shares his experience implementing webhook solutions in production. Gordon explains the lifecycle of Kubernetes API requests and how webhooks can intercept and modify resources before they are stored in etcd.
You will learn:
How the Kubernetes API processes requests through authentication, authorization, and Admission Controllers.
The difference between Validating and Mutating webhooks and how to implement them using JSON Patch.
Best practices for testing webhooks and avoiding common pitfalls that can break cluster deployments.
Real-world examples of webhook implementations, including injecting secrets from HashiCorp Vault into containers.
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/Dmn93dd7M
Interested in sponsoring an episode? Learn more.
-
Are you facing challenges with pre-production environments in Kubernetes?
This KubeFM episode shows how to implement efficient deployment previews and solve data seeding bottlenecks.
Nick Nikitas, Senior Platform Engineer at Blueground, shares how his team transformed their static pre-production environments into dynamic previews using Argo CD Application Sets, Wave, and Velero.
He explains their journey from managing informal environment sharing between teams to implementing a scalable preview system that reduced environment creation time from 19 minutes to 25 seconds.
You will learn:
How to implement GitOps-based preview environments with Argo CD Application Sets and PR generators for automatic environment creation and cleanup.
How to control cloud costs with TTL-based termination and FIFO queues to manage the number of active preview environments.
How to optimize data seeding using Velero, AWS EBS snapshots, and Kubernetes PVC management to achieve near-instant environment creation.
Sponsor
This episode is sponsored by Loft Labs — simplify Kubernetes with vCluster, the leading solution for Kubernetes multi-tenancy and cost savings.
More info
Find all the links and info for this episode here: https://ku.bz/tt4VFslxD
Interested in sponsoring an episode? Learn more.
-
Discover how a seemingly simple 502 error in Kubernetes can uncover complex interactions between Go and containerized environments.
Emin Laletović, a solution architect at Hybird Technologies, shares his experience debugging a production issue in which a specific API endpoint failed due to out-of-memory errors.
He walks through the systematic investigation process, from initial log checks to uncovering the root cause in Go's memory management within Kubernetes.
You will learn:
How Go's garbage collector interacts with Kubernetes resource limits, potentially leading to unexpected OOMKilled errors.
The importance of the GOMEMLIMIT environment variable in Go 1.19+ for managing memory usage in containerized environments.
Debugging techniques for memory-related issues in Kubernetes, including GODEBUG for garbage collector tracing.
Considerations for optimizing Go applications in Kubernetes, balancing performance and resource utilization.
Sponsor
This episode is sponsored by StormForge – Double your Kubernetes resource utilization and unburden developers from sizing complexity with the first HPA-compatible vertical pod rightsizing solution. Try it for free.
More info
Find all the links and info for this episode here: https://ku.bz/7fnF-tJ8R
Interested in sponsoring an episode? Learn more.
-
This episode offers a rare glimpse into the design decisions that shaped the world's most popular container orchestration platform.
Brian Grant, CTO of ConfigHub and former tech lead on Google's Borg team, discusses the Kubernetes Resource Model (KRM) and its profound impact on the Kubernetes ecosystem.
He explains how KRM's resource-centric API patterns enable Kubernetes' flexibility and extensibility and influence the entire cloud native landscape.
You will learn:
How the Kubernetes API evolved from inconsistency to a uniform structure, enabling support for thousands of resource types.
Why Kubernetes' self-describing resources and Server-side Apply simplify client implementations and configuration management.
The evolution of Kubernetes configuration tools like Helm, Kustomize, and GitOps solutions.
Current trends and future directions in Kubernetes configuration, including potential AI-driven enhancements.
Sponsor
This episode is sponsored by StormForge – Double your Kubernetes resource utilization and unburden developers from sizing complexity with the first HPA-compatible vertical pod rightsizing solution. Try it for free.
More info
Find all the links and info for this episode here: https://ku.bz/_ZLj6ZV-9
Interested in sponsoring an episode? Learn more.
-
Dive into the world of GitOps and compare two of the most popular tools in the CNCF landscape: Argo CD and Flux CD.
Andrei Kvapil, CEO and Founder of Aenix, breaks down the strengths and weaknesses of Argo CD and Flux CD, helping you understand which tool might best fit your team's needs.
You will learn:
The different philosophies behind the tools.
How they handle access control and deployment restrictions.
Their trade-offs in usability and conformance to infrastructure as code.
Why there is no one-size-fits-all in the GitOps world.
Sponsor
This episode is sponsored by DigitalOcean — learn how GPUs for DigitalOcean Kubernetes can enable your AI/ML workloads.
More info
Find all the links and info for this episode here: https://ku.bz/0mvh5s4Ld
Interested in sponsoring an episode? Learn more.
-
Eric Jalal, an independent consultant and Kubernetes developer, explains how Kubernetes is fundamentally built on familiar Linux features. He discusses why understanding Linux is crucial for working with Kubernetes and how this knowledge can simplify your approach to cloud-native technologies.
You will learn:
Why Eric considers Kubernetes to be "just Linux" and how it wraps existing Linux technologies.
The importance of understanding Linux fundamentals (file systems, networking, storage).
How Kubernetes provides a standard and consistent interface for managing Linux-based infrastructure.
Why learning Linux deeply can make Kubernetes adoption an incremental step rather than a giant leap.
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/-jCTfgqRC
Interested in sponsoring an episode? Learn more.
-
Alexandre Souza, a senior platform engineer at Getir, shares his expertise in managing large-scale environments and configuring requests, limits, and autoscaling.
He explores the challenges of over-provisioning and under-provisioning and discusses strategies for optimizing resource allocation using tools like Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA).
You will learn:
How to set appropriate resource requests and limits to balance application performance and cost-efficiency in large-scale Kubernetes environments.
Strategies for implementing and configuring Horizontal Pod Autoscaler (HPA), including scaling policies and behavior management.
The differences between CPU and memory management in Kubernetes and their impact on workload performance.
Techniques for leveraging tools like KubeCost and StormForge to automate resource optimization.
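The HPA's scaling decision, which the episode's configuration advice revolves around, follows one rule from the Kubernetes documentation: `desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)`. A tiny Go sketch makes the behavior easy to reason about:

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the HPA scaling rule from the Kubernetes
// docs: ceil(currentReplicas * currentMetric / targetMetric).
func desiredReplicas(current int, currentMetric, targetMetric float64) int {
	return int(math.Ceil(float64(current) * currentMetric / targetMetric))
}

func main() {
	// 4 replicas averaging 80% CPU against a 50% target scale out.
	fmt.Println(desiredReplicas(4, 80, 50))
	// Utilization at the target leaves the replica count unchanged.
	fmt.Println(desiredReplicas(4, 50, 50))
}
```

Because the ratio is relative to the *requested* CPU, a poorly chosen request skews every scaling decision, which is why the episode treats requests and autoscaling as one problem.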
Sponsor
This episode is sponsored by VictoriaMetrics — request a free trial of VictoriaMetrics Enterprise today.
More info
Find all the links and info for this episode here: https://ku.bz/z2Vj9PBYh
Interested in sponsoring an episode? Learn more.
-
In this KubeFM episode, Kensei Kanada discusses Tortoise, an open-source project he developed at Mercari to tackle Kubernetes resource optimization challenges. He explains the limitations of existing solutions like Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), and how Tortoise aims to provide a more comprehensive and automated approach to resource management in Kubernetes clusters.
You will learn:
The complexities of resource optimization in Kubernetes, including the challenges of managing HPA, VPA, and manual tuning of resource requests and limits
How Tortoise automates resource optimization by replacing HPA and VPA, reducing the need for manual intervention and continuous tuning
The technical implementation of Tortoise, including its use of Custom Resource Definitions (CRDs) and how it interacts with existing Kubernetes components
Strategies for adopting and migrating to new tools like Tortoise in a large-scale Kubernetes environment
Sponsor
This episode is sponsored by Learnk8s — estimate the perfect cluster node with the Kubernetes Instance Calculator.
More info
Find all the links and info for this episode here: https://ku.bz/bRd0243xQ
Interested in sponsoring an episode? Learn more.
-
In this KubeFM episode, Ángel Barrera discusses Adidas' strategic shift to a GitOps-based container platform management system, initiated in May 2022, and its impact on their global infrastructure.
You will learn:
The initial state and challenges: Understand the complexities and inefficiencies of Adidas' pre-GitOps infrastructure.
The transition process: Explore the steps and strategies used to migrate to a GitOps-based system, including tool changes and planning.
Technical advantages: Learn about the benefits of the pull mechanism, unified configuration, and improved visibility into cluster states.
Developer and business feedback: Gain insights into the feedback from developers and the business side, and how they were convinced to invest in the migration.
Sponsor
This episode is sponsored by ControlPlane — empower your Kubernetes deployments with ControlPlane Enterprise for Flux CD.
More info
Find all the links and info for this episode here: https://ku.bz/-5QbzQXJg
Interested in sponsoring an episode? Learn more.
-
In this KubeFM episode, Miguel Luna discusses the intricacies of Observability in Kubernetes, including its components, tools, and future trends.
You will learn:
The fundamental components of Observability: metrics, logs, and traces, and their roles in understanding system performance and health.
Key tools and projects: insights into Keptn and OpenTelemetry and their significance in the Observability ecosystem.
The integration of AI technologies: how AI is shaping the future of Observability in Kubernetes.
Practical steps for implementing Observability: starting points, what to monitor, and how to manage alerts effectively.
Sponsor
This episode is sponsored by Learnk8s — estimate the perfect cluster node with the Kubernetes Instance Calculator.
More info
Find all the links and info for this episode here: https://ku.bz/WwS04jYvv
Interested in sponsoring an episode? Learn more.