GitOps with Argo CD: The Reconciliation Loop That Survives 3 a.m.

Here’s the test I use for any deployment tooling:

It is 3 a.m. on a Sunday. PagerDuty just woke you up. A production service is degraded. You roll out of bed, open your laptop, and have to figure out what the cluster thinks is true, what’s actually true, and what changed in the last twelve hours. The faster you can answer those three questions, the better the tooling.

The right GitOps stack collapses all three questions into one dashboard. The wrong one has you SSH-hopping between five servers running kubectl rollout history against unlabeled deployments. I’ve done both. I’m writing this post about the former.

The stack: Argo CD 3.4 for delivery and reconciliation, Helm 4 for packaging, Git as the audit log. Plus a thoughtful secrets story, because secrets are where GitOps quietly betrays unprepared teams.

GitOps, briefly, in case we’re not aligned

GitOps is two things, neither of which is “use Git for deployment”:

  1. Git is the desired-state source of truth. Not the dashboard, not the cluster, not a wiki page someone updated four months ago. The YAML in a repository — and only the YAML in that repository — describes what the cluster is supposed to look like.
  2. A reconciliation loop in the cluster pulls that state and continuously makes the cluster match. If someone kubectl edits a deployment at 2 a.m. and changes an image tag by hand, the loop either reverts the change or alerts that the cluster has drifted. Either way, you find out about it.

That’s it. Everything else — Helm, Kustomize, ApplicationSets, image automation, secret operators — is plumbing around those two ideas.
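
If it helps to see those two ideas as code, here is a minimal sketch of one reconciliation pass. It is a toy model, not Argo CD's controller: the Cluster class, the diff and reconcile functions, and the "web" resource are all hypothetical names I made up for illustration, and the desired state is hard-coded where a real loop would read it from Git.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a live cluster: resource name -> spec."""
    live: dict = field(default_factory=dict)


def diff(desired: dict, live: dict) -> dict:
    """Return {name: (live_spec, desired_spec)} for every resource that differs."""
    names = desired.keys() | live.keys()
    return {
        n: (live.get(n), desired.get(n))
        for n in sorted(names)
        if live.get(n) != desired.get(n)
    }


def reconcile(desired: dict, cluster: Cluster, self_heal: bool = True) -> dict:
    """One pass of the loop: detect drift, then revert it or just report it."""
    drift = diff(desired, cluster.live)
    if drift and self_heal:
        # Git wins: replace the live state with a copy of the desired state.
        cluster.live = {name: dict(spec) for name, spec in desired.items()}
    return drift


# Desired state as it would be read from Git (hard-coded in this sketch).
desired = {"web": {"image": "web:1.4.2", "replicas": 3}}
cluster = Cluster(live={"web": {"image": "web:1.4.2", "replicas": 3}})

# Someone changes the image tag by hand at 2 a.m.
cluster.live["web"]["image"] = "web:debug"

drift = reconcile(desired, cluster)
print(drift)         # what drifted: (live, desired) per resource
print(cluster.live)  # matches Git again after the self-heal pass
```

Calling reconcile with self_heal=False gives the "alert only" behavior: the drift is reported and the cluster is left alone. Either way, the hand edit does not go unnoticed.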

Argo CD 3.4: the reconciliation engine I run

Argo CD’s pitch is simple: point it at a Git repository, point it at a cluster, and it makes the cluster look like the repository, continuously, forever. What that simple pitch hides is a UI and a controller architecture that quietly answers most of the questions you’ll ever ask about a Kubernetes deployment.

v3.4 GA’d in early May 2026 and brings a few things I’d been waiting for:

  • Pause Reconciliation per cluster. A single switch in the UI that tells the controller to stop trying to reconcile a specific target cluster while you’re investigating something there. The number of incidents I’ve extended by 30 minutes because Argo helpfully kept reverting my emergency debugging changes — that number can now be zero.
  • Progressive Sync for ApplicationSets. When you’re deploying the same Application across 80 clusters via an ApplicationSet, you almost never want all 80 to start syncing at the same instant. Progressive Sync lets you stage the rollout — canary cluster first, then a wave of 5, then everything else — with the blast radius of a single misconfigured chart capped at one cluster instead of the whole fleet.
  • PreDelete hooks. Cleanup logic that runs before an application is removed, declaratively, in the manifest. Replaces a category of “I forgot to run the database backup before uninstalling” incidents.
  • Faster repo-server. The 3.4 release ships meaningful performance work on the component that does the actual templating and manifest rendering. For shops with hundreds of Applications, the difference is visible.
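
The staging idea behind Progressive Sync is easy to picture as code. The sketch below is not the ApplicationSet API, just a toy model of the behavior described above: rollout_waves, progressive_sync, the cluster names, and the sync callback are all hypothetical. It splits a fleet into a canary, a wave of 5, and the remainder, and stops at the first wave that fails.

```python
from typing import Iterator


def rollout_waves(clusters: list[str], wave_sizes: list[int]) -> Iterator[list[str]]:
    """Yield clusters in stages: one wave per entry in wave_sizes, then the rest."""
    start = 0
    for size in wave_sizes:
        wave = clusters[start:start + size]
        if not wave:
            return
        yield wave
        start += size
    if start < len(clusters):
        yield clusters[start:]  # final wave: everything that is left


def progressive_sync(clusters, wave_sizes, sync) -> list[str]:
    """Sync wave by wave; stop after the first wave that contains a failure."""
    done = []
    for wave in rollout_waves(clusters, wave_sizes):
        results = {name: sync(name) for name in wave}
        done.extend(name for name, ok in results.items() if ok)
        if not all(results.values()):
            break  # later waves never start, which caps the blast radius
    return done


clusters = [f"cluster-{i:02d}" for i in range(80)]
waves = list(rollout_waves(clusters, wave_sizes=[1, 5]))
print([len(w) for w in waves])  # canary, wave of 5, remainder

# A chart that fails everywhere should only ever reach the canary.
touched = []


def broken_sync(name: str) -> bool:
    touched.append(name)
    return False


progressive_sync(clusters, [1, 5], broken_sync)
print(touched)
```

The point of the design is the break: a misconfigured chart is tried on the canary, fails, and the other 79 clusters never see it.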

But the feature that earns Argo CD its place isn’t in any release notes. It’s the resource graph view in the UI. Click into an Application, see every Kubernetes resource it manages, see their sync status and health in real time, click into any of them, see the live manifest, the desired manifest, and the diff between them side by side. At 3 a.m., that view collapses the answer to “what is the cluster actually doing right now?” into a single browser tab. Helm couldn’t do that. kubectl plus jq couldn’t do that. The Argo dashboard does.

Argo CD vs Flux vs Rancher Fleet — quick, honest

| Property | Argo CD | Flux v2 | Rancher Fleet |
|---|---|---|---|
| Control model | Centralized control plane | Decentralized, in-cluster controllers | Centralized, integrated with Rancher |
| UI | First-class, the best in category | Minimal (Weave GitOps add-on exists) | Integrated in Rancher UI |
| CNCF status | Graduated | Graduated | (Vendor-led, not CNCF) |
| Scale ceiling | Hundreds to low thousands of apps | High thousands | Designed for ~1M clusters |
| Image automation built-in | Via Argo CD Image Updater | First-class | Limited |
| SOPS / encrypted secrets | Via plugins | First-class | Via plugins |
| Helm support | Excellent | Excellent | Excellent |
| Multi-tenancy | AppProjects (good) | Tenants (good) | Workspaces (good, Rancher-flavored) |
| Best for | App delivery + visual ops | Pure CLI + maximum modularity | Fleet of RKE2 clusters at scale |

Honest pick:

  • For most teams, Argo CD’s UI advantage matters more than Flux’s modularity advantage. The reconciliation engines are both excellent; the visual tooling around them is the differentiator.
  • For GitOps purists who never want to leave the CLI and want to compose every primitive themselves, Flux is the more elegant answer. The team I respect most working in pure-CLI Kubernetes runs Flux for this reason.
  • For anyone already running Rancher on top of RKE2 and managing more than a handful of clusters, Fleet ships with Rancher, so it’s already on your servers and worth a serious look. It’s the right default if you’ve already bet on the Rancher ecosystem.

The stack I actually run

On my 3-node bare-metal RKE2 cluster:

  • Argo CD 3.4 as the cluster’s reconciliation controller. One Argo CD per cluster, not a centralized hub-and-spoke. The hub pattern is fine; the per-cluster pattern survives a network partition better.
  • App-of-Apps pattern. A single root Application points at a Git path that contains other Argo CD Application manifests. Bootstrap the cluster with one kubectl apply, and Argo brings up everything else.
  • Helm 4 for the upstream charts (cert-manager, the ingress controller, the Prometheus stack, ExternalDNS, NetBird when I’m self-hosting the coordination server).
  • Kustomize overlays for everything that’s mine — internal apps, custom CRDs, anything where I’d rather edit a strategic-merge patch than fight Helm’s templating.
  • Sealed Secrets (Bitnami) for secret material in Git. SOPS is the cooler choice; Sealed Secrets is the simpler one. I run the simpler one.
  • Argo Rollouts for canary deployments on the three or four services where a bad release would be visible. The rest deploy as vanilla RollingUpdate and that’s fine.
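
For the App-of-Apps bullet, the root Application is small enough to show. Here is a sketch that builds it as a Python dict and prints JSON, which kubectl accepts just like YAML. The field names follow the Argo CD Application resource; the repository URL, the apps path, and the name root are placeholders I picked for illustration, not my actual setup.

```python
import json


def root_application(repo_url: str, path: str, revision: str = "HEAD") -> dict:
    """Build the root Application of an App-of-Apps setup as a plain dict."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "root", "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Git path that holds the child Application manifests.
            "source": {
                "repoURL": repo_url,
                "targetRevision": revision,
                "path": path,
            },
            # The child Applications are created in the local cluster.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "argocd",
            },
            # Revert manual drift and remove children deleted from Git.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Placeholder repository URL and path: substitute your own.
manifest = root_application(
    "https://git.example.com/platform/cluster-config.git", "apps"
)
print(json.dumps(manifest, indent=2))
```

Saved as a script (say, a hypothetical root_app.py), the bootstrap becomes `python root_app.py | kubectl apply -f -`, and Argo CD takes it from there.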

The whole control surface is reachable through the cluster’s NetBird mesh — Argo’s UI is exposed internally only, no public ingress, no Cloudflare Tunnel, no “oops we forgot to enable RBAC” embarrassments.

What this stack does at 3 a.m.

When something breaks:

  1. Argo CD UI tells me the application is OutOfSync or Degraded. I see immediately what changed, by whom, when, diffed against the desired state.
  2. Git history tells me what changed in the desired state. Last commit, last PR, who approved it.
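  3. If the desired state is wrong, I revert the commit. Argo CD picks up the revert on its next refresh (within a minute with a Git webhook configured, within a few minutes on default polling) and reconciles. Production is back.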
  4. If the desired state is right but reality is wrong (a node crashed, a PVC got into ReadOnly mode, whatever), I pause reconciliation on the affected cluster (thank you, 3.4), fix the underlying problem, unpause.
  5. At no point do I need to remember which deployment was in which namespace, which Helm release name maps to which chart, or whether someone kubectl edit-ed something three weeks ago and forgot to update the repo.

That’s it. That’s the whole pitch. The tooling earns its place by making 3 a.m. survivable, not by being interesting at 10 a.m. when nothing is on fire.

Closing

GitOps isn’t a religion. It’s a reconciliation loop and a discipline about where state lives. Argo CD 3.4 is the rare piece of operations tooling that genuinely improves your worst day. Paired with Helm 4 for packaging and Git for the audit trail, it’s the stack I’d put on every production Kubernetes cluster I’m responsible for — and have.

The version numbers will move on. The architecture won’t.