GitOps with Argo CD: The Reconciliation Loop That Survives 3 a.m.
Here’s the test I use for any deployment tooling:
It is 3 a.m. on a Sunday. PagerDuty just woke you up. A production service is degraded. You roll out of bed, open your laptop, and have to figure out what the cluster thinks is true, what’s actually true, and what changed in the last twelve hours. The faster you can answer those three questions, the better the tooling.
The right GitOps stack collapses all three questions into one
dashboard. The wrong one has you SSH-hopping between five servers
running kubectl rollout history against unlabeled deployments. I’ve
done both. I’m writing this post about the former.
The stack: Argo CD 3.4 for delivery and reconciliation, Helm 4 for packaging, Git as the audit log. Plus a thoughtful secrets story, because secrets are where GitOps quietly betrays unprepared teams.
GitOps, briefly, in case we’re not aligned
GitOps is two things, neither of which is “use Git for deployment”:
- Git is the desired-state source of truth. Not the dashboard, not the cluster, not a wiki page someone updated four months ago. The YAML in a repository — and only the YAML in that repository — describes what the cluster is supposed to look like.
- A reconciliation loop in the cluster pulls that state and continuously makes the cluster match. If someone kubectl edits a deployment at 2 a.m. and changes an image tag by hand, the loop either reverts the change or alerts that the cluster has drifted. Either way, you find out about it.
That’s it. Everything else — Helm, Kustomize, ApplicationSets, image automation, secret operators — is plumbing around those two ideas.
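The loop described above can be sketched in a few lines of Python. This is a toy model, not Argo CD's controller: plain dictionaries stand in for manifests, and the names `find_drift`, `reconcile`, and `self_heal`, along with the image tags, are assumptions invented for illustration.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, str]  # toy stand-in for a manifest: field name -> value


def find_drift(desired: Manifest, live: Manifest) -> List[Tuple[str, str, str]]:
    """Return (field, desired_value, live_value) for every field that differs."""
    drift = []
    for field in sorted(set(desired) | set(live)):
        want = desired.get(field, "<absent>")
        have = live.get(field, "<absent>")
        if want != have:
            drift.append((field, want, have))
    return drift


def reconcile(desired: Manifest, live: Manifest, self_heal: bool) -> Tuple[Manifest, List[str]]:
    """One pass of the loop: revert drift if self_heal is on, otherwise only report it."""
    events = []
    drift = find_drift(desired, live)
    for field, want, have in drift:
        events.append(f"drift: {field} is {have!r}, Git says {want!r}")
    if drift and self_heal:
        live = dict(desired)  # make the (toy) cluster match Git again
        events.append("reverted to desired state")
    return live, events


# Git pins the image; someone hand-edited the running deployment at 2 a.m.
desired = {"image": "shop/api:1.4.2", "replicas": "3"}
live = {"image": "shop/api:debug", "replicas": "3"}

healed, events = reconcile(desired, live, self_heal=True)   # revert the change
alerted, _ = reconcile(desired, live, self_heal=False)      # only report it
for line in events:
    print(line)
```

With `self_heal=True` the hand edit is reverted; with `self_heal=False` the live state is left alone and the drift is only reported. Those are the two outcomes the bullet describes.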
Argo CD 3.4: the reconciliation engine I run
Argo CD’s pitch is simple: point it at a Git repository, point it at a cluster, and it makes the cluster look like the repository, continuously, forever. What that simple pitch hides is a UI and a controller architecture that quietly answers most of the questions you’ll ever ask about a Kubernetes deployment.
v3.4 GA’d in early May 2026 and brings a few things I’d been waiting for:
- Pause Reconciliation per cluster. A single switch in the UI that tells the controller to stop trying to reconcile a specific target cluster while you’re investigating something there. The number of incidents I’ve extended by 30 minutes because Argo helpfully kept reverting my emergency debugging changes — that number can now be zero.
- Progressive Sync for ApplicationSets. When you’re deploying the same Application across 80 clusters via an ApplicationSet, you almost never want all 80 to start syncing at the same instant. Progressive Sync lets you stage the rollout — canary cluster first, then a wave of 5, then everything else — with the blast radius of a single misconfigured chart capped at one cluster instead of the whole fleet.
- PreDelete hooks. Cleanup logic that runs before an application is removed, declaratively, in the manifest. Replaces a category of “I forgot to run the database backup before uninstalling” incidents.
- Faster repo-server. The 3.4 release ships meaningful performance work on the component that does the actual templating and manifest rendering. For shops with hundreds of Applications, the difference is visible.
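The staging idea behind Progressive Sync (canary first, then a wave of 5, then everything else) amounts to partitioning the fleet into ordered waves and refusing to start a wave until the previous one has succeeded. The Python sketch below models only that idea. It is not the ApplicationSet API, which is configured declaratively, and `plan_waves`, `rollout`, and the cluster names are hypothetical.

```python
from typing import Callable, List


def plan_waves(clusters: List[str], canary: str, second_wave_size: int = 5) -> List[List[str]]:
    """Split clusters into ordered waves: the canary, a small wave, then the rest."""
    if canary not in clusters:
        raise ValueError(f"canary {canary!r} is not in the cluster list")
    remaining = [c for c in clusters if c != canary]
    waves = [[canary], remaining[:second_wave_size], remaining[second_wave_size:]]
    return [w for w in waves if w]  # drop empty waves for small fleets


def rollout(waves: List[List[str]], sync: Callable[[str], bool]) -> List[str]:
    """Sync wave by wave; stop after the first wave that contains a failure."""
    synced = []
    for wave in waves:
        results = {cluster: sync(cluster) for cluster in wave}
        synced.extend(c for c, ok in results.items() if ok)
        if not all(results.values()):
            break  # later waves never start
    return synced


fleet = [f"cluster-{i:02d}" for i in range(80)]
waves = plan_waves(fleet, canary="cluster-00")
print([len(w) for w in waves])

# A chart that fails everywhere only ever touches the canary wave.
attempted = []


def broken_sync(cluster: str) -> bool:
    attempted.append(cluster)
    return False


rollout(waves, broken_sync)
print(attempted)
```

In this model a chart that fails on the canary is attempted on exactly one cluster, which is the blast-radius cap the bullet describes.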
But the feature that earns Argo CD its place isn’t in any release notes. It’s the resource graph view in the UI. Click into an Application, see every Kubernetes resource it manages, see their sync status and health in real time, click into any of them, see the live manifest, the desired manifest, and the diff between them side by side. At 3 a.m., that view collapses the answer to “what is the cluster actually doing right now?” into a single browser tab. Helm couldn’t do that. kubectl plus jq couldn’t do that. The Argo dashboard does.
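At its core, the live-versus-desired view is a diff between two rendered manifests. As a rough illustration only, here is that comparison done with Python's standard `difflib`. The manifest text is made up, and this is not how Argo CD computes its diff; it only shows the kind of answer the view puts in front of you.

```python
import difflib

# What Git says the Deployment should look like (made-up example).
desired = """\
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: api
          image: shop/api:1.4.2
"""

# What the cluster is running after a hand edit.
live = desired.replace("shop/api:1.4.2", "shop/api:debug")

diff = list(
    difflib.unified_diff(
        desired.splitlines(),
        live.splitlines(),
        fromfile="desired (Git)",
        tofile="live (cluster)",
        lineterm="",
    )
)
print("\n".join(diff))

# Keep only the changed lines, skipping the two file-header lines.
changed = [
    line
    for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

The only changed lines are the two image lines, one removed and one added. At 3 a.m. that is the whole answer to what differs between Git and the cluster.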
Argo CD vs Flux vs Rancher Fleet — quick, honest
| Property | Argo CD | Flux v2 | Rancher Fleet |
|---|---|---|---|
| Control model | Centralized control plane | Decentralized, in-cluster controllers | Centralized, integrated with Rancher |
| UI | First-class, the best in category | Minimal (Weave GitOps add-on exists) | Integrated in Rancher UI |
| CNCF status | Graduated | Graduated | (Vendor-led, not CNCF) |
| Scale ceiling | Hundreds–low thousands of apps | High thousands | Designed for ~1M clusters |
| Image automation built-in | Via Argo CD Image Updater | First-class | Limited |
| SOPS / encrypted secrets | Via plugins | First-class | Via plugins |
| Helm support | Excellent | Excellent | Excellent |
| Multi-tenancy | AppProjects (good) | Tenants (good) | Workspaces (good, Rancher-flavored) |
| Best for | App delivery + visual ops | Pure CLI + maximum modularity | Fleet of RKE2 clusters at scale |
Honest pick:
- For most teams, Argo CD’s UI advantage matters more than Flux’s modularity advantage. The reconciliation engines are both excellent; the visual tooling around them is the differentiator.
- For GitOps purists who never want to leave the CLI and want to compose every primitive themselves, Flux is the more elegant answer. The team I respect most working in pure-CLI Kubernetes runs Flux for this reason.
- For anyone already on RKE2 managing more than a handful of clusters, Rancher Fleet is already on your servers and worth a serious look. It’s the right default if you’ve already bet on the Rancher ecosystem.
The stack I actually run
On my 3-node bare-metal RKE2 cluster:
- Argo CD 3.4 as the cluster’s reconciliation controller. One Argo CD per cluster, not a centralized hub-and-spoke. The hub pattern is fine; the per-cluster pattern survives a network partition better.
- App-of-Apps pattern. A single root Application points at a Git path that contains other Argo CD Application manifests. Bootstrap the cluster with one kubectl apply, and Argo brings up everything else.
- Helm 4 for the upstream charts (Cert-Manager, Ingress, Prometheus stack, External-DNS, NetBird when I’m self-hosting the coordination server).
- Kustomize overlays for everything that’s mine — internal apps, custom CRDs, anything where I’d rather edit a strategic-merge patch than fight Helm’s templating.
- Sealed Secrets (Bitnami) for secret material in Git. SOPS is the cooler choice; Sealed Secrets is the simpler one. I run the simpler one.
- Argo Rollouts for canary deployments on the three or four services where a bad release would be visible. The rest deploy as vanilla RollingUpdate and that’s fine.
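For the App-of-Apps bullet, the root Application is an ordinary Argo CD Application whose source path holds other Application manifests. The Python sketch below builds one as a dictionary and prints it as JSON, which kubectl accepts alongside YAML. The repository URL, the `apps` path, and the `root` name are placeholders, and the field names follow the `argoproj.io/v1alpha1` Application schema as I understand it, so check them against the documentation for your Argo CD version.

```python
import json


def root_application(repo_url: str, path: str, revision: str = "HEAD") -> dict:
    """Build a root Application that points at a directory of child Applications."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "root", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,         # placeholder repository
                "targetRevision": revision,
                "path": path,                # directory holding child Application manifests
            },
            "destination": {
                "server": "https://kubernetes.default.svc",  # the cluster Argo CD runs in
                "namespace": "argocd",
            },
            # Automated sync with pruning and self-healing, so drift gets reverted.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


manifest = root_application(
    "https://git.example.com/platform/cluster-config.git", "apps"
)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and applying it once is the single kubectl apply the bullet refers to. Everything under the `apps` path is then picked up by Argo CD.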
The whole control surface is reachable through the cluster’s NetBird mesh — Argo’s UI is exposed internally only, no public ingress, no Cloudflare Tunnel, no “oops we forgot to enable RBAC” embarrassments.
What this stack does at 3 a.m.
When something breaks:
- Argo CD UI tells me the application is OutOfSync or Degraded. I see immediately what changed, by whom, when, diffed against the desired state.
- Git history tells me what changed in the desired state. Last commit, last PR, who approved it.
- If the desired state is wrong, I revert the commit. Argo CD reconciles within a minute. Production is back.
- If the desired state is right but reality is wrong (a node crashed, a PVC got into ReadOnly mode, whatever), I pause reconciliation on the affected cluster (thank you, 3.4), fix the underlying problem, unpause.
- At no point do I need to remember which deployment was in which namespace, which Helm release name maps to which chart, or whether someone kubectl edit-ed something three weeks ago and forgot to update the repo.
That’s it. That’s the whole pitch. The tooling earns its place by making 3 a.m. survivable, not by being interesting at 10 a.m. when nothing is on fire.
Closing
GitOps isn’t a religion. It’s a reconciliation loop and a discipline about where state lives. Argo CD 3.4 is the rare piece of operations tooling that genuinely improves your worst day. Paired with Helm 4 for packaging and Git for the audit trail, it’s the stack I’d put on every production Kubernetes cluster I’m responsible for — and have.
The version numbers will move on. The architecture won’t.