GitOps for Certificate Lifecycle Management: Surviving the 90-Day TLS Mandate
The intersection of GitOps and Certificate Lifecycle Management (CLM) has officially transitioned from an optional architectural best practice to an absolute necessity.
Driven by the exponential growth of machine identities—which now outnumber human identities by a staggering 45:1 ratio—and the looming reality of shortened certificate lifespans, manual certificate management is no longer viable. When Google's Chrome Root Program announced its intention to reduce the maximum validity of public TLS certificates from 398 days to just 90 days, it sent a clear message to the industry: automate, or face continuous outages.
Coupled with the need for "crypto-agility" ahead of Post-Quantum Cryptography (PQC) standards recently finalized by NIST, DevOps and security teams must rethink how they provision, deploy, and monitor trust. GitOps provides the declarative, automated, and highly auditable framework required to manage certificates at cloud-native scale.
In this comprehensive guide, we will explore why legacy CLM fails, how to architect a GitOps-driven certificate pipeline, and how to implement zero-touch provisioning using industry-standard tools like ArgoCD and cert-manager.
The Ticking Clock: Why Manual CLM is Obsolete
High-profile outages caused by expired certificates continue to plague the industry. In recent years, massive disruptions at Starlink, Cisco, and Microsoft (affecting Teams and Exchange) have demonstrated that even tech giants are vulnerable to expiration incidents. With Gartner estimating the average cost of IT downtime at $300,000 per hour, a forgotten TLS certificate is a multi-million dollar liability.
Legacy CLM workflows typically involve IT ticketing systems, manual generation of Certificate Signing Requests (CSRs), copy-pasting private keys, and manual deployment to load balancers or ingress controllers. This approach suffers from three fatal flaws:
- Human Error and Configuration Drift: Certificates deployed manually often drift from their documented state. A renewed certificate might be applied to the primary load balancer but forgotten on the failover unit.
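- Inability to Scale: When public certificates expire every 90 days, a company with just 100 public-facing services must process more than one renewal per day on average (100 certificates / 90 days ≈ 1.1 per day), and more still if each certificate is renewed a month before it expires.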
- Lack of Crypto-Agility: If a Certificate Authority (CA) is compromised, or when the time comes to migrate to quantum-resistant algorithms, manually replacing thousands of certificates across distributed infrastructure is an impossible task.
The GitOps Paradigm Shift
GitOps is an operational framework that takes DevOps best practices used for application development—version control, collaboration, compliance, and CI/CD—and applies them to infrastructure automation.
In a GitOps CLM model, the Git repository becomes the single source of truth for your certificate infrastructure. Instead of manually interacting with a CA or a Kubernetes cluster, engineers declare the desired state of their certificates in Git. Software agents running inside the cluster continuously monitor this repository, automatically pulling changes and reconciling the live state of the infrastructure to match the desired state.
Core Architecture: Separation of Concerns
A successful GitOps CLM implementation relies on a strict separation of concerns to maintain security:
- Git holds the Request: The Git repository contains YAML manifests defining what the certificate should look like (domains, issuer, algorithms). It never contains the private key.
- The Cluster holds the Key: The Kubernetes cluster generates the private key internally and stores it securely in a native Secret construct.
- The CA holds the Trust: An external or internal Certificate Authority (e.g., Let's Encrypt, HashiCorp Vault, Venafi) verifies the request and signs the certificate.
[ Git Repository ] ---> (ArgoCD/Flux detects commit) ---> [ Kubernetes Cluster ]
  - Certificate.yaml                                        - Generates Private Key
  - Issuer.yaml                                             - Creates CSR
                                                                   |
                                                                   v
[ Certificate Authority ] <--- (cert-manager sends CSR) -----------+
  - Validates Request
  - Signs Certificate
  - Returns Signed Cert to Cluster
Implementing GitOps CLM: A Technical Tutorial
To demonstrate this workflow, we will build a declarative CLM pipeline using cert-manager (a CNCF project and the de facto standard for Kubernetes certificate management), ArgoCD (a popular GitOps continuous delivery tool), and Let's Encrypt.
Step 1: Define the ClusterIssuer
First, we need to tell our cluster how to communicate with our Certificate Authority. We do this by defining a ClusterIssuer. Instead of applying this directly via kubectl, we commit this file to our Git repository.
# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: security@yourdomain.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Enable HTTP-01 challenges
    solvers:
      - http01:
          ingress:
            class: nginx
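Because the production Let's Encrypt endpoint enforces rate limits, it can help to rehearse the pipeline against the staging endpoint first. The manifest below is a minimal sketch of such an issuer. The file name, the issuer name letsencrypt-staging, and the account-key Secret name are assumptions chosen for illustration; the staging endpoint issues certificates that browsers do not trust, so it is only suitable for testing.

```yaml
# cluster-issuer-staging.yaml (hypothetical file name)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Let's Encrypt staging endpoint: relaxed rate limits, untrusted test certificates
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: security@yourdomain.com
    # Separate account key so staging and production registrations stay independent
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
```

Pointing a Certificate's issuerRef at letsencrypt-staging while testing, then switching it to letsencrypt-prod in a later commit, keeps failed experiments from counting against the production limits.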
Step 2: Declare the Certificate
Next, developers define their certificate requirements alongside their application code. This is known as "Shift-Left PKI." The developer does not need to know how to generate a CSR; they only need to request the domains they need.
# app-certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-gateway-cert
  namespace: production
spec:
  # The secret name where cert-manager will store the signed cert and private key
  secretName: api-gateway-tls
  # Requested certificate validity (e.g., 90 days). This is a request only:
  # the CA decides the final lifetime, and Let's Encrypt issues 90-day certificates.
  duration: 2160h
  # How long before expiration cert-manager should automatically renew (e.g., 30 days)
  renewBefore: 720h
  subject:
    organizations:
      - MyCorp Inc.
  isCA: false
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
  dnsNames:
    - api.yourdomain.com
    - gateway.yourdomain.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
Step 3: GitOps Reconciliation with ArgoCD
Finally, we configure ArgoCD to monitor the Git repository containing these manifests. We create an ArgoCD Application resource.
# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-certificates
  namespace: argocd
spec:
  project: default
  source:
    repoURL: 'https://github.com/your-org/infrastructure-manifests.git'
    path: certs/production
    targetRevision: HEAD
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
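The manifests above assume cert-manager and its CRDs are already installed in the cluster. One way to keep that prerequisite under GitOps control as well is a second ArgoCD Application that installs the cert-manager Helm chart. This is a sketch under assumptions: the file name is hypothetical, the chart version is a placeholder you should replace with the release you have validated, and the crds.enabled value applies to recent chart versions (older charts use installCRDs instead).

```yaml
# argocd-cert-manager.yaml (hypothetical file name)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cert-manager
  namespace: argocd
spec:
  project: default
  source:
    repoURL: 'https://charts.jetstack.io'
    chart: cert-manager
    # Placeholder: pin the chart version you have validated
    targetRevision: v1.16.1
    helm:
      parameters:
        # Install the cert-manager CRDs together with the chart
        - name: crds.enabled
          value: 'true'
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: cert-manager
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      # Let ArgoCD create the cert-manager namespace if it does not exist
      - CreateNamespace=true
```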
The Automated Workflow in Action
Once you commit these files and push them to your main branch, the following automated sequence occurs entirely hands-off:
- Detection: ArgoCD detects the new commit in Git.
- Application: ArgoCD applies the ClusterIssuer and Certificate manifests to the Kubernetes cluster.
- Generation: cert-manager detects the new Certificate resource and generates a 2048-bit RSA private key inside the cluster. The key never leaves the cluster.
- Issuance: cert-manager creates a CSR and sends it to Let's Encrypt using the ACME protocol, automatically handling the HTTP-01 challenge via your NGINX ingress.
- Fulfillment: Let's Encrypt signs the certificate. cert-manager receives it and creates a Kubernetes Secret named api-gateway-tls containing both the signed certificate and the private key.
- Continuous Renewal: 30 days before the certificate expires, cert-manager automatically repeats the signing process and updates the Secret in place, so workloads pick up the new certificate without manual intervention. Whether it also generates a fresh private key depends on the Certificate's privateKey.rotationPolicy setting, whose default differs between cert-manager versions.
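Key rotation at renewal time is controlled by the privateKey.rotationPolicy field of the Certificate. Its default has changed across cert-manager releases, so if you rely on a fresh key at every renewal it is safer to state the policy explicitly. The excerpt below is a sketch that reuses the names from Step 2 and adds only the rotationPolicy line.

```yaml
# app-certificate.yaml (excerpt): only the rotationPolicy line is new
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-gateway-cert
  namespace: production
spec:
  secretName: api-gateway-tls
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
    # Always: generate a new key pair on every issuance and renewal
    # Never: keep reusing the key already stored in the Secret
    rotationPolicy: Always
  dnsNames:
    - api.yourdomain.com
    - gateway.yourdomain.com
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
```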
Addressing the Elephant: "Don't Put Secrets in Git"
The most common pushback against GitOps CLM is the valid security mandate: Never store private keys in Git.
It is crucial to understand the difference between declarative CLM and committing PEM files. In the workflow described above, no private key material ever touches the Git repository. Git only contains the instructions (custom resources such as Certificate and ClusterIssuer) on how to obtain the certificate. The actual cryptographic material is generated dynamically inside the secure boundary of the Kubernetes cluster.
If you have legacy systems that require you to inject existing, externally generated certificates into your cluster via GitOps, you must use secrets management tools. Projects like the External Secrets Operator (ESO) allow you to store the actual certificate in AWS Secrets Manager or Azure Key Vault. You then commit an ExternalSecret manifest to Git, which instructs the cluster to fetch the sensitive payload securely at runtime.
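As an illustration, an ExternalSecret for a TLS key pair held in AWS Secrets Manager might look like the sketch below. Everything specific here is an assumption: the resource names, the ClusterSecretStore called aws-secrets-manager (which would be defined separately), the remote key prod/legacy-app/tls, and its certificate and private_key properties. The apiVersion also depends on the ESO release you run.

```yaml
# external-tls-secret.yaml (hypothetical file name)
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: legacy-app-tls
  namespace: production
spec:
  # How often ESO re-reads the external store
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager   # hypothetical store, defined in its own manifest
    kind: ClusterSecretStore
  target:
    # Name of the Kubernetes Secret that ESO creates and keeps in sync
    name: legacy-app-tls
    template:
      type: kubernetes.io/tls
      data:
        tls.crt: "{{ .cert }}"
        tls.key: "{{ .key }}"
  data:
    - secretKey: cert
      remoteRef:
        key: prod/legacy-app/tls   # hypothetical secret name in AWS
        property: certificate      # hypothetical JSON property
    - secretKey: key
      remoteRef:
        key: prod/legacy-app/tls
        property: private_key
```

Only this pointer lives in Git. The certificate and key stay in the external store and are materialized as a Secret inside the cluster.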
Alternatively, tools like Sealed Secrets allow you to safely commit asymmetrically encrypted secrets to Git. Anyone can encrypt with the controller's public key, but only the controller inside the cluster holds the private key needed to decrypt them.
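A SealedSecret committed to Git has roughly the following shape. This is a sketch: the resource names are assumptions, and the encryptedData values are placeholders standing in for the ciphertext that the kubeseal CLI produces from a regular Secret.

```yaml
# sealed-tls-secret.yaml (hypothetical file name)
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: legacy-app-tls
  namespace: production
spec:
  encryptedData:
    # Placeholders: real values are long ciphertext strings generated by kubeseal
    tls.crt: "<ciphertext for the certificate>"
    tls.key: "<ciphertext for the private key>"
  template:
    # Shape of the Secret the controller creates after decrypting
    type: kubernetes.io/tls
    metadata:
      name: legacy-app-tls
      namespace: production
```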
Security, Compliance, and Crypto-Agility
Transitioning to a GitOps CLM approach provides massive benefits for security and compliance teams:
The Ultimate Audit Trail
Compliance frameworks like PCI-DSS v4.0 and the EU's NIS2 Directive mandate strict management and auditing of cryptographic assets. With GitOps, git log becomes your compliance report. You can instantly prove to auditors exactly who requested a certificate, who approved the pull request, when it was deployed, and what the exact configuration was.
Rapid Incident Response
Consider a scenario where an internal Certificate Authority is compromised. In a traditional environment, revoking and reissuing thousands of certificates could take weeks. In a GitOps environment, you update the issuerRef in your Git repository to point to a new, trusted CA. ArgoCD syncs the change on its next reconciliation, and cert-manager reissues every affected certificate, typically within minutes, limited mainly by how quickly the new CA can sign requests.
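If the certificates are managed through Kustomize, the issuer swap can be expressed as a single patch applied to every Certificate in an overlay. The sketch below assumes a hypothetical directory layout and a replacement ClusterIssuer named backup-ca-issuer that already exists in the cluster.

```yaml
# certs/production/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - app-certificate.yaml
patches:
  # JSON 6902 patch applied to every Certificate in this overlay
  - target:
      group: cert-manager.io
      version: v1
      kind: Certificate
    patch: |-
      - op: replace
        path: /spec/issuerRef/name
        value: backup-ca-issuer   # hypothetical replacement ClusterIssuer
```

Merging this one change is what triggers reissuance: cert-manager sees that each Certificate's issuerRef no longer matches the issuer that signed the current certificate and requests a new one.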
Post-Quantum Readiness
Now that NIST has finalized its first PQC standards (FIPS 203, 204, and 205), organizations will eventually need to swap their RSA and ECC certificates for quantum-resistant algorithms. GitOps enables "crypto-agility." Once your CA and cert-manager support a post-quantum signature algorithm, changing the privateKey.algorithm field in a Helm chart or Kustomize base and merging a single pull request is enough to roll the new algorithm out across your fleet.
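The mechanics can be rehearsed today with algorithms cert-manager already accepts. The sketch below is a Kustomize patch that moves every Certificate in an overlay from RSA 2048 to ECDSA P-384. The file layout is an assumption. A post-quantum migration would follow the same pattern with a different algorithm value once the tooling and the CA support one.

```yaml
# certs/production/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - app-certificate.yaml
patches:
  # Replace the key settings on every Certificate in this overlay
  - target:
      group: cert-manager.io
      version: v1
      kind: Certificate
    patch: |-
      - op: replace
        path: /spec/privateKey
        value:
          algorithm: ECDSA
          size: 384
```

Existing certificates keep working until cert-manager reissues them with the new key type, so the change can be merged and observed on a small overlay before it is applied to the shared base.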
Trust, but Verify: The Role of External Monitoring
While GitOps and tools like cert-manager provide incredibly robust automation, automation can—and eventually will—fail.
What happens if Let'