Automate and Conquer: A GitOps Guide to Certificate Lifecycle Management

Tim Henrich
January 17, 2026

Certificate-related outages are the silent killers of uptime. They strike without warning, are often caused by simple human error, and can bring down critical services for hours. With the industry rapidly moving towards 90-day certificate lifespans, the old ways of managing TLS—spreadsheets, calendar reminders, and manual shell scripts—are no longer just inefficient; they are a direct threat to your business.

The culprit is complexity. Modern cloud-native environments, with their microservices, service meshes, and dynamic infrastructure, can have thousands of certificates. Manually tracking, renewing, and deploying each one is an impossible task. This is where GitOps comes in.

By treating your certificate management process as code, you can build a fully automated, auditable, and scalable system that eliminates manual toil and prevents outages before they happen. This guide will walk you through the principles, tools, and a step-by-step implementation for mastering certificate lifecycle management (CLM) with a GitOps approach.

Why Traditional Certificate Management Fails at Scale

For years, organizations have struggled with certificate management. A 2023 report from Keyfactor found that 73% of organizations still experience unexpected outages due to expired certificates. The problem is poised to get much worse.

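Certificate lifetimes are shrinking. Google proposed a 90-day maximum validity period for public TLS certificates in 2023, and in April 2025 the CA/Browser Forum voted to phase in shorter lifetimes: a 200-day maximum from March 2026, 100 days from March 2027, and 47 days from March 2029. Compared with the long-standing 398-day limit, renewals will eventually happen roughly eight times as often. Any process that relies on a human remembering to perform a task is doomed to fail.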

The common pain points include:

  • Operational Toil: Developers file tickets and wait days for a security team to provision a certificate, slowing down release cycles.
  • Lack of Visibility: "Certificate sprawl" makes it impossible to maintain an accurate inventory. Where are all our certificates? Who owns them? When do they expire? This lack of a central source of truth is a primary cause of unexpected expirations. While a GitOps approach creates an inventory for your Kubernetes assets, a comprehensive solution like Expiring.at provides a single pane of glass across your entire organization, covering both modern and legacy infrastructure.
  • Inconsistent Configurations: Without enforced standards, different teams might request certificates with weak key algorithms or insecure settings, creating security gaps.
  • Security Risks: Private keys are often handled manually, passed around in emails or chat messages, or stored insecurely, dramatically increasing the attack surface.

GitOps directly addresses these challenges by applying battle-tested software development practices—version control, peer review, and automation—to infrastructure and security management.

The GitOps Solution: Declarative, Automated, and Auditable CLM

GitOps is a paradigm for managing infrastructure and applications where Git is the single source of truth. The desired state of your system is declared in a Git repository, and an automated agent, or "controller," ensures the live environment matches that state.

When applied to certificate management, this model provides powerful benefits:

  1. A Declarative Inventory: Your Git repository becomes a complete, real-time inventory of every certificate you manage. A simple search can tell you exactly which certificates are deployed, what domains they cover, and which Certificate Authority (CA) issued them.
  2. Automated End-to-End Lifecycle: The entire process—from requesting a new certificate to handling renewal and even revocation—is automated. No manual intervention is required once the initial configuration is defined in Git.
  3. Auditability and Compliance: Every change to a certificate's configuration is a Git commit. This gives you a complete, immutable audit log. Who requested a wildcard certificate? When was the issuer changed? The git log has the answers.
  4. Developer Self-Service: Developers can request certificates for new services simply by adding a YAML file to their application's code and submitting a pull request. The security and operations teams become reviewers and approvers, not manual gatekeepers.

A Practical Guide: Building Your GitOps CLM Workflow

Let's build a production-ready certificate management system using a popular and powerful toolset: Kubernetes, cert-manager, and the Argo CD GitOps controller.

Prerequisites

  • A running Kubernetes cluster.
  • kubectl configured to connect to your cluster.
  • A Git repository (e.g., on GitHub or GitLab) to store your configuration.
  • A registered domain name and access to its DNS provider (we'll use AWS Route 53 in this example).

Step 1: Install the Core Components

First, we need to install cert-manager, the de facto standard for certificate management in Kubernetes, and Argo CD, our GitOps controller. The best practice is to manage these tools themselves via GitOps, but for simplicity, we'll install Argo CD from its official manifest and cert-manager from its official Helm chart.

Install Argo CD:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Install cert-manager:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.4 \
  --set installCRDs=true
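
Since the tools themselves are best managed via GitOps, the Helm install above can also be expressed declaratively once Argo CD is running. The following is a minimal sketch, not a definitive setup: an Argo CD Application that pulls the same chart and version from the Jetstack repository. The file name, the Application name, and the sync options are assumptions chosen for illustration.

```yaml
# cert-manager-app.yaml (hypothetical file name)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cert-manager            # assumed name
  namespace: argocd
spec:
  project: default
  source:
    # Same Helm repository, chart, and version as the CLI install above
    repoURL: https://charts.jetstack.io
    chart: cert-manager
    targetRevision: v1.14.4
    helm:
      parameters:
      - name: installCRDs
        value: "true"
  destination:
    server: https://kubernetes.default.svc
    namespace: cert-manager
  syncPolicy:
    automated:
      prune: true               # assumed: remove resources deleted from the chart
      selfHeal: true            # assumed: revert manual drift in the cluster
    syncOptions:
    - CreateNamespace=true
```

With this in place, upgrading cert-manager becomes a reviewed change to targetRevision rather than a command run from someone's laptop.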

Step 2: Configure a Declarative Issuer in Git

An Issuer or ClusterIssuer is a cert-manager resource that represents a Certificate Authority. A ClusterIssuer is a global resource available to all namespaces, perfect for a production CA like Let's Encrypt.

We will configure a ClusterIssuer to use the ACME protocol with a DNS01 challenge. The DNS01 challenge is highly recommended for production because it doesn't require exposing your services to the internet during the validation process.

Create a file named cluster-issuer.yaml in your Git repository:

# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL for Let's Encrypt's production environment
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration and renewal notifications
    email: your-admin-email@example.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    # Configure the DNS01 challenge provider
    solvers:
    - dns01:
        route53:
          region: us-east-1
          # This role must have permissions to modify Route 53 records
          role: arn:aws:iam::ACCOUNT_ID:role/cert-manager-dns-solver

Commit this file to your Git repository. Now, configure Argo CD to sync this manifest to your cluster. You can do this by creating an Application resource.
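
A minimal sketch of such an Application might look like the following. The repository URL, the branch, and the path are placeholders rather than values from this guide, and the automated sync policy is one reasonable choice among several.

```yaml
# cluster-issuers-app.yaml (hypothetical file name)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cluster-issuers         # assumed name
  namespace: argocd
spec:
  project: default
  source:
    # Placeholder repository and path: point these at your own Git repo
    repoURL: https://github.com/your-org/your-gitops-repo.git
    targetRevision: main
    path: cluster/issuers       # directory containing cluster-issuer.yaml
  destination:
    server: https://kubernetes.default.svc
    # ClusterIssuer is cluster-scoped; the namespace only applies to
    # namespaced resources that may live in the same path
    namespace: cert-manager
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

Applying this one Application by hand (or from a parent "app of apps") is the last manual step; from then on, changes to the issuer flow through Git.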

Step 3: Request a Certificate Declaratively

With our issuer configured, requesting a certificate is as simple as defining a Certificate resource in Git. Let's say we want to secure an application running at app.your-domain.com.

Create a file named my-app-certificate.yaml in your application's configuration directory within your Git repo:

# my-app-certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls
  namespace: my-app-ns
spec:
  # The name of the Kubernetes Secret to store the certificate and private key
  secretName: my-app-tls-secret
  # The domain name(s) to include in the certificate
  dnsNames:
  - app.your-domain.com
  # Reference to the ClusterIssuer we created earlier
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer

Once you commit and push this file, the GitOps workflow kicks in:

  1. Argo CD detects the new Certificate manifest in Git and applies it to the my-app-ns namespace in your cluster.
  2. cert-manager's controller sees the new resource and begins the issuance process.
  3. It communicates with Let's Encrypt and creates the required TXT record in Route 53 to solve the DNS01 challenge, which lets Let's Encrypt validate ownership of the domain.
  4. Upon successful validation, Let's Encrypt issues the certificate.
  5. cert-manager creates a Kubernetes Secret named my-app-tls-secret containing the signed certificate (tls.crt) and the private key (tls.key).
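
For orientation, the Secret produced in the last step has roughly the shape sketched below. The values are elided placeholders, and the exact set of annotations can vary between cert-manager versions, so treat this as an approximation rather than exact output.

```yaml
# Approximate shape of the Secret written by cert-manager (values elided)
apiVersion: v1
kind: Secret
metadata:
  name: my-app-tls-secret
  namespace: my-app-ns
  annotations:
    # Back-references to the Certificate and issuer that produced this Secret
    cert-manager.io/certificate-name: my-app-tls
    cert-manager.io/issuer-kind: ClusterIssuer
    cert-manager.io/issuer-name: letsencrypt-prod
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate chain>
  tls.key: <base64-encoded private key>
```

Because this Secret holds the private key, it is generated inside the cluster and should not be committed to Git; only the Certificate resource that describes it lives in the repository.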

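Most importantly, cert-manager will now monitor this certificate. By default, it begins renewal once two-thirds of the certificate's lifetime has elapsed, which works out to 30 days before expiration for a 90-day Let's Encrypt certificate, so renewal no longer depends on anyone remembering to do it.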
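
The renewal window and key parameters can also be pinned explicitly in the Certificate rather than left to defaults. The sketch below extends the manifest from this step; the specific values (a 720-hour renewal window, ECDSA P-256 keys, key rotation on every renewal) are illustrative assumptions, not recommendations from this guide.

```yaml
# my-app-certificate.yaml with explicit renewal and key settings (illustrative values)
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls
  namespace: my-app-ns
spec:
  secretName: my-app-tls-secret
  dnsNames:
  - app.your-domain.com
  # Start renewal 30 days (720h) before expiry
  renewBefore: 720h
  privateKey:
    algorithm: ECDSA
    size: 256
    # Generate a fresh private key at every renewal
    rotationPolicy: Always
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
```

Making these settings explicit means a future change in certificate lifetime shows up as a visible, reviewable diff instead of a silent shift in default behavior.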

Step 4: Consume the Certificate in Your Application

The final step is to use the generated secret in your application. For a web service, this is typically done by referencing the secret in the tls section of an Ingress resource, so that the Ingress controller can serve the certificate.

Here is an example Ingress manifest:

# my-app-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: my-app-ns
spec:
  # Selects the ingress controller; replaces the deprecated
  # kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
  - hosts:
    - app.your-domain.com
    # Reference the secret created by cert-manager
    secretName: my-app-tls-secret
  rules:
  - host: app.your-domain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app-service
            port:
              number: 80

Commit this file to Git. Argo CD will apply it, and your Ingress controller will load the TLS certificate from the secret, enabling HTTPS for your application. The entire flow is now 100% automated and managed through Git.

Production-Ready Best Practices

A basic setup is a great start, but a production-grade system requires additional layers of security and policy.

Enforce Standards with Policy as Code

Use a policy engine like Kyverno or OPA Gatekeeper to enforce rules on Certificate resources before they are created. This prevents common misconfigurations.

Example Kyverno Policy: Disallow wildcard certificates and enforce a minimum key size.
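```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: enforce-certificate-standards
spec:
  validationFailureAction: Enforce
  rules:
  - name: disallow-wildcard-dnsnames
    match:
      any:
      - resources:
          kinds:
          - Certificate
    validate:
      # The original listing is cut off at this point; the rule body below
      # is an assumed completion based on the rule's name, not the author's text.
      message: "Wildcard DNS names are not allowed in Certificate resources."
      foreach:
      # Fall back to an empty list when spec.dnsNames is not set
      - list: "request.object.spec.dnsNames || `[]`"
        deny:
          conditions:
            any:
            # Reject any entry containing a wildcard character
            - key: "{{ contains(element, '*') }}"
              operator: Equals
              value: true
```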
