The 90-Day Countdown: Automating Certificate Management Across Multi-Cloud Environments


Tim Henrich
April 21, 2026
6 min read


In April 2024, a major outage struck Cisco’s Duo multi-factor authentication service, locking users out of enterprise systems globally. The culprit? A single expired TLS certificate. Just months prior, a similar expired ground-station certificate caused a global outage for Starlink users.

According to Keyfactor’s 2024 State of Machine Identity Report, 81% of organizations have experienced at least one certificate-related outage in the past 24 months, with the average outage costing over $300,000 per hour.

Managing Public Key Infrastructure (PKI) and machine identities has always been a high-stakes game. But as modern infrastructure sprawls across AWS, Azure, Google Cloud Platform (GCP), and on-premises Kubernetes clusters, the volume of certificates is exploding. Machine identities now outnumber human identities by a staggering ratio of 45:1.

If your team is still relying on calendar reminders, spreadsheets, or siloed cloud dashboards to track SSL certificates, your infrastructure is living on borrowed time. In this comprehensive guide, we will explore the forcing functions reshaping certificate management in 2024-2025, why native cloud tools fall short, and how to architect a fully automated, crypto-agile certificate lifecycle across multi-cloud environments.


The Forcing Functions: Why 2024-2025 Changes Everything

Three major industry shifts are rendering manual certificate management untenable for enterprise-scale environments.

1. Google’s 90-Day TLS Validity Push

Through its "Moving Forward, Together" initiative, Google has formally proposed reducing the maximum validity of public TLS certificates from 398 days to just 90 days. While the CA/Browser Forum has not yet mandated this as a strict rule, the cybersecurity industry is treating it as an inevitability for late 2025 or 2026.

When certificates expire every 90 days, manual handling is no longer viable. You cannot manually generate Certificate Signing Requests (CSRs), complete validation, and deploy certificates across hundreds of load balancers and microservices four times a year without eventually making catastrophic errors. Automation via the ACME (Automated Certificate Management Environment) protocol is now a baseline survival requirement.
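Even before full ACME automation is in place, the renewal math can be monitored programmatically. Here is a minimal sketch in Python, using the timestamp format that the standard library's ssl.getpeercert() returns (the dates and the 30-day renewal window below are illustrative):

```python
import datetime
import ssl

def days_until_expiry(not_after, now=None):
    """Days remaining before a certificate expires.

    not_after uses the format returned by ssl.getpeercert(),
    e.g. 'Jan  1 00:00:00 2030 GMT'.
    """
    expires = datetime.datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=datetime.timezone.utc
    )
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return (expires - now).days

# With 90-day certificates and a renewal window opening at 30 days
# remaining, a renewal must succeed roughly every 60 days.
print(days_until_expiry(
    "Jan  1 00:00:00 2030 GMT",
    now=datetime.datetime(2029, 12, 2, tzinfo=datetime.timezone.utc),
))  # 30
```

At 90-day validity, a script like this running in a scheduler is the bare minimum; ACME clients do the same check internally before triggering a renewal.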

2. NIST Post-Quantum Cryptography (PQC) Standards

In August 2024, NIST finalized the first three Post-Quantum Cryptography standards (FIPS 203, 204, and 205). As quantum computing advances, traditional RSA and ECC algorithms will eventually be broken. Organizations must now build "crypto-agility"—the architectural ability to rapidly discover and swap out cryptographic algorithms across all environments without downtime. If you do not have a centralized, automated way to rotate certificates today, migrating to PQC standards tomorrow will be a multi-year nightmare.
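Crypto-agility begins with discovery: you cannot swap out an algorithm you cannot find. As a minimal illustration, an inventory filter in Python (the inventory format and the entries are assumptions for the example, not the output of any real scanner):

```python
# Classical public-key algorithms expected to fall to quantum computers.
QUANTUM_VULNERABLE = {"RSA", "ECDSA", "DSA", "Ed25519"}

def pqc_migration_candidates(inventory):
    """Return common names of certificates still using quantum-vulnerable algorithms.

    inventory: list of dicts such as {"cn": "api.example.com", "algorithm": "RSA"},
    aggregated from a certificate discovery scan (hypothetical format).
    """
    return sorted(c["cn"] for c in inventory if c["algorithm"] in QUANTUM_VULNERABLE)

certs = [
    {"cn": "api.example.com", "algorithm": "RSA"},
    {"cn": "pay.example.com", "algorithm": "ML-DSA-65"},  # FIPS 204 signature scheme
    {"cn": "db.example.com", "algorithm": "ECDSA"},
]
print(pqc_migration_candidates(certs))  # ['api.example.com', 'db.example.com']
```

The hard part is not the filter but populating the inventory, which is exactly why centralized, automated rotation today makes the PQC migration tractable tomorrow.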

3. DORA Compliance

Taking effect in the EU in January 2025, the Digital Operational Resilience Act (DORA) requires financial entities to maintain strict control over ICT risks. This explicitly includes the management of cryptographic keys and certificates to prevent operational disruptions. Failing to track and automate certificate renewals will soon be a regulatory violation, not just an operational risk.


The Multi-Cloud Illusion: Why Native Tools Aren't Enough

Most organizations start their cloud journey using native tools: AWS Certificate Manager (ACM), Azure Key Vault, or Google Cloud Certificate Authority Service (CAS).

The problem? These tools do not talk to each other.

If you run a frontend on AWS, a data lake on GCP, and legacy Active Directory workloads on Azure, relying on native tools creates massive visibility silos. Security teams lack a "single pane of glass," leading to shadow IT where developers provision their own certificates using non-compliant cryptographic standards. Furthermore, relying entirely on a specific cloud provider's PKI introduces severe vendor lock-in, making it incredibly difficult to migrate workloads or establish mutual TLS (mTLS) trust boundaries between different clouds.

To solve this, organizations must adopt a paradigm of Centralized Visibility, Decentralized Execution.

You need an agnostic layer that sits above the cloud providers. This is where dedicated external monitoring tools come into play. Using a platform like Expiring.at provides that crucial independent safety net. Even if your automation fails, or a developer manually deploys a rogue certificate outside your CI/CD pipeline, centralized tracking ensures you receive alerts before an expiration takes your services offline.
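The safety-net logic itself is simple: aggregate expiry dates from every provider into one place and alert on anything inside a threshold. A sketch of that aggregation in Python (the endpoint names, dates, and 30-day threshold are illustrative):

```python
import datetime

def expiring_soon(cert_expiries, threshold_days=30, now=None):
    """Return endpoints whose certificates expire within threshold_days.

    cert_expiries: mapping of endpoint -> expiry datetime (UTC), aggregated
    from ACM, Azure Key Vault, GCP CAS, and ad-hoc scans.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    cutoff = now + datetime.timedelta(days=threshold_days)
    return sorted(host for host, exp in cert_expiries.items() if exp <= cutoff)

now = datetime.datetime(2026, 4, 21, tzinfo=datetime.timezone.utc)
fleet = {
    "app.example.com": datetime.datetime(2026, 5, 1, tzinfo=datetime.timezone.utc),
    "api.example.com": datetime.datetime(2026, 9, 1, tzinfo=datetime.timezone.utc),
}
print(expiring_soon(fleet, now=now))  # ['app.example.com']
```

The value of an external platform is precisely that this aggregation covers certificates your pipelines never saw, including the manually deployed ones.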


Tutorial: Automating Multi-Cloud PKI with cert-manager and HashiCorp Vault

To achieve true multi-cloud certificate automation, DevOps teams are standardizing on HashiCorp Vault as the centralized internal Certificate Authority (CA), and cert-manager as the execution engine inside Kubernetes clusters.

Here is a practical, step-by-step guide to setting up this architecture.

Step 1: Configure HashiCorp Vault as your Internal Root CA

First, we need to enable the PKI secrets engine in Vault and generate a root certificate. This Vault instance should ideally be hosted centrally, accessible by all your multi-cloud clusters.

# Enable the PKI secrets engine
vault secrets enable pki

# Tune the engine to allow certificates up to 87600 hours (10 years for the Root CA)
vault secrets tune -max-lease-ttl=87600h pki

# Generate the Root CA
vault write -field=certificate pki/root/generate/internal \
     common_name="example.com Internal Root CA" \
     ttl=87600h > root_ca.crt

Next, configure a role that defines the parameters for the certificates cert-manager will be allowed to request:

# Create a role allowing subdomains of example.com, with a 30-day max TTL
vault write pki/roles/kubernetes-workloads \
     allowed_domains="example.com" \
     allow_subdomains=true \
     max_ttl="720h"

Step 2: Deploy and Configure cert-manager in Kubernetes

cert-manager acts as a universal abstraction layer. Developers simply request a certificate via a standard Kubernetes manifest, and cert-manager handles the complex API calls to Vault (or Let's Encrypt for public endpoints).

First, install cert-manager using Helm:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true

Next, create a ClusterIssuer resource. This tells cert-manager how to authenticate with your central HashiCorp Vault instance. (Note: the example below uses Vault's Kubernetes auth method with a token stored in a Secret; in production, prefer short-lived, bound service account tokens over long-lived static ones.)

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: vault-issuer
spec:
  vault:
    server: https://vault.central-ops.example.com:8200
    path: pki/sign/kubernetes-workloads
    auth:
      kubernetes:
        role: cert-manager
        mountPath: /v1/auth/kubernetes
        secretRef:
          name: issuer-token
          key: token

Step 3: Automate Certificate Provisioning for Workloads

Now, your developers don't need to know anything about Vault, CSRs, or PKI. When they deploy a new microservice (e.g., in AWS EKS or Azure AKS), they simply attach annotations to their Kubernetes Ingress resource.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: billing-service-ingress
  annotations:
    # This tells cert-manager to automatically provision a cert using Vault
    cert-manager.io/cluster-issuer: "vault-issuer"
spec:
  rules:
  - host: billing.internal.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: billing-service
            port:
              number: 80
  tls:
  - hosts:
    - billing.internal.example.com
    # cert-manager will automatically create this secret and keep the cert rotated
    secretName: billing-service-tls

With this architecture, certificates are automatically requested, provisioned, stored as Kubernetes secrets, and renewed before they expire—entirely invisible to the developer.
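One caveat: cert-manager rotates the Secret and the kubelet refreshes the mounted file, but a long-running process still has to notice the change and reload its TLS context. A minimal file-watch sketch (the class name and the stand-in file are illustrative; a real pod would watch its mounted tls.crt):

```python
import os
import tempfile

class CertWatcher:
    """Detect when the kubelet swaps in a renewed certificate file."""

    def __init__(self, cert_path):
        self.cert_path = cert_path
        self._mtime = os.stat(cert_path).st_mtime

    def changed(self):
        """Return True exactly once after the file's mtime moves."""
        mtime = os.stat(self.cert_path).st_mtime
        if mtime != self._mtime:
            self._mtime = mtime
            return True
        return False

# Demo with a stand-in file instead of a mounted Secret volume.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"-----BEGIN CERTIFICATE-----\n")
    path = f.name

watcher = CertWatcher(path)
print(watcher.changed())  # False: nothing rotated yet
os.utime(path, (0, os.stat(path).st_mtime + 60))  # simulate a renewal
print(watcher.changed())  # True: time to rebuild the SSLContext
os.remove(path)
```

Polling mtime on each connection, or on a short timer, is usually enough; some servers instead terminate TLS at the ingress so only the proxy has to reload.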


Implementing Service Mesh for Zero Trust (mTLS)

While the above method handles ingress traffic, multi-cloud Zero Trust Architecture (ZTA) requires continuous authentication of all resources. This means pod-to-pod communication must be secured via mutual TLS (mTLS).

Managing mTLS certificates manually at the application level is impractical at any real scale. Instead, deploy a service mesh like Istio across your clusters. Istio injects a sidecar proxy (Envoy) into every pod, and the Istio control plane automatically provisions highly ephemeral certificates (valid for just hours) to these proxies and rotates them continuously.

To enforce strict mTLS for every workload in the mesh, you apply a single PeerAuthentication policy in the Istio root namespace:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
