ACME Protocol Deep Dive: Surviving the 90-Day Certificate Lifespan


Tim Henrich
April 08, 2026


The era of manual certificate management is officially coming to a close. With Google Chromium’s "Moving Forward, Together" initiative proposing a reduction in the maximum validity of public TLS certificates from 398 days to a mere 90 days, the industry is facing a massive operational shift.

When certificates expire every three months, manual generation, deployment, and tracking stop being operationally viable: every certificate in the fleet has to be reissued and redeployed at least four times a year. The Automated Certificate Management Environment (ACME) protocol is transitioning from a "nice-to-have" DevOps best practice into a mandatory survival tool.

In this deep dive, we will explore the inner workings of the ACME protocol, examine why enterprise teams are abandoning HTTP-01 challenges, provide real-world implementation code, and explain why automating issuance is only half the battle.


What is the ACME Protocol?

Defined in RFC 8555, the ACME protocol was originally developed by the Internet Security Research Group (ISRG) for Let’s Encrypt. It provides a standardized, automated mechanism for validating domain control, submitting Certificate Signing Requests (CSRs), and issuing X.509 certificates.

At its core, ACME is a client-server architecture operating over HTTPS. The ACME client (running on your server, load balancer, or Kubernetes cluster) communicates with the ACME server (operated by the Certificate Authority) using JSON Web Signatures (JWS) to ensure the integrity and authenticity of all requests.

The 7-Step Issuance Workflow

To truly master certificate automation, you need to understand the underlying handshake between your infrastructure and the Certificate Authority (CA). Here is the exact lifecycle of an ACME transaction:

  1. Account Creation: The ACME client generates a public/private key pair (the Account Key) and registers an account with the CA. This key is used to sign all subsequent requests.
  2. Order Creation: The client submits an "order" requesting a certificate for one or more specific domains (e.g., api.example.com).
  3. Challenge Issuance: The CA responds with an authorization for each requested domain. Each authorization offers a set of "challenges," and each challenge carries a random token. These challenges are tests the client must pass to prove control over the requested domain.
  4. Challenge Response: The client provisions the required challenge response (usually by creating a specific file on a web server or a specific DNS record) and notifies the CA that it is ready for validation.
  5. Validation: The CA reaches out across the public internet to verify the challenge. If it retrieves the expected response (the token bound to the client's account key), control of the domain is verified.
  6. CSR Submission: The client generates a new private key (the Certificate Key) and a Certificate Signing Request (CSR), sending the CSR to the CA.
  7. Issuance: The CA signs the CSR, issues the certificate, and provides a download URL. The client downloads the certificate and the CA's intermediate chain.
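The workflow above maps onto the order states that RFC 8555 (section 7.1.6) defines: pending, ready, processing, valid, and invalid. The following is a minimal sketch of that state machine, not a real client. The names OrderStatus, ALLOWED_TRANSITIONS, and next_action are illustrative, and the action descriptions are a paraphrase of the steps above.

```python
from enum import Enum


class OrderStatus(Enum):
    """Order states defined by RFC 8555, section 7.1.6."""
    PENDING = "pending"
    READY = "ready"
    PROCESSING = "processing"
    VALID = "valid"
    INVALID = "invalid"


# Transitions the CA may make; "valid" and "invalid" are terminal states.
ALLOWED_TRANSITIONS = {
    OrderStatus.PENDING: {OrderStatus.READY, OrderStatus.INVALID},
    OrderStatus.READY: {OrderStatus.PROCESSING, OrderStatus.INVALID},
    OrderStatus.PROCESSING: {OrderStatus.VALID, OrderStatus.INVALID},
    OrderStatus.VALID: set(),
    OrderStatus.INVALID: set(),
}

# What a client would typically do next in each state (illustrative wording).
NEXT_ACTION = {
    OrderStatus.PENDING: "complete the outstanding challenges (steps 3-5)",
    OrderStatus.READY: "submit the CSR to the order's finalize URL (step 6)",
    OrderStatus.PROCESSING: "poll the order until the CA finishes issuing",
    OrderStatus.VALID: "download the certificate chain (step 7)",
    OrderStatus.INVALID: "discard the order and create a new one",
}


def can_transition(current: OrderStatus, new: OrderStatus) -> bool:
    """Return True if the order may move from `current` to `new`."""
    return new in ALLOWED_TRANSITIONS[current]


def next_action(status_field: str) -> str:
    """Map the "status" field of an order object to the client's next step."""
    return NEXT_ACTION[OrderStatus(status_field)]


if __name__ == "__main__":
    for status in OrderStatus:
        print(f"{status.value:>10}: {next_action(status.value)}")
```

A client loop built on this would poll the order URL, read its status field, and dispatch on next_action until it reaches a terminal state.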

Choosing Your Challenge: Why DNS-01 is the Enterprise Standard

When the CA issues a challenge in Step 3, ACME clients typically choose between two primary validation methods: HTTP-01 and DNS-01. While HTTP-01 is the default for many tutorials, enterprise DevOps teams are overwhelmingly migrating to DNS-01. Here is why.

The HTTP-01 Challenge

The client proves control by serving a file at a specific HTTP endpoint: http://<domain>/.well-known/acme-challenge/<token>. The file contains the key authorization, which is the token joined with a thumbprint of the account key.

  • The Pros: Extremely easy to set up on a single monolithic web server.
  • The Cons: It requires port 80 to be open to the internet. It cannot be used to issue wildcard certificates (*.example.com). Furthermore, it fails completely in split-horizon DNS environments where internal servers need public certificates but cannot accept inbound connections from the public CA.
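To make the HTTP-01 response concrete, here is a minimal sketch of how the served content can be derived. It assumes the RFC 8555 key authorization format (token, a dot, then the RFC 7638 SHA-256 thumbprint of the account key). The function names, the token, and the JWK coordinates are placeholders, not real values.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used throughout ACME."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 thumbprint: SHA-256 over the required JWK members only,
    serialized with sorted keys and no whitespace."""
    required = {"EC": ("crv", "kty", "x", "y"), "RSA": ("e", "kty", "n")}[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in required},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def http01_response(token: str, account_jwk: dict) -> tuple:
    """Return (URL path, file body) for an HTTP-01 challenge."""
    key_authorization = f"{token}.{jwk_thumbprint(account_jwk)}"
    return f"/.well-known/acme-challenge/{token}", key_authorization


if __name__ == "__main__":
    # Placeholder public key and token, for illustration only.
    example_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
    path, body = http01_response("example-token", example_jwk)
    print(path)
    print(body)
```

Because the thumbprint ties the response to the account key, a third party who merely learns the token cannot answer the challenge on behalf of a different account.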

The DNS-01 Challenge (Recommended)

The client proves control by creating a specific DNS TXT record, _acme-challenge.<domain>, containing the base64url-encoded SHA-256 digest of the key authorization (the token joined with a thumbprint of the account key).

  • The Pros: It does not require exposing any web servers to the internet (perfect for internal microservices, databases, and VPN endpoints). It fully supports wildcard certificates. Because the CA queries the public DNS system rather than making an HTTP request to your server, it gracefully bypasses complex firewall rules and NAT configurations.
  • The Cons: It requires programmatic access to your DNS provider's API.
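The DNS-01 record can be sketched in a few lines. This assumes the RFC 8555 rules: the TXT value is the unpadded base64url encoding of the SHA-256 digest of the key authorization, and a wildcard name is validated at the base domain. The function names and the sample key authorization are illustrative.

```python
import base64
import hashlib


def dns01_record_name(domain: str) -> str:
    """Return the TXT record name for a domain.
    A wildcard such as *.example.com is validated at _acme-challenge.example.com."""
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}"


def dns01_txt_value(key_authorization: str) -> str:
    """Return the TXT record value: base64url(SHA-256(key authorization)), no padding."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    # Placeholder key authorization: "<token>.<account key thumbprint>".
    sample = "example-token.example-thumbprint"
    print(dns01_record_name("*.example.com"), "TXT", dns01_txt_value(sample))
```

Note that a certificate covering both example.com and *.example.com needs two TXT values on the same record name, one per authorization.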

The TLS-ALPN-01 Challenge

A third option, TLS-ALPN-01, proves control via a TLS handshake using Application-Layer Protocol Negotiation. While it avoids the port 80 requirement of HTTP-01 (using port 443 instead), it is complex to implement and requires terminating TLS directly at the ACME client. It is generally reserved for specialized load balancers and ingress controllers.


Implementing ACME in the Real World

Let's look at how to implement ACME using the DNS-01 challenge in two common enterprise scenarios: a traditional Linux VM and a Kubernetes cluster.

Scenario 1: Traditional VM using Certbot and AWS Route53

Certbot is the EFF's standard ACME client. To use the DNS-01 challenge with AWS Route53, you need to install Certbot and its Route53 plugin.

# Install Certbot and the Route53 DNS plugin
sudo apt-get update
sudo apt-get install certbot python3-certbot-dns-route53

# Ensure your AWS credentials are secure and scoped strictly to Route53
export AWS_ACCESS_KEY_ID="your_scoped_access_key"
export AWS_SECRET_ACCESS_KEY="your_scoped_secret_key"

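# Request the certificate using DNS-01.
# sudo resets the environment by default, so the AWS variables are passed through explicitly.
# A wildcard does not cover the bare domain, and Let's Encrypt rejects names that are
# redundant with a wildcard in the same order, so request the apex plus the wildcard.
sudo --preserve-env=AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY certbot certonly \
  --dns-route53 \
  --email admin@example.com \
  --agree-tos \
  -d "example.com" \
  -d "*.example.com"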

Best Practice Note: Never give your ACME client global DNS administrator rights. Use scoped API tokens or IAM roles that are restricted by the principle of least privilege, allowing the client to only create and delete TXT records under the _acme-challenge prefix.
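As one possible shape for such a restriction on AWS, the sketch below builds an IAM policy document that only allows TXT changes on the _acme-challenge names. It assumes the Route 53 condition keys route53:ChangeResourceRecordSetsNormalizedRecordNames and route53:ChangeResourceRecordSetsRecordTypes, plus the ListHostedZones and GetChange permissions the Certbot Route53 plugin documents. The hosted zone ID and helper names are hypothetical, and the result should be checked against current AWS documentation before use.

```python
import json


def acme_record_names(domains):
    """Return the normalized (lowercase, no trailing dot) _acme-challenge names."""
    names = set()
    for domain in domains:
        base = domain[2:] if domain.startswith("*.") else domain
        names.add(f"_acme-challenge.{base}".lower().rstrip("."))
    return sorted(names)


def route53_acme_policy(hosted_zone_id, domains):
    """Build a least-privilege policy document for DNS-01 validation (a sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only calls the client needs to find the zone and poll changes.
                "Effect": "Allow",
                "Action": ["route53:ListHostedZones", "route53:GetChange"],
                "Resource": "*",
            },
            {
                # Record changes limited to TXT records on the challenge names.
                "Effect": "Allow",
                "Action": "route53:ChangeResourceRecordSets",
                "Resource": f"arn:aws:route53:::hostedzone/{hosted_zone_id}",
                "Condition": {
                    "ForAllValues:StringEquals": {
                        "route53:ChangeResourceRecordSetsNormalizedRecordNames":
                            acme_record_names(domains),
                        "route53:ChangeResourceRecordSetsRecordTypes": ["TXT"],
                    }
                },
            },
        ],
    }


if __name__ == "__main__":
    # "Z0000000EXAMPLE" is a placeholder hosted zone ID.
    policy = route53_acme_policy("Z0000000EXAMPLE", ["example.com", "*.example.com"])
    print(json.dumps(policy, indent=2))
```

Generating the policy from the same domain list that feeds the certificate request keeps the two from drifting apart.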

Scenario 2: Kubernetes using cert-manager

In cloud-native environments, cert-manager is the de facto standard. It runs as a Kubernetes controller and models issuers and certificates as custom resources (such as Issuer, ClusterIssuer, and Certificate), which are defined through Custom Resource Definitions (CRDs).

Here is an example of a ClusterIssuer configured to use Let's Encrypt with a Cloudflare DNS-01 challenge:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - dns01:
        cloudflare:
          email: admin@example.com
          apiTokenSecretRef:
            name: cloudflare-api-token-secret
            key: api-token

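Once applied, any Kubernetes Ingress resource can request a certificate by adding the cert-manager.io/cluster-issuer: letsencrypt-prod annotation and a tls section that names a Secret. cert-manager then handles the entire ACME workflow and stores the resulting certificate in that Kubernetes Secret, where the ingress controller picks it up. cert-manager does not mount the certificate into your pods; workloads that terminate TLS themselves must mount the Secret explicitly.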


The "Automation is Not Deployment" Trap

One of the most dangerous assumptions in DevOps is believing that a successful ACME renewal means your infrastructure is secure.

Consider the outages experienced by companies like Epic Games in 2021 and Starlink in 2023. These outages weren't caused by a lack of access to certificates; they were caused by expired certificates actively serving traffic.

Silent automation failures happen when the ACME client successfully downloads the new certificate to the disk, but the web server (Nginx, Apache, HAProxy) is never instructed to reload the new files into memory.

If you are using a standard CLI client like Certbot, you must utilize deployment hooks to ensure the application actually consumes the new certificate:

sudo certbot renew \
  --deploy-hook "systemctl reload nginx && systemctl restart postfix"

Modern web servers like Caddy and Traefik solve this by building the ACME client directly into the web server itself, dynamically provisioning and loading certificates into memory on-the-fly without external cron jobs or reload scripts.
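Whichever approach is used, one way to catch a missed reload is to inspect the certificate the server actually presents rather than the file on disk. The sketch below uses only the Python standard library. The function names and the 20-day threshold are assumptions chosen for a 90-day certificate renewed with 30 days remaining, and the host is whatever endpoint you monitor.

```python
import socket
import ssl
import time


def served_cert_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate the server presents.
    With the default context, an already expired certificate makes the
    handshake raise ssl.SSLCertVerificationError, which is itself an alert."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_remaining(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a notAfter string
    such as "Jan  5 09:34:43 2018 GMT"."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def is_stale(not_after: str, now: float, min_days: float = 20.0) -> bool:
    """True if the served certificate is closer to expiry than a healthy
    renewal-and-reload cycle should ever allow (threshold is an assumption)."""
    return days_remaining(not_after, now) < min_days


def check_host(host: str) -> bool:
    """Connect to `host` and report whether its served certificate looks stale."""
    return is_stale(served_cert_not_after(host), time.time())
```

Running check_host from a scheduler or monitoring agent after each renewal window gives an alert that is independent of whether the ACME client itself reported success.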


Modern ACME Developments (2024-2025)

The ACME protocol is actively evolving to handle the scale of the modern internet. Two major developments are currently reshaping how clients and servers interact.

1. ACME Renewal Information (ARI)

Historically, ACME clients guessed when to renew a certificate—usually hardcoded to trigger when a certificate had 30 days of validity remaining.

With the introduction of ACME Renewal Information (ARI), standardized as RFC 9773, the CA can signal to the client a suggested window in which to renew. This allows CAs like Let's Encrypt to smooth out traffic spikes on their infrastructure. More importantly, in the event of a mass-revocation incident (where a CA must revoke millions of certificates due to a compliance bug), the CA can move the suggested windows earlier so that clients replace affected certificates before they are revoked, and it can stagger those windows to prevent a "thundering herd" that could take down the CA's API.
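On the client side, handling ARI can be sketched as follows. This assumes the renewal information response is a JSON object with a suggestedWindow containing RFC 3339 start and end timestamps, and that the client picks a uniformly random moment inside that window. The function names and timestamps are illustrative, and fetching the response from the CA is left out.

```python
import random
from datetime import datetime, timedelta, timezone


def parse_rfc3339(timestamp: str) -> datetime:
    """Parse a timestamp such as 2025-01-02T04:00:00Z into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def pick_renewal_time(renewal_info: dict, rng=random) -> datetime:
    """Choose a uniformly random moment inside the CA's suggested window,
    so that many clients with the same window do not renew simultaneously."""
    window = renewal_info["suggestedWindow"]
    start = parse_rfc3339(window["start"])
    end = parse_rfc3339(window["end"])
    span_seconds = (end - start).total_seconds()
    return start + timedelta(seconds=rng.uniform(0, span_seconds))


def should_renew_now(picked: datetime, now: datetime) -> bool:
    """Renew immediately once the chosen moment has arrived or passed,
    which also covers windows the CA has moved into the past."""
    return picked <= now


if __name__ == "__main__":
    # Illustrative response body; real values come from the CA.
    info = {"suggestedWindow": {"start": "2025-01-02T04:00:00Z",
                                "end": "2025-01-03T04:00:00Z"}}
    picked = pick_renewal_time(info)
    print(picked.isoformat(), should_renew_now(picked, datetime.now(timezone.utc)))
```

A client would typically re-fetch the renewal information periodically, since the CA may shift the window after the certificate is issued.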

2. Post-Quantum Cryptography (PQC) Integration

Following NIST's finalization of its first Post-Quantum Cryptography standards (FIPS 203, 204, and 205) in August 2024, CAs and standards bodies are experimenting with certificates that combine classical signature algorithms (RSA/ECDSA) with quantum-resistant ones (like ML-DSA). Because PQC keys and signatures are significantly larger, ACME clients and servers will need to handle larger CSRs, certificates, and chains, along with the bandwidth and processing cost that implies.


Best Practices for Bulletproof Certificate Management

To survive the transition to 90-day certificates, your ACME implementation must be resilient. Follow these core best practices:

1. Always Test in Staging

Public CAs enforce strict rate limits. If your CI/CD pipeline repeatedly requests real certificates during automated testing, you can exhaust the per-domain quota and be unable to issue for that domain for up to a week. Always configure your ACME clients to use the CA's staging environment for testing. For Let's Encrypt, this is https://acme-staging-v02.api.letsencrypt.org/directory.
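One way to make staging the safe default is to resolve the directory URL in a single place and require an explicit opt-in for production. The sketch below is illustrative: the ACME_ENV variable and the function name are assumptions, and the two URLs are the Let's Encrypt endpoints mentioned in this article.

```python
import os

LETS_ENCRYPT_DIRECTORIES = {
    "production": "https://acme-v02.api.letsencrypt.org/directory",
    "staging": "https://acme-staging-v02.api.letsencrypt.org/directory",
}


def acme_directory_url(environment=None) -> str:
    """Return the ACME directory URL for the given environment.
    Falls back to the hypothetical ACME_ENV variable, then to staging,
    so a misconfigured pipeline never hits production rate limits."""
    name = (environment or os.environ.get("ACME_ENV", "staging")).strip().lower()
    if name not in LETS_ENCRYPT_DIRECTORIES:
        raise ValueError(f"unknown ACME environment: {name!r}")
    return LETS_ENCRYPT_DIRECTORIES[name]


if __name__ == "__main__":
    print(acme_directory_url("staging"))
```

Staging certificates chain to an untrusted root, so they are suitable for exercising the workflow but must never be deployed to real users.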

2. Enforce CAA Records

Certificate Authority Authorization (CAA) DNS records restrict which CAs are allowed to issue certificates for your domain.
