The 90-Day Countdown: Certificate-Based Attack Vectors and How to Mitigate Them
Public Key Infrastructure (PKI) and digital certificates are no longer just about securing web traffic behind a green padlock. In modern infrastructure, they are the foundational trust mechanism for everything: microservices, API gateways, IoT devices, and containerized workloads.
According to recent industry research, machine identities now outnumber human identities by a staggering ratio of 45:1 in enterprise environments. While security teams spend millions deploying Multi-Factor Authentication (MFA) and Identity and Access Management (IAM) for their human workforce, the cryptographic keys and certificates governing machine-to-machine communication are often left under-managed, improperly stored, or tracked in fragile spreadsheets.
Threat actors have noticed. Modern adversaries rarely attempt to break the underlying mathematics of RSA or Elliptic Curve Cryptography (ECC). Instead, they target the trust established by certificates.
With Google's proposal to reduce the maximum lifespan of public TLS certificates from 398 days to just 90 days, and the finalization of NIST's first Post-Quantum Cryptography (PQC) standards, the era of manual certificate management is effectively over. Let's break down the primary certificate-based attack vectors modern DevOps and security teams face, and the technical blueprints required to mitigate them.
Top Certificate-Based Attack Vectors
To defend your infrastructure, you must understand how threat actors are weaponizing PKI. The following vectors represent the most critical threats to machine identity trust.
1. Private Key Theft and Supply Chain Compromise
The most direct way to bypass encryption is to steal the keys that facilitate it. Attackers actively scan for private keys left in poorly secured environments, such as exposed GitHub repositories, misconfigured AWS S3 buckets, or compromised developer endpoints.
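The same scan attackers run can be turned on your own repositories first. The following is a minimal sketch, assuming keys are stored in PEM format; the function name `find_private_keys` and the size cap are illustrative choices, and a dedicated secret scanner will catch far more than this regex does.

```python
import re
from pathlib import Path

# Matches PEM headers such as "-----BEGIN PRIVATE KEY-----" (PKCS#8) and the
# RSA, EC, ENCRYPTED, and OPENSSH variants of the same header.
PEM_PRIVATE_KEY = re.compile(r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def find_private_keys(root: str, max_bytes: int = 1_000_000) -> list[str]:
    """Return sorted paths under `root` whose contents include a PEM private key header."""
    hits = []
    for path in Path(root).rglob("*"):
        # Skip directories and very large files (unlikely to be key material).
        if not path.is_file() or path.stat().st_size > max_bytes:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if PEM_PRIVATE_KEY.search(text):
            hits.append(str(path))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_private_keys("."):
        print(f"possible private key: {hit}")
```

Running this as a pre-commit or CI step is one way to catch a key before it reaches a public repository. Keep in mind that it only inspects the working tree, not Git history.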
If an attacker steals a private key associated with a valid TLS certificate, they can impersonate legitimate services, execute Adversary-in-the-Middle (AitM) attacks, or decrypt intercepted traffic (if Perfect Forward Secrecy is not enforced).
This vector becomes even more devastating when applied to Code Signing Certificates. In early 2024, AnyDesk disclosed that threat actors had breached its production systems, forcing it to revoke and replace its code-signing certificate. With a stolen code-signing key, attackers can sign malicious payloads that are far more likely to pass Windows SmartScreen, EDR solutions, and antivirus software, because the operating system treats the malware as a trusted, verified application.
2. Shadow PKI and Rogue Certificates
Agility is the lifeblood of DevOps. When centralized IT processes for issuing certificates take days or weeks, engineering teams will inevitably find workarounds. They spin up self-signed certificates or deploy unauthorized internal Certificate Authorities (CAs) using tools like OpenSSL to keep their CI/CD pipelines moving.
This creates "Shadow PKI." These rogue CAs are rarely secured in Hardware Security Modules (HSMs) or subjected to security audits. If a threat actor compromises a shadow CA on an internal network, they can issue mathematically valid certificates for any internal domain. This facilitates lateral movement and allows attackers to intercept internal microservice traffic without triggering alarms.
3. BGP Hijacking for Domain Validation
Automated Certificate Authorities like Let's Encrypt rely on the Automated Certificate Management Environment (ACME) protocol to issue certificates. To prove you own a domain, the CA sends a challenge—typically an HTTP-01 (placing a specific file on your web server) or DNS-01 (creating a specific TXT record) challenge.
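To make the two challenge types concrete, here is a sketch of how the expected responses are derived under RFC 8555 and RFC 7638: the "specific file" and "specific TXT record" both come from the challenge token combined with a thumbprint of the ACME account key. The helper names and the sample key values are hypothetical, and a real ACME client does this for you.

```python
import base64
import hashlib
import json

# RFC 7638: only these members, in lexicographic order, feed the thumbprint.
REQUIRED_JWK_MEMBERS = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 thumbprint of the account public key in JWK form."""
    members = {name: jwk[name] for name in REQUIRED_JWK_MEMBERS[jwk["kty"]]}
    canonical = json.dumps(members, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """RFC 8555 key authorization: token, a dot, then the key thumbprint."""
    return f"{token}.{jwk_thumbprint(account_jwk)}"


def http01_response(token: str, account_jwk: dict) -> tuple[str, str]:
    """Return (URL path, file body) the CA expects to fetch over HTTP."""
    return f"/.well-known/acme-challenge/{token}", key_authorization(token, account_jwk)


def dns01_response(domain: str, token: str, account_jwk: dict) -> tuple[str, str]:
    """Return (record name, TXT value) the CA expects to find in DNS."""
    digest = hashlib.sha256(key_authorization(token, account_jwk).encode("ascii")).digest()
    return f"_acme-challenge.{domain}", b64url(digest)


if __name__ == "__main__":
    # Placeholder key material for illustration only, not a real key.
    sample_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
    print(http01_response("example-token", sample_jwk))
    print(dns01_response("example.com", "example-token", sample_jwk))
```

Note that nothing in the HTTP-01 response is secret from whoever receives the CA's request. The proof of control rests entirely on the request reaching the right server.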
Highly sophisticated attackers utilize Border Gateway Protocol (BGP) hijacking to manipulate internet routing tables. By briefly rerouting traffic destined for your IP address to a server they control, they can successfully answer the CA's HTTP-01 challenge. The CA then issues a perfectly valid, trusted certificate for your domain to the attacker, which is subsequently used in highly convincing phishing campaigns or traffic interception.
4. The Self-Inflicted Denial of Service: Certificate Expiry
While not a malicious attack initiated by a threat actor, certificate expiration is the most frequent and disruptive certificate-related incident. Over 80% of organizations have experienced at least one severe outage in the past 24 months due to expired certificates.
A prime example occurred in April 2023, when an expired ground-station certificate caused an hours-long global outage for Starlink users. When a certificate on a critical load balancer, database, or API gateway expires, the cryptographic handshake fails, resulting in an immediate, hard failure of the service. In a microservices architecture, a single expired certificate deep in the stack can trigger a cascading failure across the entire application.
The 2025 Catalysts: Why Manual Management is Dead
Two major industry shifts are forcing organizations to rethink their PKI strategies immediately.
1. Google’s 90-Day TLS Proposal
Google has announced its intention to reduce the maximum validity period for public TLS certificates from 398 days to 90 days. When this takes effect, tracking expirations in a spreadsheet and manually generating Certificate Signing Requests (CSRs) will become practically unworkable for teams managing hundreds or thousands of domains, because every certificate will need renewing several times a year. Total automation is no longer a luxury; it is a baseline operational requirement.
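A quick back-of-the-envelope calculation shows the scale of the change. This sketch assumes each certificate is renewed once two-thirds of its lifetime has elapsed; that fraction and the fleet size of 1,000 are illustrative assumptions, not part of any proposal.

```python
def renewals_per_year(cert_count: int, validity_days: int, renew_fraction: float = 2 / 3) -> float:
    """Estimate yearly renewals for a fleet renewed at `renew_fraction` of each lifetime."""
    renewal_interval_days = validity_days * renew_fraction
    return cert_count * 365 / renewal_interval_days


if __name__ == "__main__":
    for validity in (398, 90):
        yearly = round(renewals_per_year(1000, validity))
        print(f"{validity}-day certificates: about {yearly} renewals per year")
    # 398-day certificates: about 1376 renewals per year
    # 90-day certificates: about 6083 renewals per year
```

Under these assumptions, the same fleet goes from roughly four renewals per day to more than sixteen, every day of the year.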
2. NIST Post-Quantum Cryptography (PQC) Standards
In August 2024, NIST finalized its first three PQC standards (FIPS 203, 204, and 205). While quantum computers capable of breaking RSA-2048 do not exist today, nation-state actors are widely believed to be running "Harvest Now, Decrypt Later" operations: capturing and storing encrypted traffic today so they can decrypt it once quantum computing matures. Organizations must achieve "Crypto-Agility," meaning they can rapidly swap out cryptographic algorithms and root CAs across their entire fleet without code changes.
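The core idea behind swapping algorithms "without code changes" is that the algorithm is named in policy and recorded alongside the output, never hard-coded. The sketch below illustrates that pattern with HMAC digests from Python's standard library. It is deliberately not a post-quantum example, and the `policy` dictionary layout is a hypothetical convention.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes, policy: dict) -> tuple[str, bytes]:
    """Authenticate `message` with whichever digest the policy currently names."""
    algorithm = policy["digest"]
    if algorithm not in hashlib.algorithms_available:
        raise ValueError(f"unsupported digest: {algorithm}")
    # The algorithm name travels with the tag so verifiers never have to guess.
    return algorithm, hmac.new(key, message, algorithm).digest()


def verify(message: bytes, key: bytes, algorithm: str, tag: bytes, policy: dict) -> bool:
    """Accept a tag only if its algorithm is still allowed by policy."""
    if algorithm not in policy["accepted_digests"]:
        return False
    expected = hmac.new(key, message, algorithm).digest()
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"
    old_policy = {"digest": "sha256", "accepted_digests": {"sha256", "sha3_256"}}
    algorithm, tag = sign(b"hello", key, old_policy)

    # Rotating the algorithm is a policy edit, not a code change.
    new_policy = {"digest": "sha3_256", "accepted_digests": {"sha3_256"}}
    print(verify(b"hello", key, algorithm, tag, old_policy))
    print(verify(b"hello", key, algorithm, tag, new_policy))
```

During a migration, the accepted set would typically contain both the old and new algorithms, then shrink once the fleet has rotated.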
A Blueprint for Mitigation and Crypto-Agility
To secure machine identities and prepare for the 90-day lifecycle, DevOps and security teams must implement a layered defense strategy focused on automation, visibility, and strict access controls.
1. Implement Total Automation with ACME
You must remove human beings from the certificate issuance and renewal lifecycle. The ACME protocol is the industry standard for this.
For Kubernetes environments, cert-manager is the de facto standard. It integrates seamlessly with Let's Encrypt for public certificates or HashiCorp Vault for internal PKI.
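Here is an example of a ClusterIssuer configuration in Kubernetes that automates the issuance and renewal of Let's Encrypt certificates using DNS-01 validation. DNS-01 does not depend on routing HTTP traffic to your web server, which removes the HTTP-01 hijacking path described above, although it does shift trust to your DNS provider and the API token you grant to cert-manager: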
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: security@yourdomain.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Enable the DNS-01 challenge provider
    solvers:
    - dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-api-token-secret
            key: api-token
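With this deployed, cert-manager will automatically provision certificates for Ingress resources that reference this issuer and, by default, renew them once two-thirds of their lifetime has elapsed (30 days before expiry for a 90-day certificate), removing manual steps from the routine lifecycle.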
2. Lock Down Issuance with DNS CAA Records
To prevent attackers from using a compromised or lenient CA to issue rogue certificates for your domain, you should implement Certificate Authority Authorization (CAA) records in your DNS.
A CAA record explicitly dictates which CAs are allowed to issue certificates for your domain. If an attacker tries to use a different CA, that CA is required by the CA/Browser Forum Baseline Requirements to check the CAA record and refuse issuance.
Add the following CAA records to your DNS zone file to restrict issuance exclusively to Let's Encrypt:
example.com. IN CAA 0 issue "letsencrypt.org"
example.com. IN CAA 0 issuewild "letsencrypt.org"
example.com. IN CAA 0 iodef "mailto:security@example.com"
Note: The iodef tag asks a CA that rejects an issuance request because of a CAA violation to send an incident report to your security team. Reporting is optional for CAs, so treat it as a useful signal and not a guarantee.
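The decision a CA makes from these records can be sketched in a few lines. This is a simplified illustration of the RFC 8659 rules for a single record set: it does not climb the DNS tree to parent domains or handle critical flags, and the function names are hypothetical.

```python
def parse_caa(zone_lines):
    """Parse zone-file CAA lines into (flags, tag, value) tuples."""
    records = []
    for line in zone_lines:
        fields = line.split(None, 5)  # name, class, type, flags, tag, value
        if len(fields) == 6 and fields[1] == "IN" and fields[2] == "CAA":
            value = fields[5].strip().strip('"')
            records.append((int(fields[3]), fields[4].lower(), value))
    return records


def may_issue(records, ca_domain, wildcard=False):
    """Return True if `ca_domain` may issue under this CAA record set."""
    issue = [value for _, tag, value in records if tag == "issue"]
    issuewild = [value for _, tag, value in records if tag == "issuewild"]
    # For wildcard names, issuewild records take precedence when present.
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return True  # no applicable restriction: any CA may issue
    # Parameters after ";" are ignored here; only the issuer domain is compared.
    return any(value.split(";")[0].strip() == ca_domain for value in relevant)


if __name__ == "__main__":
    zone = [
        'example.com. IN CAA 0 issue "letsencrypt.org"',
        'example.com. IN CAA 0 issuewild "letsencrypt.org"',
        'example.com. IN CAA 0 iodef "mailto:security@example.com"',
    ]
    records = parse_caa(zone)
    print(may_issue(records, "letsencrypt.org"))
    print(may_issue(records, "other-ca.example"))
```

The last branch is the important one for defenders: a domain with no issue records at all places no restriction on any CA.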
3. Enforce Certificate Transparency (CT) Monitoring
Certificate Transparency is an open framework that logs every public TLS certificate issued by a trusted CA into public, append-only cryptographic ledgers.
You must continuously monitor these logs to detect rogue certificates. If an attacker successfully provisions a certificate for yourcompany-login.com or manages to bypass your CAA records, CT logs are your only early warning system.
You can manually search your domains using tools like crt.sh, but enterprise teams should automate this. Many modern security tools, including Cloudflare's Certificate Transparency Monitoring, can alert your team (by email, webhook, or Slack, depending on the tool) shortly after a new certificate for your domain or its subdomains appears in the logs.
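If you want to prototype this before adopting a tool, a small script can poll crt.sh and flag certificates from issuers you do not expect. This is a rough sketch: it assumes crt.sh's JSON output exposes `issuer_name` and `name_value` fields, the allow-list matching by substring is a simplification, and crt.sh is a best-effort public service that may throttle or time out.

```python
import json
import urllib.parse
import urllib.request


def fetch_ct_entries(domain: str) -> list[dict]:
    """Query crt.sh for logged certificates covering subdomains of `domain` (network call)."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    with urllib.request.urlopen(f"https://crt.sh/?{query}", timeout=30) as resp:
        return json.load(resp)


def unexpected_certificates(entries: list[dict], allowed_issuers: list[str]) -> list[dict]:
    """Return entries whose issuer matches none of the allowed issuer substrings."""
    return [
        entry
        for entry in entries
        if not any(allowed in entry.get("issuer_name", "") for allowed in allowed_issuers)
    ]


if __name__ == "__main__":
    # Assumed allow-list: adjust to the CAs you actually authorize.
    allowed = ["O=Let's Encrypt"]
    for entry in unexpected_certificates(fetch_ct_entries("example.com"), allowed):
        print(f"unexpected issuer {entry.get('issuer_name')!r} for {entry.get('name_value')!r}")
```

Pairing the allow-list with the CAs named in your CAA records keeps the two controls consistent.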
4. Secure Key Storage and Zero Trust
Private keys must never reside in plaintext on disk or in version control.
* Cloud Workloads: Utilize native Key Management Services (KMS) like AWS KMS, Azure Key Vault, or Google Cloud KMS. These services ensure that cryptographic operations happen within the KMS boundary, and the plaintext key material is never exposed to the application layer.
* Code Signing: Since June 2023, the CA/Browser Forum has required that private keys for publicly trusted code signing certificates (both Extended Validation (EV) and Organization Validation (OV)) be stored in certified hardware, such as a physical hardware token (like a FIPS-certified YubiKey) or a cloud-based HSM.
* Internal Microservices: Implement a Service Mesh (such as Istio or Linkerd) to enforce Mutual TLS (mTLS). A service mesh automates the distribution of short-lived certificates (often valid for only 1 to 24 hours) to your containers, ensuring that even if a key is compromised in memory, its window of utility is incredibly small.
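For a sense of what a mesh sidecar enforces on your behalf, here is a minimal sketch of a server-side mTLS configuration using Python's standard `ssl` module. The function name and file paths are hypothetical, and a real mesh also handles certificate rotation and identity checks that this example leaves out.

```python
import ssl


def build_mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Build a TLS server context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # The "mutual" part: the handshake fails unless the client proves its identity.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # The server's own short-lived certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # Only client certificates chaining to this internal CA are accepted.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx


if __name__ == "__main__":
    # Paths below are placeholders; point them at files issued by your internal CA.
    context = build_mtls_server_context()
    print(context.verify_mode == ssl.CERT_REQUIRED)
```

With short-lived certificates, the process also needs to reload the certificate chain on rotation, which is one reason teams delegate this to a mesh instead of each application.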
5. Establish Bulletproof Expiration Tracking
Even with automation tools like cert-manager in place, things break. Webhooks fail, DNS APIs rate-limit requests, and ACME challenges time out. If your automation fails silently, you will still suffer a catastrophic outage.
You need an independent, synthetic monitoring layer that actively checks the expiration dates of the certificates currently being served to clients. This is where Expiring.at becomes a critical component of your operational resilience.
Instead of relying on internal automation logs, Expiring.at acts as an external source of truth. It continuously monitors your endpoints, tracks the exact validity periods of your deployed certificates, and integrates directly into your incident response workflows. By setting up intelligent alerting via Slack, email, or custom webhooks, your team gets notified well before the 90-day (or 30-day) window closes, catching silent automation failures before they turn into P1 outages.
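The underlying check is simple enough to sketch with the standard library. To be clear, this is a generic illustration of external expiry probing, not a description of how Expiring.at works; the 21-day threshold and function names are assumptions.

```python
import socket
import ssl
import time
from typing import Optional


def served_cert_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect like a real client and return the served certificate's notAfter string."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # An already-expired certificate raises ssl.SSLCertVerificationError here,
        # which a monitor should treat as an alert in its own right.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def seconds_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Seconds between `now` (default: current time) and the certificate's expiry."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return expires_at - (time.time() if now is None else now)


def needs_alert(not_after: str, threshold_days: int = 21, now: Optional[float] = None) -> bool:
    """True when the certificate expires within `threshold_days`."""
    return seconds_until_expiry(not_after, now) < threshold_days * 86400


if __name__ == "__main__":
    not_after = served_cert_not_after("example.com")
    days_left = seconds_until_expiry(not_after) / 86400
    print(f"example.com expires in {days_left:.1f} days; alert={needs_alert(not_after)}")
```

Setting the threshold below your automation's renewal point (for example, 21 days when renewal happens at 30) means an alert fires only when renewal has actually failed.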
Conclusion: Moving Toward Crypto-Agility
The days of treating digital certificates as "set it and forget it" infrastructure are over. Between the explosion of machine identities, the upcoming 90-day TLS lifecycle, and the looming threat of post-quantum decryption, organizations must treat PKI as dynamic, highly sensitive