Somewhere right now, a production service is down because a certificate expired and nobody noticed. It happens to the biggest names: Microsoft Teams went down in February 2020 because of an expired authentication certificate. Equifax failed to spot its 2017 breach because an expired certificate had disabled its SSL inspection tool. In 2021, the expiry of the DST Root CA X3 root that Let’s Encrypt’s certificate chains relied on broke millions of older devices. Certificates are invisible infrastructure until they break, and then they’re the only thing anyone can see.

TLS certificates are the foundation of trust on the internet. They authenticate servers, encrypt connections, and establish that you’re talking to who you think you’re talking to. Managing them well is boring, unglamorous work. Managing them poorly is how you end up on the front page. Here’s how to get the boring part right.

DO / DON’T

DO:

DON’T:

Certificate Lifecycle Management

Step 1: Build Your Certificate Inventory

Before you can manage certificates, you need to know what you have. Scan your environment:

For each certificate, record: domain/CN, issuer (CA), issuance date, expiration date, key algorithm and size, hosting location/service, responsible team, and renewal method (manual or automated).
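One way to collect the machine-readable fields is sketched below. It assumes the openssl CLI is installed and that you already have a list of endpoints to probe. The function names and the tab-separated output are illustrative choices, and the responsible team and renewal method still have to be recorded by hand.

```shell
#!/bin/sh
# Sketch: build inventory rows from live endpoints. All function names are
# hypothetical; the only external tool assumed is the openssl CLI.

# Print the leaf certificate a server presents, as PEM.
fetch_cert() {
    host=$1
    port=${2:-443}
    echo | openssl s_client -connect "$host:$port" -servername "$host" 2>/dev/null |
        openssl x509 -outform PEM 2>/dev/null
}

# Run "openssl x509 -noout <options>" against PEM text passed as $1.
x509_field() {
    pem_text=$1
    shift
    printf '%s\n' "$pem_text" | openssl x509 -noout "$@"
}

# Read one PEM certificate on stdin and print a tab-separated row:
# subject, issuer, notBefore, notAfter, key algorithm, key size in bits.
cert_row() {
    pem=$(cat)
    [ -n "$pem" ] || return 1
    subject=$(x509_field "$pem" -nameopt RFC2253 -subject | sed 's/^subject= *//')
    issuer=$(x509_field "$pem" -nameopt RFC2253 -issuer | sed 's/^issuer= *//')
    not_before=$(x509_field "$pem" -startdate | cut -d= -f2)
    not_after=$(x509_field "$pem" -enddate | cut -d= -f2)
    key_alg=$(x509_field "$pem" -text | sed -n 's/^ *Public Key Algorithm: *//p' | head -n 1)
    key_bits=$(x509_field "$pem" -text | sed -n 's/.*Public-Key: (\([0-9]*\) bit.*/\1/p' | head -n 1)
    printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
        "$subject" "$issuer" "$not_before" "$not_after" "$key_alg" "$key_bits"
}

# Read "host" or "host:port" lines on stdin and print one row per endpoint.
inventory() {
    while IFS= read -r endpoint; do
        [ -n "$endpoint" ] || continue
        host=${endpoint%%:*}
        port=443
        case $endpoint in *:*) port=${endpoint##*:} ;; esac
        if row=$(fetch_cert "$host" "$port" | cert_row); then
            printf '%s:%s\t%s\n' "$host" "$port" "$row"
        else
            printf '%s:%s\tERROR: no certificate retrieved\n' "$host" "$port"
        fi
    done
}
```

Feeding it a file of endpoints (for example `inventory < endpoints.txt > inventory.tsv`, where `endpoints.txt` is a hypothetical list you maintain) gives a starting spreadsheet. It only sees certificates reachable over the network, so certificates on internal appliances or in keystores still need a separate pass.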

Step 2: Automate Renewal with ACME

The ACME protocol (Automated Certificate Management Environment) is the standard for automated certificate issuance and renewal. Let’s Encrypt is the most widely used ACME CA, issuing free, domain-validated certificates with 90-day lifetimes.

Certbot is the reference ACME client:

# Install certbot
sudo apt install certbot python3-certbot-nginx

# Obtain and install a certificate for Nginx
sudo certbot --nginx -d yourdomain.com -d www.yourdomain.com

# Auto-renewal is configured automatically via systemd timer
sudo systemctl status certbot.timer
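
The timer only helps if a renewal would actually succeed. Here is a minimal sketch of a post-install check, assuming Certbot was installed from the distribution package as above. The `run` wrapper and the `DRY_RUN` variable are illustrative helpers; `certbot renew --dry-run` and the `systemctl` subcommands are standard.

```shell
#!/bin/sh
# Sketch: confirm that unattended renewal would work.
# run() and DRY_RUN are hypothetical helpers for previewing commands.

# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

check_renewal() {
    # Simulate renewal against the CA's staging environment; no
    # certificates are saved to disk.
    run sudo certbot renew --dry-run || return 1
    # Confirm the timer survives reboots and is currently scheduled.
    run systemctl is-enabled --quiet certbot.timer || return 1
    run systemctl is-active --quiet certbot.timer
}
```

Calling `check_renewal` returns a non-zero status if the simulated renewal fails or the timer is not enabled and active, so it can be wired into a provisioning script or a periodic health check.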

Other ACME clients:

For internal certificates where Let’s Encrypt isn’t appropriate, tools like step-ca provide a private ACME CA for internal PKI.

Step 3: Monitor Expiration

Automation handles renewals, but monitoring catches failures. Layer your monitoring:

echo | openssl s_client -connect yourdomain.com:443 -servername yourdomain.com 2>/dev/null | openssl x509 -noout -dates

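Alert thresholds: 60 days (informational), 30 days (warning), 14 days (critical), 7 days (emergency). These tiers fit certificates issued for a year or so. For 90-day ACME certificates, which Certbot renews by default once fewer than 30 days remain, the 60-day and 30-day tiers will fire during normal operation, so scale them down accordingly. If you’re getting 7-day alerts, your automation is broken.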
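
The 60/30/14/7 tiers can be turned into a small check along these lines. This is a sketch, assuming GNU `date` (for `date -d`) and the openssl CLI. The function names are illustrative, and the output still needs to be connected to whatever alerting system you use.

```shell
#!/bin/sh
# Sketch: map days-until-expiry onto alert tiers.
# Assumes GNU date and the openssl CLI; function names are hypothetical.

# Days remaining on the certificate served by host $1 (port $2, default 443).
days_left() {
    host=$1
    port=${2:-443}
    end=$(echo | openssl s_client -connect "$host:$port" -servername "$host" 2>/dev/null |
        openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
    [ -n "$end" ] || return 1
    end_epoch=$(date -d "$end" +%s) || return 1
    now_epoch=$(date +%s)
    echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Translate a day count into a severity label.
severity_for_days() {
    days=$1
    if [ "$days" -le 7 ]; then
        echo emergency
    elif [ "$days" -le 14 ]; then
        echo critical
    elif [ "$days" -le 30 ]; then
        echo warning
    elif [ "$days" -le 60 ]; then
        echo informational
    else
        echo ok
    fi
}

# Print "host <TAB> days <TAB> severity"; an unreachable host is an emergency.
check_host() {
    if remaining=$(days_left "$1"); then
        printf '%s\t%s days\t%s\n' "$1" "$remaining" "$(severity_for_days "$remaining")"
    else
        printf '%s\tno certificate retrieved\temergency\n' "$1"
    fi
}
```

Running `check_host yourdomain.com` from cron or a monitoring agent gives one line per host. Treating a failed lookup as an emergency is a deliberate choice here: a check that silently stops working is as dangerous as no check at all.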

Step 4: Certificate Transparency Monitoring

Certificate Transparency (CT) is a system of public, append-only logs that record publicly trusted certificates. Chrome and Safari refuse certificates that have not been logged, so in practice every publicly trusted CA logs what it issues. This means you can monitor for unauthorized certificates issued for your domains.

Why this matters: if an attacker compromises a CA or tricks one into issuing a certificate for your domain, the CT log will show it. Without monitoring, you’d never know until the certificate was used in a man-in-the-middle attack.

How to monitor:

Set up monitoring for every domain you own. When a certificate appears that you didn’t request, investigate immediately — it could be a subdomain takeover, a compromised CA account, or a misissuance event.
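
As a rough sketch of what such a check could look like, the script below polls crt.sh, a public CT search service that can return JSON, and prints entries whose issuer is not on an allow-list. It assumes `curl` and `jq` are installed. The allow-list contents and function names are placeholders, and the JSON field names (`id`, `not_before`, `issuer_name`, `name_value`) are what crt.sh has returned in the past and may change.

```shell
#!/bin/sh
# Sketch: flag CT log entries from issuers you do not expect.
# EXPECTED_ISSUERS is a hypothetical allow-list written as an extended
# regular expression; replace it with the CAs you actually use.
EXPECTED_ISSUERS=${EXPECTED_ISSUERS:-"Let's Encrypt"}

# Exit status 0 if the issuer string matches the allow-list.
is_expected_issuer() {
    printf '%s\n' "$1" | grep -Eq -- "$EXPECTED_ISSUERS"
}

# Print id, not_before, issuer and names for every unexpected entry.
# Assumption: crt.sh treats % as a wildcard, so passing %25.example.com
# (URL-encoded) would also match subdomains.
unexpected_ct_entries() {
    domain=$1
    tab=$(printf '\t')
    curl -fsS "https://crt.sh/?q=${domain}&output=json" |
        jq -r '.[] | [.id, .not_before, .issuer_name, .name_value] | @tsv' |
        while IFS="$tab" read -r id not_before issuer names; do
            is_expected_issuer "$issuer" ||
                printf '%s\t%s\t%s\t%s\n' "$id" "$not_before" "$issuer" "$names"
        done
}
```

Run on a schedule, `unexpected_ct_entries yourdomain.com` should normally print nothing. Any output is a certificate to investigate. A hosted CT monitoring service would do the same job with less maintenance; the script is mainly useful for showing what those services check.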

TLS Configuration

Protocol Versions

Cipher Suite Configuration

Use Mozilla’s SSL Configuration Generator to generate tested configurations for your web server. Choose the “Intermediate” profile for broad compatibility or “Modern” for TLS 1.3-only.

Key principles:

Security Headers

Beyond the TLS configuration itself, deploy these headers:

Test your configuration with SSL Labs Server Test. Aim for an A+ rating. A grade below A usually means you have deprecated protocols or weak ciphers enabled.
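
For a quick local check between full scans, the protocol side can be probed with `openssl s_client`. The sketch below uses illustrative function names and relies on s_client's exit status to tell a completed handshake from a failed one. One caveat: recent OpenSSL builds may refuse TLS 1.0 and 1.1 on the client side, in which case a rejection says nothing about the server.

```shell
#!/bin/sh
# Sketch: probe which TLS versions a server accepts.
# Function names are hypothetical; requires the openssl CLI.

# Exit status 0 if host $1 completes a handshake using s_client flag $2
# (one of tls1, tls1_1, tls1_2, tls1_3).
accepts_protocol() {
    echo | openssl s_client "-$2" -connect "$1:443" -servername "$1" >/dev/null 2>&1
}

# Judge one probe result: $1 = protocol flag, $2 = yes|no (was it accepted?).
verdict() {
    case "$1:$2" in
        tls1:yes | tls1_1:yes)
            echo "FAIL $1 is deprecated but accepted" ;;
        tls1_2:no)
            echo "WARN $1 rejected; expected only with a TLS 1.3-only profile" ;;
        tls1_3:no)
            echo "WARN $1 rejected; consider enabling it" ;;
        *)
            echo "OK $1" ;;
    esac
}

# Probe all four versions for one host and print a verdict for each.
audit_protocols() {
    for proto in tls1 tls1_1 tls1_2 tls1_3; do
        if accepts_protocol "$1" "$proto"; then
            verdict "$proto" yes
        else
            verdict "$proto" no
        fi
    done
}
```

`audit_protocols yourdomain.com` prints one line per protocol version. It does not examine cipher suites or headers, so it complements the SSL Labs scan and does not replace it.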

Key Management

Private Key Protection

Key Rotation

Rotate the private key with every certificate renewal, not just the certificate itself. Renewing with the same key means that a key compromised without your knowledge stays valid for another full certificate lifetime.

For ACME-managed certificates, most clients generate a new key pair with each renewal by default. Verify this behavior in your client’s configuration.
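
One way to verify this is to compare the public keys of two successive certificates. The sketch below assumes the openssl CLI, and the function names are illustrative.

```shell
#!/bin/sh
# Sketch: check that two certificates do NOT share a public key.
# Function names are hypothetical; requires the openssl CLI.

# SHA-256 digest of the DER-encoded public key in the PEM certificate $1.
pubkey_fingerprint() {
    pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
    printf '%s\n' "$pub" |
        openssl pkey -pubin -outform DER |
        openssl dgst -sha256 |
        awk '{print $NF}'
}

# Exit status 0 if the key was rotated, 1 if it was reused,
# 2 if either certificate could not be read.
key_was_rotated() {
    old_fp=$(pubkey_fingerprint "$1") || return 2
    new_fp=$(pubkey_fingerprint "$2") || return 2
    [ "$old_fp" != "$new_fp" ]
}
```

With Certbot, earlier certificates are kept under `/etc/letsencrypt/archive/<name>/` as `cert1.pem`, `cert2.pem`, and so on, so a call such as `key_was_rotated cert1.pem cert2.pem` in that directory shows whether the last renewal produced a new key. Other clients store history differently, if at all.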

Revocation

When a private key is compromised, the certificate must be revoked immediately. Don’t wait for expiry.

Revocation Methods

How to Revoke

After revocation, reissue the certificate with a new key pair. Revocation without reissuance leaves the service presenting a revoked certificate, which clients that check revocation will reject.
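
For a Certbot-managed certificate, the revoke-then-reissue sequence might look like the sketch below. It assumes the certificate name matches the domain and that Certbot's default of generating a new key on renewal has not been overridden with `--reuse-key`. The `run` wrapper and `DRY_RUN` variable are illustrative helpers; other CAs and clients have their own procedures.

```shell
#!/bin/sh
# Sketch: revoke a compromised Certbot-managed certificate, then reissue.
# run() and DRY_RUN are hypothetical helpers for previewing commands.

# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

revoke_and_reissue() {
    cert_name=$1
    # Tell the CA the key is compromised. Keep the local files in place so
    # the web server configuration stays valid until the replacement lands.
    run sudo certbot revoke --cert-name "$cert_name" \
        --reason keycompromise --no-delete-after-revoke || return 1
    # Force an immediate renewal, which issues a replacement certificate
    # with a freshly generated private key.
    run sudo certbot renew --cert-name "$cert_name" --force-renewal
}
```

Running it once with `DRY_RUN=1 revoke_and_reissue yourdomain.com` prints the two commands without executing them, which is a sensible rehearsal before an incident forces you to do it for real.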

If It Already Happened

If a certificate has already expired and caused an outage:

If a private key was compromised:


Certificates are infrastructure. Treat them like any other critical system: inventory them, monitor them, automate their lifecycle, and have a plan for when things go wrong. Start with your certificate inventory. Find every cert in your environment, record its expiry, and set up monitoring today. Then automate renewal with ACME. The outage you prevent is the one nobody notices.