Introduction

Recently I’ve been considering building an internal PKI, so I started learning about security and encryption. Before studying PKI, my understanding of certificates was vague; I thought their main use was simply to enable encrypted HTTP requests. After reading some related blogs and documentation, I discovered that the topic is far more complex than I had assumed!

You have to learn the various concepts and apply them to real requirements to strengthen your understanding of them.

In the security field, few things can be done perfectly, and the domains involved are both broad and deep: mathematics, cryptography, algorithms, and more. What’s most confusing, though, are the various certificate formats and encryption algorithms - just remembering the names is already a headache for most people!

Internet Security Issues

With the development of the Web, three main problems have been exposed:

  • Communication data may be stolen: Since HTTP itself doesn’t have encryption capabilities, all traffic is transparent, making it easy to steal sensitive data like login passwords;
  • Communication data may be tampered with: Since traffic is transparent, it can be intercepted by a man-in-the-middle and the communication data can be altered;
  • Unable to verify the identity of the communicating party: Since HTTP doesn’t have server identity authentication functionality, phishing can be conducted using domains similar to the ones being impersonated.

To solve these three problems, SSL/TLS was created. So how does the SSL layer solve the above issues?

SSL/TLS Solves the Problem of Communication Data Theft

The functionality of SSL/TLS mainly relies on three basic types of algorithms: hash functions, symmetric encryption, and asymmetric encryption. It uses asymmetric encryption for identity authentication and key negotiation, symmetric encryption algorithms with negotiated keys for data encryption, and hash functions to verify information integrity.

The advantage of symmetric encryption is that encryption and decryption are relatively fast. The advantage of asymmetric encryption is that the encryption key can be made public: even if an attacker intercepts the data, they cannot decrypt it without the corresponding private key. So we can combine symmetric encryption with asymmetric encryption, fully utilizing the advantages of both. In the key exchange phase, we use asymmetric encryption, and in the subsequent communication phase, we use symmetric encryption.

Specifically, the party sending the ciphertext encrypts the symmetric key using the recipient’s public key, and the recipient then decrypts the symmetric key using their private key. This keeps the exchanged key secure while the communication itself uses fast symmetric encryption. HTTPS therefore adopts a hybrid encryption mechanism that uses both symmetric and asymmetric encryption.
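
The hybrid scheme can be sketched in a few lines of Python. This is only a toy under loud assumptions: it uses textbook RSA without padding over two well-known Mersenne primes, and a home-made XOR keystream in place of a real symmetric cipher such as AES. None of it is secure, and the names (`keystream_xor`, `session_key`, and so on) are made up for illustration.

```python
import hashlib
import secrets

# Toy RSA key pair for the recipient (textbook RSA, no padding: NOT secure).
# The primes are well-known Mersenne primes, chosen only to keep this short.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256 counter keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Sender: pick a random symmetric key and wrap it with the recipient's public key.
session_key = secrets.token_bytes(16)  # 128 bits, comfortably smaller than N
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
ciphertext = keystream_xor(session_key, b"login password: hunter2")

# Recipient: unwrap the symmetric key with the private key, then decrypt.
unwrapped = pow(wrapped_key, D, N).to_bytes(16, "big")
plaintext = keystream_xor(unwrapped, ciphertext)
```

Only `wrapped_key` and `ciphertext` travel over the network; the slow asymmetric operation is performed once, on 16 bytes, and everything else uses the fast symmetric path.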

Solving the Problem of Data Tampering

Network transmission needs to go through many intermediate nodes. Although the data cannot be decrypted, it may be tampered with. So how do we verify the integrity of the data? In this case, we can consider using digital signatures for verification. Digital signatures typically have two effects:

  • They can confirm that the message was indeed signed and sent by the sender, as others cannot forge the sender’s signature
  • Digital signatures can confirm the integrity of the message, proving whether the data has been tampered with

The sender hashes the original text and encrypts the resulting digest with their private key; this encrypted digest is the signature. The recipient decrypts the signature using the sender’s public key, then uses the same hash function to generate a digest of the received original text and compares it with the decrypted digest. If they match, the received information is complete and has not been modified during transmission; otherwise, the information has been modified. Therefore, digital signatures can verify the integrity of information.
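
A minimal sketch of that sign-and-verify flow follows. It assumes textbook RSA over two Mersenne primes and a SHA-256 digest reduced modulo the key, which is fine for illustration but insecure in practice (real signatures use padding schemes and much larger keys). The function names are hypothetical.

```python
import hashlib

# Toy signing key (textbook RSA over Mersenne primes: NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the sender


def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest into the range of the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: 'encrypt' the digest with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"transfer 10 to Alice")
```

Calling `verify(b"transfer 10 to Alice", signature)` succeeds, while any change to the message makes the recomputed digest differ from the one recovered from the signature.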

Solving the Problem of Identity Verification

Digital signatures, as described above, sound great: you can use the public key to decrypt the signature, proving that the message was indeed sent by the private key holder. But what if the public key itself has been tampered with? To ensure that the public key is trustworthy, we introduce the Certificate Authority (CA, also known as a third-party certification authority).

The only difference here is that the website information is first hashed (say with MD5 for illustration; MD5 is broken, and real certificates use SHA-256 or stronger), and the resulting digest is then encrypted using the private key of a third-party institution. The digital certificate therefore contains two particularly important pieces of information: the website’s public key and the digital signature. If a man-in-the-middle intercepts the certificate and replaces the server’s public key with their own, the client will detect that the signature no longer matches, which prevents public key replacement by a man-in-the-middle.

Let’s look at how the client compares the two digital signatures:

  • Browsers and operating systems come with the public keys of some authoritative third-party certification authorities preinstalled, such as VeriSign, Symantec, and GlobalSign;
  • When verifying a digital signature, the client obtains the corresponding third-party public key from the local system and uses it to decrypt the digital signature (which was encrypted with the authority’s private key) to recover the original digest;
  • The client then computes a digest of the certificate contents using the same hash algorithm and checks whether the two digests match. If they match, authentication passes; if not, certificate verification fails.
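
The client-side check, and the way it catches a swapped public key, can be sketched as follows. Everything here is a simplification: the "certificate" is a plain dict, the CA name `Example Toy CA` is invented, and the CA key is textbook RSA over Mersenne primes (insecure, illustration only).

```python
import hashlib

# Toy CA key pair (textbook RSA over two Mersenne primes: NOT secure).
CA_P, CA_Q = 2**107 - 1, 2**127 - 1
CA_N, CA_E = CA_P * CA_Q, 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))  # known only to the CA

# What the client ships with: issuer name -> CA public key.
TRUST_STORE = {"Example Toy CA": (CA_N, CA_E)}


def signed_fields_digest(cert: dict, modulus: int) -> int:
    """Hash the signed fields (subject and public key) into an integer."""
    data = f"{cert['subject']}|{cert['public_key']}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def issue(subject: str, public_key: str) -> dict:
    """CA side: sign the digest of the certificate fields with the CA private key."""
    cert = {"issuer": "Example Toy CA", "subject": subject, "public_key": public_key}
    cert["signature"] = pow(signed_fields_digest(cert, CA_N), CA_D, CA_N)
    return cert


def client_verify(cert: dict) -> bool:
    """Client side: look up the issuer locally, then compare the two digests."""
    ca_key = TRUST_STORE.get(cert["issuer"])
    if ca_key is None:
        return False  # issuer is not a trusted authority
    n, e = ca_key
    return pow(cert["signature"], e, n) == signed_fields_digest(cert, n)


genuine = issue("george.betterde.com", "server-public-key")
# A man-in-the-middle swaps in their own key but cannot re-sign the certificate.
tampered = dict(genuine, public_key="attacker-public-key")
```

`client_verify(genuine)` passes, while `client_verify(tampered)` fails because the digest recomputed from the altered fields no longer matches the one recovered from the CA signature.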

There’s an issue here: if there’s third-party certification, why don’t we directly use third-party certification methods for encryption?

This is because third-party certification authorities are public platforms that man-in-the-middle attackers can access. If we only encrypt website information with a third-party institution’s private key, we would still be susceptible to deception.

Without this check, a man-in-the-middle could also obtain a certificate from a third-party certification authority, then intercept the traffic and replace the server’s information with their own. The client would still be able to decrypt it and couldn’t determine whether it came from the server or the man-in-the-middle, ultimately leading to data leakage. Preventing this is the role of the digital signature.

HTTPS Workflow

  • Client initiates an HTTPS request, connecting to port 443, which can be understood as the process of requesting a public key
  • After receiving the request, the Server sends its digital certificate (also known as the public key certificate), which was signed in advance with the third-party institution’s private key, to the Client
  • Client verifies the public key certificate: whether it’s within the validity period, whether the certificate’s purpose and name match the site requested by the Client, whether it appears on a Certificate Revocation List (CRL), and whether its higher-level certificate is valid (this is a recursive process that continues until verification reaches a root certificate - either one built into the operating system or one built into the Client). If verification passes, the handshake continues; if not, a warning message is displayed
  • Client uses a pseudo-random number generator to generate a symmetric key for encryption, then encrypts this symmetric key with the certificate’s public key and sends it to the Server
  • Server uses its private key to decrypt this message and obtains the symmetric key. At this point, both Client and Server hold the same symmetric key
  • Server uses the symmetric key to encrypt plaintext content A and sends it to the Client
  • Client uses the symmetric key to decrypt the ciphertext response and obtains plaintext content A
  • Client initiates another HTTPS request, encrypts the plaintext content B of the request using the symmetric key, and then the Server decrypts the ciphertext using the symmetric key to obtain plaintext content B
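
In practice, a client rarely implements these steps by hand. As a hedged illustration, Python’s standard `ssl` module performs the certificate checks during the handshake. Note that the workflow above describes RSA-style key transport, whereas TLS 1.3 negotiates keys with an ephemeral Diffie-Hellman exchange; the library hides that difference. The helper name `fetch_peer_cert` is made up for this sketch.

```python
import socket
import ssl


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a TLS connection and return the validated server certificate."""
    context = ssl.create_default_context()  # loads the system's root certificates
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake runs inside wrap_socket; it raises
        # ssl.SSLCertVerificationError if the chain, validity period,
        # or hostname check fails.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# The default context already demands a valid certificate for the right host.
context = ssl.create_default_context()
```

With network access, something like `fetch_peer_cert("george.betterde.com")["notAfter"]` should return the expiry date of whatever certificate the site currently serves.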

So What Is a Certificate?

After understanding the entire HTTPS workflow, we arrive at a conclusion:

The purpose of a certificate is to bind an entity’s Identity to its corresponding Public Key, and sign it to prevent forgery or tampering.

The entity mentioned above is anything that exists, even if it only exists logically or conceptually.

For example:

  • The computer you use is an Entity;
  • The code you write is also an Entity;
  • You yourself are an Entity;
  • The multigrain pancake you had for breakfast is also an Entity;
  • The ghost you saw when you were six is also an Entity — even if your mother told you there are no ghosts in the world, it’s just your imagination.

We can use tools to observe what a certificate obtained by a browser looks like (where #-- XXX --# is content I’ve added for annotation):

step certificate inspect https://george.betterde.com
Certificate:
    Data:
        Version: 3 (0x2) #-- X.509 v3 certificate --#
        Serial Number: 277028619957168996496866148377847610215553 (0x32e1d2b7678a3cde0a0aff3d8973da00c81) #-- Unique Serial Number --#
    Signature Algorithm: SHA256-RSA
        Issuer: C=US,O=Let's Encrypt,CN=R3 #-- Issuing authority: Let's Encrypt --#
        Validity #-- Certificate start validity period and expiration date --#
            Not Before: Nov 15 07:22:21 2022 UTC
            Not After : Feb 13 07:22:20 2023 UTC
        Subject: CN=george.betterde.com
        Subject Public Key Info:
            Public Key Algorithm: RSA #-- Certificate public key --#
                Public-Key: (4096 bit)
                Modulus:
                    a3:5d:2a:3c:6c:f7:b0:d6:6d:c1:97:0f:42:c4:7c:
                    7e:76:f7:5a:02:30:8b:12:fb:52:ca:aa:14:f3:a1:
                    45:90:48:41:27:76:4a:14:be:a8:28:24:45:d8:4f:
                    8f:dd:a5:0e:56:bb:f8:a1:65:05:cd:f5:fb:a3:0b:
                    77:da:89:30:64:c5:4e:ec:f4:be:b4:ae:6a:75:24:
                    25:6b:61:12:3d:df:ef:47:60:f9:fd:20:d0:d3:3e:
                    57:c8:c4:73:7e:ea:89:c3:df:6e:f8:a9:72:60:b3:
                    d4:b9:83:27:74:c4:cb:19:49:c2:1f:87:a8:a2:72:
                    50:11:f0:4f:4a:dc:1d:4f:3b:f7:df:a0:e3:07:e6:
                    33:45:cf:96:0a:ab:4f:90:ad:5e:8b:25:bb:c5:0d:
                    93:bb:7c:da:88:ae:c0:97:e9:f4:1d:a4:c1:82:00:
                    9b:3a:cb:a7:75:72:1f:12:ee:22:63:de:49:e9:dc:
                    fe:d3:79:ec:e5:a1:18:19:5b:ae:c2:a1:19:d8:73:
                    78:3d:45:9b:d1:32:f3:c8:a0:39:c3:f4:fc:80:7a:
                    70:de:c1:ac:a4:bf:92:b2:d7:d1:66:d6:b3:1a:de:
                    80:ed:53:6a:45:ef:87:4f:d1:c7:3a:1a:60:98:1d:
                    e6:f6:7d:17:07:38:4d:91:29:4c:9b:01:ef:f4:d3:
                    ea:d8:c5:15:e2:aa:01:a5:c7:fb:7e:fb:eb:7d:7a:
                    0f:7f:cc:41:cf:c9:31:a7:2e:2e:8a:c1:2e:f3:55:
                    7f:3b:aa:b5:76:58:b2:dd:a1:81:b8:1c:9d:f7:57:
                    56:93:a3:61:88:07:23:4c:86:bc:38:15:bd:17:0a:
                    88:f1:bc:4f:b7:83:b7:f8:0b:81:46:41:b4:a8:ac:
                    38:96:b9:8b:8a:32:98:8b:0c:fb:26:07:b9:c7:38:
                    c1:f8:f3:92:6a:b4:f2:38:aa:2c:78:18:ae:05:3c:
                    78:e1:4d:c9:5a:2f:8e:14:a1:e4:bf:2e:5a:f0:a0:
                    a6:37:56:e2:c0:76:d1:df:f4:8b:49:12:f2:f4:40:
                    ae:a6:63:44:cb:22:df:c9:7a:72:61:0a:68:56:1b:
                    da:a5:6a:21:98:3e:a3:d7:11:36:fa:82:f2:e8:43:
                    83:b0:b7:5a:ba:7c:b6:54:6b:cf:b6:bb:64:b1:17:
                    a6:6f:74:e2:af:51:62:af:63:d3:37:03:a7:23:f3:
                    25:15:bd:fa:0e:e1:25:b3:20:48:98:d6:45:8c:70:
                    3f:30:84:f4:83:8d:96:ad:b9:7f:f4:ac:a3:b8:08:
                    9c:55:8c:df:61:e1:d1:67:05:cf:63:82:ae:a4:96:
                    97:4e:ad:b3:15:33:20:ec:32:a9:c9:fd:0a:20:07:
                    4a:db
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment #-- Describes public key usage --#
            X509v3 Extended Key Usage:
                Server Authentication, Client Authentication #-- Describes public key usage --#
            X509v3 Basic Constraints: critical
                CA:FALSE #-- Not a CA certificate --#
            X509v3 Subject Key Identifier:
                91:AC:86:05:F8:94:F1:26:6F:4A:CE:DA:04:F2:69:52:00:55:81:D2
            X509v3 Authority Key Identifier:
                keyid:14:2E:B3:17:B7:58:56:CB:AE:50:09:40:E6:1F:AF:9D:8B:14:C2:C6
            Authority Information Access:
                OCSP - URI:http://r3.o.lencr.org #-- OCSP --#
                CA Issuers - URI:http://r3.i.lencr.org/ #-- Issuer URL --#
            X509v3 Subject Alternative Name:
                DNS:george.betterde.com #-- SAN, corresponding to the Entity Name in the text --#
            X509v3 Certificate Policies: #-- Policies followed by the CA --#
                Policy: 2.23.140.1.2.1
                Policy: 1.3.6.1.4.1.44947.1.1.1
            RFC6962 Certificate Transparency SCT:
                SCT [0]:
                    Version: V1 (0x0)
                    LogID: tz77JN+cTbp18jnFulj0bF38Qs96nzXEnh0JgSXttJk=
                    Timestamp: Nov 15 08:22:21.996 2022 UTC
                    Signature Algorithm: SHA256-ECDSA
                      30:44:02:20:2a:3f:27:09:57:5c:f2:09:95:59:1c:8b:7b:c3:
                      f0:52:71:b1:39:49:05:4d:9b:fd:6a:f5:da:43:e0:21:af:b5:
                      02:20:29:90:1c:36:36:db:8f:0d:15:79:b7:66:50:48:22:7d:
                      6d:e0:f3:6c:64:88:8c:77:9a:ae:bb:ad:3f:32:16:2c
                SCT [1]:
                    Version: V1 (0x0)
                    LogID: ejKMVNi3LbYg6jjgUh7phBZwMhOFTTvSK8E6V6NS61I=
                    Timestamp: Nov 15 08:22:22.064 2022 UTC
                    Signature Algorithm: SHA256-ECDSA
                      30:44:02:20:11:01:00:93:31:9e:b6:f5:28:bd:2d:c2:88:f0:
                      d2:c7:43:a9:c4:4b:89:ab:c2:76:53:55:27:60:b7:e4:e0:e8:
                      02:20:3d:a6:ec:68:1a:e2:60:6d:a7:62:87:70:be:de:c2:08:
                      86:03:07:ee:4d:b4:c0:b8:c0:32:8e:80:9b:c9:fd:67
    Signature Algorithm: SHA256-RSA #-- CA's signature over the certificate, used to verify that the certificate (including the public key) has not been tampered with --#
         4a:b6:6e:1b:27:db:1f:7e:4c:13:c3:21:18:06:88:81:d7:30:
         b4:7d:26:d4:4a:30:ca:15:cc:5f:ca:42:aa:ee:69:90:ae:8b:
         98:e1:ea:63:69:53:ff:a2:f6:86:d4:04:33:12:57:2a:df:fc:
         83:be:b5:2f:6c:41:62:11:e2:3b:60:56:ad:0a:a3:5f:45:6b:
         49:1b:8b:65:ca:f7:a9:17:97:06:9f:1b:0e:06:0f:70:c5:66:
         3d:72:ca:11:d0:b4:4b:73:91:58:df:6a:7f:09:b5:5c:8b:40:
         51:48:98:f9:5c:65:6c:8f:83:9b:f0:e7:86:5f:c6:e8:50:ed:
         7f:55:1b:88:86:d8:f6:e5:95:21:60:f4:fa:88:86:f2:28:7f:
         72:79:27:80:6d:c7:ff:25:6d:4c:41:dd:c7:5b:91:4b:08:28:
         26:33:0e:16:93:7f:1e:7d:86:a9:ce:0d:5b:31:85:b2:83:fd:
         1f:c7:9c:1d:f3:43:9b:c1:8e:8b:1c:80:f3:65:5c:80:e6:42:
         40:15:e5:20:2e:66:4d:bd:a7:02:e1:5d:ca:05:dc:59:bb:46:
         e8:f9:34:35:33:cb:d6:e5:65:6a:b7:40:60:aa:9b:26:d1:68:
         ee:ed:80:18:56:ec:52:01:3e:fa:30:cf:f5:dc:e4:d9:0f:32:
         ab:81:ec:24

As you can see, the main information included in the certificate is:

  • Issuer information
  • Owner information
  • Public key information
  • Policies followed
  • CA’s signature on the certificate

Types of Certificates

DV (Domain Validation)

DV certificates bind to a DNS Name. When issuing, the CA needs to verify that this Domain Name is indeed controlled by the Subscriber.

OV (Organization Validation)

OV and EV certificates (the latter introduced below) are built on top of DV certificates. They include the name and location of the organization that owns the domain name. OV and EV certificates not only associate the certificate with the domain name but also with the legal entity that controls the domain name.

The verification process for OV certificates is not standardized across different CAs. To solve this problem, the CAB Forum introduced EV certificates.

EV (Extended Validation)

EV certificates contain the same basic information as OV but require strict verification (Identity Proofing). Browsers historically displayed the certificate owner’s organizational information in the address bar for this type of certificate, although major browsers have since removed that indicator.

However, the actual usage rate of this type of certificate is not high because the cost is too high. First is the time cost (a longer review, possibly taking several weeks); second is the financial cost (possibly requiring tens of thousands of dollars); and third, Relying Parties (RPs, the clients that verify certificates) in the Web PKI do not strongly depend on it.

Certificate Lifecycle

At the 49th CA/Browser Forum meeting, Apple announced that to improve network security, any new website certificate with a validity period exceeding 398 days will not be trusted by the Safari browser. However, certificates issued before the deadline (September 1, 2020) are not affected by this rule.

In simple terms, when an SSL certificate simultaneously meets the following conditions, it will not be trusted by Safari:

  1. Issued after September 1, 2020
  2. Validity period exceeds 398 days

The purpose of this move is to improve website security by ensuring developers use certificates with the latest encryption standards, and to reduce the number of overlooked old certificates, which could potentially be stolen and used for phishing and malware attacks.
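
The two conditions are easy to express in code. The sketch below uses the certificate’s Not Before time as a stand-in for the issuance date and treats the cutoff day itself as covered by the rule; both are assumptions on my part, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

CUTOFF = datetime(2020, 9, 1, tzinfo=timezone.utc)
MAX_VALIDITY = timedelta(days=398)


def rejected_by_398_day_rule(not_before: datetime, not_after: datetime) -> bool:
    """True only when BOTH conditions hold: issued from the cutoff on, and too long-lived."""
    issued_after_cutoff = not_before >= CUTOFF  # Not Before used as issuance time
    too_long = (not_after - not_before) > MAX_VALIDITY
    return issued_after_cutoff and too_long


# Validity window of the sample certificate shown earlier (about 90 days).
sample_not_before = datetime(2022, 11, 15, 7, 22, 21, tzinfo=timezone.utc)
sample_not_after = datetime(2023, 2, 13, 7, 22, 20, tzinfo=timezone.utc)
```

The sample certificate passes easily, since its validity period is far below the 398-day limit.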

Expiration

Certificates typically expire. Although this is not strictly mandatory, it is almost always done. Setting an expiration time is very important!

  • Certificates are distributed in various places: usually, when an RP verifies a certificate, no central authority is aware of the verification.
  • Without an expiration time, certificates would be valid permanently.
  • A rule of thumb in the security field: the more time passes, the closer the probability of credential leakage approaches 100%.

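Therefore, setting an expiration time is very important. Specifically, X.509 certificates include a validity period defined by two timestamps:

  • Not before: The certificate is not valid before this time (in practice, this is usually close to the time of issuance).
  • Not after: After this time, the certificate expires.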

This mechanism seems well-designed, but it actually has some shortcomings:

  • First, nothing prevents an RP from erroneously (or due to poor design) accepting an expired certificate;
  • Second, certificates are distributed. Verifying whether a certificate has expired is the responsibility of each RP, and sometimes they get it wrong, for example when the system clock they rely on is incorrect. The worst case is when the system clock is reset to the Unix epoch (January 1, 1970), at which point the RP cannot trust any certificate because none of them appears to be valid yet.
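
The validity check and the clock problem can be sketched with the dates from the sample certificate above. `ssl.cert_time_to_seconds` is a standard-library helper that expects the `GMT` suffix, so the `UTC` in the dump is rewritten accordingly; the function name `is_within_validity` is my own.

```python
import ssl
from datetime import datetime, timezone

# Validity window of the sample certificate, as seconds since the Unix epoch.
NOT_BEFORE = ssl.cert_time_to_seconds("Nov 15 07:22:21 2022 GMT")
NOT_AFTER = ssl.cert_time_to_seconds("Feb 13 07:22:20 2023 GMT")


def is_within_validity(now: float) -> bool:
    """The RP's check: the result depends entirely on the RP's own clock."""
    return NOT_BEFORE <= now <= NOT_AFTER


during = datetime(2022, 12, 25, tzinfo=timezone.utc).timestamp()
after = datetime(2023, 3, 1, tzinfo=timezone.utc).timestamp()
epoch_clock = 0.0  # a system clock that was reset to January 1, 1970
```

With a correct clock the certificate is accepted in December 2022 and rejected in March 2023. With `epoch_clock` it is rejected as well, not because it has expired but because it appears not to be valid yet.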

On the Subscriber side, after a certificate expires, the private key should be handled properly:

  • If a key pair was previously used for signing/authentication (e.g., based on TLS),
    • The private key should be deleted immediately after the key pair is no longer needed.
    • Keeping an invalid signing key leads to unnecessary risk: it’s no longer useful to anyone but could be used for forging signatures.
  • If the key pair is used for encryption, the situation is different.
    • As long as there is still data encrypted with this key, the private key needs to be kept.

This is why many people advise against using the same key pair for both signing and encryption. When a key pair used for both purposes expires, optimal key lifecycle management cannot be achieved: you ultimately have to keep the private key, because it’s still needed for decryption.

Renewal

When a certificate is about to expire, if you still want to use it, you need to renew it. Web PKI actually doesn’t have a standard renewal process:

  • There is no standard way to extend the validity period of a certificate;
  • Generally, an expired certificate is directly replaced with a new one;
  • Therefore, the renewal process is the same as the issuance process: generate and submit a CSR, then complete Identity Proofing.

Revocation

If a private key is compromised, or a certificate is no longer in use, it needs to be revoked. That is, you want:

  • To explicitly mark it as invalid;
  • All RPs to no longer trust this certificate, even if it has not yet expired.

But in reality, the process of revoking certificates is also a mess. The difficulties of actively revoking certificates are as follows:

  • Similar to expiration, the responsibility for executing revocation lies with the RP;
  • Unlike expiration, revocation status cannot be encoded in the certificate. RPs can only rely on some out-of-band process to determine the revocation status of a certificate.

Unless explicitly configured, most Web PKI TLS RPs do not pay attention to revocation status. In other words, by default, most TLS implementations are happy to accept revoked certificates.

Additionally, common active checking mechanisms include:

  • CRL (Certificate Revocation Lists, RFC 5280)
  • OCSP (Online Certificate Status Protocol, RFC 6960, which obsoletes RFC 2560)
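
To make the out-of-band nature of revocation concrete, here is a toy model of a CRL lookup. A real CRL is a signed, DER-encoded structure that the RP has to download and verify; the `ToyCRL` class and `rp_accepts` function below are hypothetical simplifications, and the sample certificate is only pretended to be revoked for the sake of the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCRL:
    """Hypothetical stand-in for a CRL: an issuer plus its revoked serial numbers."""
    issuer: str
    revoked_serials: set = field(default_factory=set)


def is_revoked(serial: int, issuer: str, crl: ToyCRL) -> bool:
    # A CRL only speaks for certificates issued by the same CA.
    return crl.issuer == issuer and serial in crl.revoked_serials


def rp_accepts(serial: int, issuer: str, crl: ToyCRL, check_revocation: bool = False) -> bool:
    """Assumes the signature and expiry checks have already passed."""
    if not check_revocation:
        return True  # the lenient default described above
    return not is_revoked(serial, issuer, crl)


# Serial number of the sample certificate, treated as revoked for illustration.
SERIAL = 0x32E1D2B7678A3CDE0A0AFF3D8973DA00C81
crl = ToyCRL(issuer="R3", revoked_serials={SERIAL})
```

With the default arguments, `rp_accepts(SERIAL, "R3", crl)` still returns `True`; the revocation only takes effect when the RP opts in with `check_revocation=True`.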

Conclusion

In this article, I’ve given a broad overview of the HTTPS process and related details about certificates. I will continue to explore PKI and encryption-related content in the future.

The encryption field carries heavy historical baggage, which makes today’s tools and formats very frustrating to learn and use. This is more frustrating than not wanting to learn a technology because it’s too difficult.

I hope this is helpful. Happy hacking…