The connected world so many envision won't happen without the secure transfer of information over networks. Here's a breakdown of the existing techniques for making sure our connected devices are safe from Internet outlaws.
Internet-enabled devices are gaining in popularity, but not everyone on the Internet has honorable intentions. Hackers and crackers do exist and may attack your product. Embedded system designers need to be aware of the threats that exist as well as the technologies that protect the integrity of their devices.
This article examines the risks and discusses available security frameworks. Public key infrastructure and digital signatures will be examined. After that, we'll focus on encryption and authentication technologies, concentrating mainly on Secure Sockets Layer (SSL) and Transport Layer Security (TLS). Implications for product development will be explored, including available open-source implementations. By the end of this article, readers will have a basic understanding of the security risks of Internet connectivity as well as the technologies that mitigate them.
This article is not an exhaustive treatise on the subject of Internet security. It's a high-level introduction to an extremely complex subject. For more information, see the references at the end of the article.
What is Internet security?
Generally speaking, Internet security refers to a set of services that permit data to be safely transmitted across the network. Some information passed over the network, such as financial and legal documentation, is of a sensitive nature. Different products and applications have different security needs. Depending on the particular need, security services may be quite limited or very broad in scope.
The available set of Internet security services may be generalized as providers of an increasing state of trust. When Node A needs to communicate a credit card number to Node B, it is prudent to verify that Node B is who it says it is. In this sense, Node A must trust Node B or some third party that can vouch for Node B's identity. The main focus of Internet security is to facilitate a trust relationship between various network nodes.
Of course, this is not a blind trust. Node A may be authorized to invoke certain services on Node B, but banned from invoking others. This depends upon how much Node B trusts Node A.
Each product must determine what security threats exist before a set of Internet security services may be specified. Each product must also evaluate which security protocols are in use by the other nodes on the Internet (for example, most HTTP servers support SSL).
Figure 1: The trust pyramid
Figure 1 illustrates the levels of trust in the form of a pyramid. The base of the trust pyramid is integrity. Integrity ensures that the stream of messages between two network nodes has not been tampered with. A network communications channel that guarantees integrity precludes a malicious third party from tampering with the contents of individual messages, as well as the ordering of a stream of messages, including insertion of extra messages into the stream or deletion of messages from the stream.
Integrity also encompasses authentication. Is Node A really who it claims to be? If the identity of Node A cannot be verified, then communications should not be established. Closely related to authentication is non-repudiation. Non-repudiation ensures that the sender cannot deny the successful transmission of a digitally signed message.
Confidentiality ensures that the sender of the message and the intended receiver are the only two network entities that may decipher the contents of a stream of messages (in a reasonable amount of time). Confidentiality protects the communication channel from passive attacks, such as packet sniffing. Confidentiality requires some type of encryption technology. In confidentiality's most protective form, all information passed between Node A and Node B is encrypted.
At the top of the trust pyramid is authorization. Once a network client's identity has been established, it may be authorized to invoke various services on a server, such as telnet or FTP. Not every network client need have identical authorization rights on a server. While the initial delegation of rights may be a system administrator's job, the networking protocol can provide the framework to establish identity and transfer the authorization rights between network nodes.
Now that the scope of Internet security has been outlined, the building blocks of secure Internet channels will be described. These building blocks include:
- Encryption and decryption
- Digital signatures
- Message digests
- Public-key infrastructure
Encryption and digital signatures
Encrypting network communications ensures that confidentiality is maintained over the wire. Two types of encryption algorithms, also known as ciphers, are of interest. The first (and most fundamental) type is symmetric encryption. The most common cipher for symmetric encryption is the Data Encryption Standard (DES). Symmetric encryption schemes are conceptually simple, as illustrated in Figure 2.
Figure 2: Symmetric encryption
In this scheme, Node B (hereafter called Bob) sends an encrypted message to Node A (hereafter called Alice). In the parlance of cryptography, plaintext is the original data and ciphertext is the encrypted data. The cipher uses some shared secret data, also known as the key, to transform the plaintext into ciphertext and back. The key is a piece of data that both the encryption and decryption algorithm must employ.
One simple example of a symmetric cipher is ROT13. ROT13 works on ASCII text by adding 13 to the character code of every letter during encryption and subtracting 13 during decryption. The results wrap around within the letters of the alphabet. Thus A → N and P → C.
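The ROT13 rule described above can be sketched in a few lines of Python. This is only an illustration; the function names are invented for this example, and ROT13 offers no real protection.

```python
import string

def rotate(text, shift):
    """Shift ASCII letters by `shift` places, wrapping within the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # digits, spaces, and punctuation pass through
    return "".join(result)

def rot13_encrypt(plaintext):
    return rotate(plaintext, 13)

def rot13_decrypt(ciphertext):
    return rotate(ciphertext, -13)

if __name__ == "__main__":
    secret = rot13_encrypt("Hello")
    print(secret, rot13_decrypt(secret))
```

Here the "key" is the fixed shift of 13, which is why anyone who knows the algorithm can read the message.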
To use a symmetric cipher, Bob and Alice must not only agree on the algorithm, but also on the key. Therein lies one of the disadvantages of symmetric encryption. If Bob and Alice must have a priori knowledge of both the algorithm and the key in order to securely communicate, then how is the communication channel created in the first place? This is a classic chicken-and-egg scenario. There are two solutions:
- The key (and possibly the algorithm) is sent from Bob to Alice before the secure communications channel is established. The obvious drawback of this approach is that anyone sniffing the packets will see the key and be able to decrypt any subsequent messages.
- The key (and possibly the algorithm) is distributed in a different manner. This may involve hardware devices, a separate networking protocol, or a password used as a key, but it must not be distributed in an insecure way.
Key distribution is a fundamental problem of secure systems. Whitfield Diffie and Martin Hellman proposed some remedies in an article called “New Directions in Cryptography” (IEEE Transactions on Information Theory, 22, 1976). These ideas have since become known as public-key cryptography. Public-key cryptography will be discussed in more detail later. For the present, we'll explore the asymmetric encryption scheme that public-key cryptography is based on. Asymmetric encryption schemes feature two distinct keys, one for encryption and one for decryption. The sender encrypts the message with the encryption key, and the receiver decrypts the message with the complementary decryption key.
Public-key cryptography adds another twist to the mix. Suppose the encryption key was public but the decryption key was private and known only to the receiver. The sender of the message (Bob in this case) would encrypt the message contents with the public key, and the receiver (Alice) would decrypt the message with the private key. Since only Alice knows her private key, Bob can rest assured that Alice is the only person that can decrypt the message. This scheme is illustrated in Figure 3.
Figure 3: Asymmetric encryption using public keys
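As a rough illustration of the public-key scheme, the sketch below uses RSA with deliberately tiny primes so the arithmetic is visible. The primes, the exponent, and every function name are assumptions chosen for this example; real keys are vastly larger and real implementations add padding, so this must not be used for actual security.

```python
# Toy RSA key pair for Alice, built from tiny primes (illustration only).
P, Q = 61, 53
N = P * Q                      # modulus, shared by both keys
PHI = (P - 1) * (Q - 1)
E = 17                         # public exponent
D = pow(E, -1, PHI)            # private exponent: modular inverse (Python 3.8+)

ALICE_PUBLIC = (E, N)          # may be advertised to anyone
ALICE_PRIVATE = (D, N)         # known only to Alice

def apply_key(value, key):
    """RSA core operation: raise value to the key's exponent, modulo N."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

def encrypt_for_alice(message):
    """Bob encrypts each byte with Alice's public key."""
    return [apply_key(byte, ALICE_PUBLIC) for byte in message]

def alice_decrypt(blocks):
    """Alice recovers the bytes with her private key."""
    return bytes(apply_key(block, ALICE_PRIVATE) for block in blocks)

if __name__ == "__main__":
    ciphertext = encrypt_for_alice(b"card number")
    print(ciphertext)
    print(alice_decrypt(ciphertext))
```

Encrypting byte by byte keeps the example short; it is not how a production system would apply RSA.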
Some asymmetric cryptography algorithms also work in reverse (RSA is the most popular). With a reversible algorithm, either the private or public key may be used to encrypt while the other key is used to decrypt. Reversible algorithms are useful to prove identity. Imagine that Bob wishes to prove his identity to Alice. Bob would encrypt some well-known data (of which Alice has a priori knowledge) with his private key and transmit the data to Alice. Alice would decrypt the data using Bob's public key and compare the data. If the data match, then Bob has proved his identity, since only Bob could have sent the original message.
Reversible algorithms may also serve as the basis for digital signatures. In this case, Bob would encrypt data with his private key and transmit that data to Alice. Bob cannot later repudiate the transmission to Alice, because only Bob could have encrypted it with his private key. This scheme is illustrated in Figure 4.
Figure 4: Digital signatures using public keys
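The signing direction can be sketched with the same kind of toy RSA key. In this hedged example Bob signs a digest of the message rather than the message itself, which is a common arrangement but an assumption here; the key values and function names are likewise invented for illustration and are far too small for real use.

```python
import hashlib

# Bob's toy RSA key pair (tiny primes 61 and 53; illustration only).
N = 61 * 53
E = 17                               # Bob's public exponent
D = pow(E, -1, (61 - 1) * (53 - 1))  # Bob's private exponent (Python 3.8+)

def sign(message):
    """Bob transforms each digest byte with his private key."""
    digest = hashlib.sha256(message).digest()
    return [pow(byte, D, N) for byte in digest]

def verify(message, signature):
    """Alice reverses the transform with Bob's public key and compares digests."""
    expected = list(hashlib.sha256(message).digest())
    recovered = [pow(value, E, N) for value in signature]
    return recovered == expected

if __name__ == "__main__":
    order = b"ship 100 units"
    signature = sign(order)
    print(verify(order, signature))               # signature matches the message
    print(verify(b"ship 900 units", signature))   # altered message fails
```

Because only the holder of the private exponent can produce a signature that verifies, a matching result ties the message to Bob.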
Asymmetric encryption schemes are not inherently more or less secure than symmetric encryption schemes. They simply put forth a more feasible model for key distribution. The strongest determinants for the strength of a cipher are the length of the key and the processing power required to break the cipher. Any cipher can be broken, given enough time and money. A cipher may be considered computationally secure if:
- The time to break the cipher exceeds the useful lifetime of the encrypted data.
- The cost to break the cipher exceeds the perceived value of the encrypted data.
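To get a feel for the first criterion, the following back-of-the-envelope sketch estimates how long an exhaustive key search would take for different key lengths. The search rate of one billion keys per second is purely an assumed figure for illustration, not a measurement of any real attacker.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_search(key_bits, keys_per_second):
    """Average exhaustive-search time: half the keyspace must be tried."""
    average_trials = 2 ** key_bits / 2
    return average_trials / keys_per_second / SECONDS_PER_YEAR

def outlives_data(key_bits, keys_per_second, data_lifetime_years):
    """True if the estimated search time exceeds the data's useful lifetime."""
    return years_to_search(key_bits, keys_per_second) > data_lifetime_years

if __name__ == "__main__":
    ASSUMED_RATE = 1e9  # hypothetical attacker speed, keys per second
    for bits in (40, 56, 128):
        print(bits, "bits:", years_to_search(bits, ASSUMED_RATE), "years")
```

Under this assumed rate a 56-bit key falls in roughly a year, while each additional key bit doubles the expected effort.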
A major disadvantage of asymmetric encryption, as compared to symmetric encryption, is the processing time encryption requires. For example, the RSA encryption algorithm, which is probably the most widely deployed on the Internet, is based on the difficulty in factoring the product of two very large prime numbers. Symmetric encryption is generally an order of magnitude faster and places a small burden on the processor.
For this reason, security protocols such as SSL/TLS provide mechanisms for secure secret key exchange. Often, the secret keys are communicated through the use of public key encryption. A common example is a secret key generated by Bob, encrypted with Alice's public key and then transmitted to Alice. Because only Alice can decrypt the message, sending the secret key over the wire does not compromise the security of that key. Once Alice receives the message, any further communications may employ a faster cipher, such as DES or one of its variants.
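The exchange just described can be sketched as follows. This is a toy under stated assumptions: the RSA key uses tiny primes, and because DES is not in the Python standard library, a simple hash-based XOR keystream stands in for the fast symmetric cipher. All names are hypothetical and none of this is suitable for real traffic.

```python
import hashlib
import os

# Alice's toy RSA key pair (tiny primes 61 and 53; illustration only).
N, E, D = 3233, 17, 2753

def wrap_key(secret_key):
    """Bob encrypts the secret key, byte by byte, with Alice's public key."""
    return [pow(byte, E, N) for byte in secret_key]

def unwrap_key(wrapped):
    """Alice recovers the secret key with her private key."""
    return bytes(pow(value, D, N) for value in wrapped)

def symmetric_cipher(secret_key, data):
    """Stand-in symmetric cipher: XOR data with a hash-derived keystream.

    Applying it twice with the same key restores the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(secret_key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

if __name__ == "__main__":
    bobs_key = os.urandom(16)                  # Bob generates the secret key
    alices_key = unwrap_key(wrap_key(bobs_key))
    ciphertext = symmetric_cipher(bobs_key, b"application data")
    print(symmetric_cipher(alices_key, ciphertext))
```

The slow public-key operation is used once, to move the secret key; everything afterwards uses the cheaper symmetric operation.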
Message digests are algorithms that take a sequence of bits of arbitrary length and output a fixed-length sequence of bits. The output sequence is representative of the contents of the input sequence. These algorithms help protect the integrity of the message: change one input bit and the output will, with overwhelming probability, change as well. Message digests may also verify identity when the input is combined with a secret key known only to the sender and receiver. The two most widely used message digest algorithms are Message Digest 5 (MD5) and Secure Hash Algorithm 1 (SHA-1). SHA-1 produces a longer digest (160 bits versus 128 bits for MD5) and is generally considered more secure than MD5.
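A short sketch using Python's standard hashlib module shows the fixed output length and the sensitivity to a single changed bit. The helper names are invented for this example.

```python
import hashlib

def md5_digest(message):
    return hashlib.md5(message).digest()

def sha1_digest(message):
    return hashlib.sha1(message).digest()

def flip_lowest_bit(message):
    """Return a copy of message with one bit of its first byte inverted."""
    return bytes([message[0] ^ 0x01]) + message[1:]

if __name__ == "__main__":
    original = b"Transfer $100 to account 42"
    altered = flip_lowest_bit(original)
    for name, digest in (("MD5", md5_digest), ("SHA-1", sha1_digest)):
        print(name, len(digest(original)) * 8, "bits")
        print("  original:", digest(original).hex())
        print("  altered: ", digest(altered).hex())
```

Whatever the input length, MD5 yields 16 bytes and SHA-1 yields 20 bytes.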
Digest algorithms are used to calculate a message authentication code (MAC) for the message payload. A MAC acts like a secure checksum. MACs are calculated by using one of the digest algorithms with some combination of a shared secret key and the message payload as the input sequence. MACs provide protection against an active network attack, such as modification or falsification of packets. MACs can be used to guarantee integrity (but not identity) when encryption is not used. Some protocols, such as SSL/TLS, use two MACs (both MD5 and SHA-1). The rationale for using both is that if one is broken, the message is still protected by the other.
A MAC may also be used as a digital signature. Assuming that both the sender and receiver share the same secret key used in the MAC calculation, consider the case when the sender includes the MAC along with the message payload. The receiver can also calculate the MAC using the same shared secret key. If the two MAC values match, then only the sender could have transmitted this message.
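One concrete way to combine a shared secret key with the payload is the HMAC construction, available in Python's standard hmac module. The sketch below assumes HMAC with SHA-1 as the digest; the text above does not prescribe a particular construction, and the function names are made up for illustration.

```python
import hashlib
import hmac

def compute_mac(shared_key, payload):
    """Sender: derive a MAC from the shared secret key and the payload."""
    return hmac.new(shared_key, payload, hashlib.sha1).digest()

def check_mac(shared_key, payload, received_mac):
    """Receiver: recompute the MAC and compare in constant time."""
    expected = compute_mac(shared_key, payload)
    return hmac.compare_digest(expected, received_mac)

if __name__ == "__main__":
    key = b"secret shared by Bob and Alice"
    payload = b"set thermostat to 20"
    mac = compute_mac(key, payload)
    print(check_mac(key, payload, mac))                     # untouched message
    print(check_mac(key, b"set thermostat to 90", mac))     # modified in transit
```

An attacker who alters the payload cannot produce a matching MAC without the shared key.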
Public keys and certificates
Public-key encryption schemes facilitate three types of network security services: encryption and decryption, digital signatures, and key exchange. This article has not yet addressed the issue of key management. In particular, the following questions should be answered:
- Who generates these key pairs?
- Where are they stored?
- Who is trustworthy enough to guarantee that the generator of these keys is who it asserts it is?
The first question is easily answered. The computer that wants to assert its identity generates keys. For example, Web browsers and some e-mail programs contain software that allows these key pairs to be generated. These keys are then stored on the local computer. Once generated, the public key may be advertised. Protocols such as SSL/TLS contain provisions for network nodes to transmit their public keys upon request.
This scheme is not particularly secure. If any computer can generate key pairs, then authentication is not possible. Malicious users could easily generate key pairs to masquerade as a third party.
Certificate authorities (CAs) are the solution to this problem. When a key pair is created, the public key is submitted to a CA. This authority is responsible for validating the identity and credentials of the person or organization that submits the key. Levels of validation range from verifying name and e-mail address only to verifying the identity of the submitter through a credit card or credit report. VeriSign is the best-known CA.
Once a CA has validated the identity of the submitter, it issues a certificate. Certificates enable a trust relationship by establishing the identities of one or both parties in a network transaction. The certificate is digitally signed by the CA. It follows that if the CA is trusted, then the certificate may be trusted, and the public key contained within the certificate may be trusted. Digital certificates come in many formats, but the one of interest to the major Internet security protocols described in this article is X.509, an ITU specification. An X.509 certificate contains:
- The submitter's public key
- The name of the issuer (CA)
- The certificate's lifetime
- The digital signing algorithm
- The digital signature of the issuer
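The fields listed above can be pictured as a simple record. The class below is a hypothetical summary for illustration only; its name and attribute names are invented here, and a real X.509 certificate is an ASN.1 structure carrying further fields (such as the subject name and a serial number) that would normally be read with a TLS toolkit rather than by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class CertificateSummary:
    """Hypothetical flattened view of the certificate fields named above."""
    subject_public_key: bytes     # the submitter's public key
    issuer_name: str              # the CA that issued the certificate
    not_before: datetime          # start of the certificate's lifetime
    not_after: datetime           # end of the certificate's lifetime
    signature_algorithm: str      # the digital signing algorithm
    issuer_signature: bytes       # the CA's signature over the contents

    def is_within_lifetime(self, when):
        """A certificate outside its lifetime should not be trusted."""
        return self.not_before <= when <= self.not_after

if __name__ == "__main__":
    cert = CertificateSummary(
        subject_public_key=b"\x00" * 16,          # placeholder bytes
        issuer_name="Example CA",                 # made-up issuer
        not_before=datetime(2000, 1, 1, tzinfo=timezone.utc),
        not_after=datetime(2001, 1, 1, tzinfo=timezone.utc),
        signature_algorithm="md5WithRSAEncryption",
        issuer_signature=b"\x00" * 16,            # placeholder bytes
    )
    print(cert.is_within_lifetime(datetime(2000, 6, 1, tzinfo=timezone.utc)))
```

Checking the lifetime is only one step; the receiver must also verify the issuer's signature using the CA's public key.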
SSL and TLS
The Secure Sockets Layer (SSL) protocol was developed by Netscape and first used in its Navigator product in 1995. The Internet Engineering Task Force (IETF) assumed stewardship of the protocol, renamed it Transport Layer Security (TLS), and eventually published it as RFC 2246. While SSL and TLS are nearly identical as networking protocols, some of the encryption and MAC algorithms are slightly different. This article will concentrate on the protocol aspects and, for simplicity, call them by the name TLS.
TLS contains provisions for:
- Algorithm negotiation
- Encryption and decryption
- Message authentication (via MACs)
- Key exchange
- Digital signatures
Figure 5: Protocol stack with SSL/TLS
As shown in Figure 5, TLS must be layered on top of some reliable connection protocol. In practice, this reliable connection protocol is TCP. (The WAP protocol suite has a version named Wireless Transport Layer Security, or WTLS, that runs over datagrams.) Higher-level protocols, such as HTTP and SMTP, which are normally run directly over TCP, may also be run over TLS. Internet drafts are now available for using TLS with telnet, FTP, Kerberos, and a few other protocols.
TLS toolkits generally attempt to closely mimic a sockets programming interface. The sockets-like API eases the programming issues for the implementers of the higher-level protocols (such as HTTP).
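As one illustration of a sockets-like interface, the sketch below uses the ssl module from Python's standard library, which wraps an ordinary TCP socket and then exposes the familiar send and receive calls. It is not one of the toolkits discussed in this article, and the helper names, the port, and the timeout are assumptions for this example.

```python
import socket
import ssl

def make_client_context():
    """Client-side settings: verify the server certificate and host name."""
    return ssl.create_default_context()

def exchange_over_tls(host, request, port=443):
    """Open TCP, layer TLS on top, then use the socket as usual."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(request)     # same call as on a plain socket
            return tls_sock.recv(4096)    # data arrives already decrypted
```

A caller might use `exchange_over_tls("www.example.com", b"HEAD / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")`; the higher-level protocol code is nearly identical to the unsecured version.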
In practice, two scenarios exist for securing a protocol over TLS. In the first scenario, a different port is used for secure traffic. The higher-level protocol must listen on its normal port as well as its secure port for network messages. In the second scenario, the higher-level protocol must differentiate between normal traffic and secure traffic on its regular port. HTTP uses the first strategy. When a URL begins with “https://”, the client (a Web browser) connects to the HTTP server's secure port (443 by default, rather than 80), and a secure channel is set up between them.
TLS connections are always role-based. The initiator of the session is always the client. Figure 6 illustrates a sample TLS session that is based on the RSA cipher suites; the variations for other cipher suites are minor.
Figure 6: TLS message exchange diagram
In Step 1, the client opens the TLS connection by sending a “hello” message. This message contains the protocol version number (3.0 for SSL and 3.1 for TLS) and the client-supported types of ciphers, MACs, and compression. (Currently, no standard compression schemes are defined.) The client also sends a sequence of random numbers that is used to generate the master secret. The master secret will be examined later.
In Step 2, the server sends the TLS version it supports and selects one of the cipher/MAC options presented in Step 1. The server also sends a sequence of random numbers that is used to generate the master secret.
In Step 3, the server sends its certificate, which includes the public key of the server. In Step 4, the server informs the client that secret key generation may begin.
As discussed earlier, asymmetric encryption entails a larger computational load than symmetric encryption. That is why protocols such as SSL/TLS and IPSec use symmetric encryption for the application data. The next step for the client is to generate the premaster secret, yet another sequence of random data. The premaster secret is encrypted with the server's public key and sent over the wire in Step 5. At this point both the server and the client generate the master secret. The master secret is a sequence of data that is partitioned to provide the secret keys for client and server encryption as well as the secret keys used for client and server MAC calculations. The master secret never travels over the wire. The master secret is generated from both the client and server random values, as well as the premaster secret. Once the master secret is calculated, secure communications may begin.
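The derivation can be sketched as below. This follows my reading of the pseudorandom function in RFC 2246 (an MD5-based and a SHA-1-based expansion XORed together), but it is a simplified illustration: the function names are invented, the key sizes are assumed values, initialization vectors are left out, and it has not been checked against official test vectors.

```python
import hashlib
import hmac
import os

def p_hash(hash_name, secret, seed, length):
    """Expand secret and seed into `length` bytes by chaining HMAC outputs."""
    output = b""
    a = seed                                             # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()      # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hash_name).digest()
    return output[:length]

def prf(secret, label, seed, length):
    """XOR an MD5 expansion of one half of the secret with a SHA-1 expansion of the other."""
    half = (len(secret) + 1) // 2
    first_half, second_half = secret[:half], secret[len(secret) - half:]
    md5_bytes = p_hash("md5", first_half, label + seed, length)
    sha1_bytes = p_hash("sha1", second_half, label + seed, length)
    return bytes(m ^ s for m, s in zip(md5_bytes, sha1_bytes))

def master_secret(premaster, client_random, server_random):
    """Both sides compute this locally; it never travels over the wire."""
    return prf(premaster, b"master secret", client_random + server_random, 48)

def partition_keys(master, client_random, server_random, mac_len=20, key_len=24):
    """Split expanded keying material into per-direction MAC and cipher keys."""
    total = 2 * (mac_len + key_len)
    block = prf(master, b"key expansion", server_random + client_random, total)
    names = ["client_mac", "server_mac", "client_key", "server_key"]
    sizes = [mac_len, mac_len, key_len, key_len]
    keys, offset = {}, 0
    for name, size in zip(names, sizes):
        keys[name] = block[offset:offset + size]
        offset += size
    return keys

if __name__ == "__main__":
    client_random, server_random = os.urandom(32), os.urandom(32)
    premaster = bytes([3, 1]) + os.urandom(46)   # version bytes plus random data
    master = master_secret(premaster, client_random, server_random)
    print({name: key.hex() for name, key in
           partition_keys(master, client_random, server_random).items()})
```

Because the client and server feed identical inputs into the same function, they arrive at the same keys without ever transmitting them.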
In Step 6, the client informs the server that all outgoing traffic from the client will henceforth be encrypted using the agreed-upon encryption suite and independently generated keying material. In Step 7, the client informs the server that the client's portion of the initial connection handshake is complete. As forewarned in Step 6, the Finished message in Step 7 is encrypted.
Step 8 permits the server to inform the client that all outgoing traffic originating from the server will now be encrypted. In Step 9, the server sends an encrypted Finished message to the client to indicate that the server's portion of the initial connection handshake is complete.
Steps 10 and 11 contain the encrypted application layer traffic passing between the client and server. At some point, either the client or server may wish to end the connection. If so, an explicit Alert message is transmitted.
Figure 7: TLS record layer
The format of TLS messages is rather simple. As illustrated in Figure 7, four types of messages are possible: Handshake, Alert, Change Cipher Spec, and Application Data.
Figure 8: TLS record format
Each transmission across the wire is broken into a series of one or more TLS records. The format of a TLS record is shown in Figure 8.
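A record can be picked apart with a few lines of code. The layout assumed here (a one-byte content type, a two-byte protocol version, and a two-byte length, followed by the payload) and the numeric type codes come from my reading of RFC 2246 rather than from the figure, and the function name is invented for this sketch.

```python
import struct

# Content type codes as assumed from RFC 2246.
CONTENT_TYPES = {
    20: "Change Cipher Spec",
    21: "Alert",
    22: "Handshake",
    23: "Application Data",
}

HEADER_LEN = 5  # type (1 byte) + version (2 bytes) + length (2 bytes)

def parse_record(data):
    """Split one TLS record into (type name, version tuple, payload bytes)."""
    content_type, major, minor, length = struct.unpack(">BBBH", data[:HEADER_LEN])
    payload = data[HEADER_LEN:HEADER_LEN + length]
    if len(payload) != length:
        raise ValueError("record is truncated")
    return CONTENT_TYPES.get(content_type, "unknown"), (major, minor), payload

if __name__ == "__main__":
    # A made-up handshake record carrying five payload bytes, version 3.1 (TLS).
    sample = struct.pack(">BBBH", 22, 3, 1, 5) + b"hello"
    print(parse_record(sample))
```

After the handshake completes, the payload of each record is encrypted and carries its MAC, so only the header remains readable on the wire.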
The U.S. government has traditionally restricted the strength of some of the ciphers for exported products. These restrictions were substantially relaxed in January 2000, although some restrictions still exist for products exported to embargoed nations, such as Cuba and Iraq.
Several commercial and open-source implementations exist for TLS solutions. Commercial C language implementations are available from RSA Security, Certicom, and SPYRUS. The most popular C language open-source version is OpenSSL (www.openssl.org), but others exist as well. The open-source versions are usually tailored towards integration with the Apache webserver.
Many network protocols have existing or proposed security extensions.
HTTP has two methods of authentication, basic and digest. In both, the server refuses to serve a request until the client authenticates itself through a name/password pair. When acting as the HTTP client, a Web browser will display a dialog box to allow the user to type a name/password combination. The request is then resubmitted to the server.
Basic authentication does not encrypt the name/password pair as it goes over the wire, although it does encode it in the base64 encoding system. In digest authentication, when the server refuses to serve a request, it provides a nonce to the client. A nonce is a one-time-use piece of data. Once the client has obtained the name/password pair, it performs a hash of the username, the password, the nonce, the HTTP method, and the requested URL. The client returns only the hash value to the server. In this way, the password is never sent in cleartext. Though defined for several years, digest authentication is only now being more widely implemented.
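The difference between the two methods can be sketched as follows. The digest calculation shown is the simplest form from the HTTP authentication specification (RFC 2617) as I understand it; it also mixes in the server-supplied realm, which the description above does not mention. The function names are invented for this example.

```python
import base64
import hashlib

def basic_credentials(username, password):
    """Basic: base64 encoding only, trivially reversible by an eavesdropper."""
    return base64.b64encode(f"{username}:{password}".encode()).decode("ascii")

def md5_hex(text):
    return hashlib.md5(text.encode()).hexdigest()

def digest_response(username, realm, password, nonce, method, uri):
    """Digest: only a hash involving the password and nonce is sent."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # identity and password
    ha2 = md5_hex(f"{method}:{uri}")                  # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")

if __name__ == "__main__":
    token = basic_credentials("alice", "s3cret")
    print(token, "->", base64.b64decode(token))       # password is recoverable
    print(digest_response("alice", "device", "s3cret",
                          "abc123", "GET", "/status"))
```

Because the nonce changes from challenge to challenge, a captured digest response cannot simply be replayed against a fresh challenge.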
The key point
This article has described the basics of Internet security protocols and provided an examination of TLS, one of the most widely deployed protocols. While we didn't examine the mathematics that form the basis of cryptography, the algorithms are well-documented and realizable in software, although computationally intensive. Despite the complexity of the ciphers, most of the framework surrounding an Internet security protocol deals with key management. Key management is a challenge for all of these protocols, especially for embedded solutions. Even so, the technologies exist to protect your devices' network traffic from Internet-savvy malefactors.
Steve Kapp is the chief technologist for EMRT Consultants. He has 14 years of software and systems development experience, focused mainly on embedded imaging systems. Steve invites contact and may be reached at .
Rescorla, Eric. SSL and TLS: Designing and Building Secure Systems. New York: Addison-Wesley, 2000.
Stallings, William. Network Security Essentials, Applications and Standards. Englewood Cliffs, NJ: Prentice Hall, 2000.
Menezes, Alfred, Paul van Oorschot, and Scott Vanstone. Handbook of Applied Cryptography: www.cacr.math.uwaterloo.ca/hac/
RFC 2246: The TLS Protocol Version 1.0