CMP EMBEDDED.COM

Login | Register     Welcome Guest  
HOME DESIGN PRODUCTS COLUMNS E-LEARNING CONFERENCES CODE FORUMS/BLOGS NEWSLETTERS CONTACT FEATURES RSS RSS

Implementation of the AES algorithm on Deeply Pipelined DSP/RISC Processor
A more efficient implementation of the Advanced Encryption Standard algorithm on a deeply pipelined RISC/DSP engine reduces overall pipeline stalls during its execution.



Embedded.com
In the modern digital computer world, cryptography algorithms play an important role in securing important data. In early days, cryptography was used to protect classified information in government and military applications. Now, with Internet access is a common commodity, government agencies, various industries ranging from Hollywood to corporate, financial institutions and universities make widespread use of cryptography algorithms in everything from storage management to web based online transaction processing, etc.

Various cryptography algorithms are used in practice, depending on the application type and the required level of security. The block diagram of a general secured communication system is shown in Figure 1 below.

Figure 1: Communication security modules (a) encryption (b) decryption

For communications security, plan text is encrypted using a secret (or shared) key at the transmitter side and then decrypting the ciphered text at the receiver side using the same key. In this secured communication system, tapping the data by adversaries is almost impossible unless he/she gets the secret key pattern. In the present day, an embedded processor is commonly used to implement the physical layer of such a secured communication system.

Advanced Encryption Standard (AES) is the latest data security standard adopted worldwide in most of the public and private sector for secure data communications and data storage purposes. The AES algorithm is a symmetric key algorithm, standardized by the National Institute of Science and Technology (NIST).

The AES standard [1] specifies the Rijndael algorithm that can process data blocks of 128 bits at a time using the key sizes of 128, 192 or 256 bits length. AES encryption and decryption are based on four different transformations that are applied repeatedly in a certain sequence on input data. The AES standard also specifies a key expansion module to supply keys for multiple iterations of AES algorithm. Depending on input key length, the number of iterations (or complexity) of the AES algorithm will vary.

Since its standardization in 2001, much research has been done to efficiently implement the AES algorithm both in software and hardware. In [2] and [3], various techniques are proposed to implement AES on embedded processors with limited resources.

In [4], an efficient implementation of AES on 32-bit platforms is discussed. In this paper, we discuss a more efficient implementation of AES for use on deeply pipelined embedded processors, keeping in mind the possibility of processor pipeline stalls and stringent memory requirements.

A review of the current AES algorithm
The data flow diagram of the AES algorithm is shown in Figure 2 below. The main transformations in the AES Rijndael's algorithm are:

1) AddRoundKey (AR),
2) SubstiteBytes (SB),
3) ShiftRows (SR) and
4) MixColumns (MC).

All these transformations work on a matrix called 'State' that is formed with byte elements (or 8-bit data units) using the input data of 128-bits. AES State is updated in multiple iterations using the above transformations. The key expansion (KE) module expands the given key for supplying the keys to all iterations of the AES engine.

The number of times the State is iterated in a loop of the AES algorithm depends on the chosen key length (Nk). For example, if we choose the key length of 128 bits (i.e. Nk=4 32-bit words), then we iterate the data (Nr-1) times, where Nr = Nb + Nk + 2 and Nb= 4.

In the AES algorithm, the parameter Nb(= 4) corresponds to the number of rows of State. In the AES cipher (or encryption) process, the AR transformation is applied before the start of the loop. The transformations present within the AES cipher loop are SB, SR, MC and AR.

Also, the transformations SB, SR and AR are applied after the loop, but before outputting the cipher text. Since the AES algorithm works on byte elements, we form the AES State of 4x4 bytes size by filling the 4x4 matrix column wise using the input data of 128 bits. The key expansion module (KE) expands the input Nk words to (Nr+1)*Nb words. (See [1] for more details of AES key expansion process.)

In the Add Round key transformation (AR), we add 16 bytes (or 4 words) of the expanded key to 16 bytes of AES State whenever AR is called. In the Substitute Bytes (SB) transformation, we replace each byte of AES State with the substitute byte look-up table S-Box elements using State bytes as an offset to the look-up table. Refer to [1] for substitute byte look-up table values.

In the Shift Rows (SR) transformation, we rotate 4x4 State rows (except the first row) to the left by the row index number of bytes (i.e. rotate left first row by 1 byte, second row by 2 bytes and third row by 3 bytes). In the Mix Columns (MC) transformation, we multiply the State with the MC matrix in Galois field GF(28).

Figure 2: Flow diagram of AES Encryption engine

In the inverse cipher (or decryption) process, we take cipher text as input and perform all inverse operations (i.e. Inverse Substitute Bytes (ISB), Inverse Shift Rows (ISR), Inverse Add Round key (IAR) and Inverse Mix Columns (IMC) transformations) of the cipher transformation to get the plain text back.

We use different look-up table values for implementing SB and ISB (see [1]). The way we add the round keys to State is the same except that in AR we get keys indexing from start of the buffer whereas in IAR we get keys from the end of expanded key buffer.

We rotate the State row elements to the left in SR whereas in ISR we rotate the State rows to the right. We use the following matrices A and B to multiply the State elements with MC and IMC transformations.

The matrices A and B contain the same four basic elements in each column or in each row. With respect to first column or row the other columns or rows elements are relatively rotated.

1 | 2 | 3

Rate this article: Low High
Current rating
  • .
Embedded.com Career Center
Ready for a change?
SEARCH JOBS

Browse all jobs

SPONSOR
RECENT JOB POSTINGS





 :