CMP EMBEDDED.COM


Understanding Crypto Performance in Embedded Systems: Part 2
Standards & Industry Practices for Measuring Cryptographic Performance



Specific Results
Having defined the hardware and software sources of variation, and the testing methodologies that help produce normalized results, we now look at specific IPsec measurements on Freescale PowerQUICC processors and at how these results illustrate some of the concepts discussed previously.

As the PowerQUICC performance graphs in this part of the series show, system performance is limited by the CPU for small packets and by the Freescale Integrated Security Engine (SEC) for large packets, with memory bus performance affecting throughput equally at all packet sizes.
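The two-regime behavior described above can be sketched as a simple bottleneck model. This is only an illustration: the function name and both limit values are hypothetical and are not taken from the measurements in this article. It assumes the CPU imposes a fixed packets-per-second ceiling and the crypto engine imposes a fixed bits-per-second ceiling.

```python
def modeled_throughput_mbps(packet_bytes, cpu_pps_limit, sec_mbps_limit):
    """Return the throughput allowed by the tighter of two limits.

    cpu_pps_limit: packets per second the CPU can handle (per-packet cost).
    sec_mbps_limit: megabits per second the crypto engine can process.
    """
    cpu_bound_mbps = cpu_pps_limit * packet_bytes * 8 / 1e6
    return min(cpu_bound_mbps, sec_mbps_limit)


# Hypothetical limits, chosen only to show the shape of the curve.
CPU_PPS = 50_000
SEC_MBPS = 300.0

for size in (64, 256, 512, 1024, 1456):
    mbps = modeled_throughput_mbps(size, CPU_PPS, SEC_MBPS)
    print(f"{size:5d} bytes: {mbps:6.1f} Mbps")
```

With these made-up limits the curve rises linearly with packet size while the CPU is the bottleneck, then flattens once the crypto engine limit is reached. A memory bus penalty could be modeled as a common scale factor on both terms.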

Publicly available performance curves from competing products demonstrate that the same fundamentals are at play in their devices, and the superior IPsec performance of PowerQUICC (especially when paired with optimized IPsec stacks such as Mocana's NanoSec) results from a combination of higher performance CPUs, more efficient single-pass crypto acceleration cores and wider, faster, effectively pipelined buses.

All performance graphs were measured with a Smartbits SMB600 as both packet generator and packet counter. The Smartbits Terametrics module generates clear IPv4 packets at maximum rate and transmits them to one of the Ethernet ports of PowerQUICC board 1.

PowerQUICC board 1 classifies the packet as belonging to an IPsec session and performs ESP tunneling encapsulation using 3DES-HMAC-SHA-1 before forwarding it to one of the Ethernet ports of PowerQUICC board 2.
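To give a feel for what tunnel encapsulation adds to each packet, the sketch below computes the size of an ESP tunnel-mode packet for a given clear packet. It rests on assumptions the article does not state: an outer IPv4 header without options (20 bytes), 3DES in CBC mode (8-byte IV and block size), the 96-bit truncated HMAC-SHA-1 integrity value commonly used in IPsec (12 bytes), and minimal padding. The function name is hypothetical.

```python
def esp_tunnel_size(inner_len, block=8, iv_len=8, icv_len=12,
                    outer_ip=20, esp_hdr=8):
    """Size in bytes of an ESP tunnel-mode packet carrying inner_len bytes.

    Assumed layout: outer IPv4 header, ESP header (SPI + sequence number),
    IV, inner packet, padding, pad-length and next-header bytes, ICV.
    """
    trailer_fixed = 2  # pad-length byte + next-header byte
    # Pad so that inner packet + padding + trailer fills whole cipher blocks.
    pad = (-(inner_len + trailer_fixed)) % block
    return outer_ip + esp_hdr + iv_len + inner_len + pad + trailer_fixed + icv_len


for inner in (46, 64, 1400):
    outer = esp_tunnel_size(inner)
    print(f"clear {inner:4d} bytes -> ESP {outer:4d} bytes "
          f"(+{outer - inner} bytes)")
```

Under these assumptions the overhead is between 50 and 57 bytes per packet, which is proportionally far larger for small packets than for large ones.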

PowerQUICC board 2 classifies the packet as belonging to an IPsec ESP session to be terminated on board 2 and decapsulates/decrypts the packet before forwarding a clear IPv4 packet back to the Smartbits machine.

Unless otherwise noted, all performance numbers shown reflect the aggregate bi-directional IPsec packet forwarding rate of each PowerQUICC device. 3DES-HMAC-SHA-1 was selected as the ciphersuite for this measurement because it is still the most commonly used IPsec ciphersuite.

It is also the worst-case algorithm combination for the SEC and probably for other crypto-accelerators. The system-level performance difference between 3DES-HMAC-SHA-1 and AES-HMAC-SHA-1 is negligible in PowerQUICC for all but the largest packets.

Benchmark measurements on PowerQUICC II Pro platforms
The PowerQUICC II Pro MPC83xx series of integrated communications processors represents the low end of the PowerQUICC product line. These devices use the e300 Power Architecture processor core at frequencies up to 667 MHz.

Some members of the MPC83xx series use reduced-feature versions of the SEC core (accelerating 3DES, AES, HMAC, MD5 and SHA-1); others have full-featured SECs, which additionally accelerate public key operations, ARC-4, and random number generation.

Measurement Configuration #1. Figure 4 below shows results for the first configuration, built around the MPC8313E PowerQUICC II Pro integrated communications processor with the 32-bit e300 Power Architecture core and SEC 2.2, with the following configuration parameters:

MPC8313E RDB
e300 core at 333 MHz, DDR at 333 MHz data rate, and SEC at 166 MHz
OS: Linux 2.6.21
IPsec stacks: StrongSwan, OpenSwan, Mocana NanoSec, all running 3DES-HMAC-SHA-1

The chart shows the Mocana NanoSec IPsec stack (http://mocana.com/NanoSec.html) as having higher throughput at all packet sizes. The Mocana performance advantage over OpenSwan is relatively constant at 1.7x, while the advantage over StrongSwan starts at 1.6x but grows to 2.2x as packet size increases.

Both OpenSwan and Mocana operate asynchronously, while StrongSwan processes packets one at a time and waits for the SEC to complete processing before continuing. At small packet sizes, StrongSwan's greater efficiency and its use of polling, which avoids SEC interrupts, allow it to slightly outperform OpenSwan, but OpenSwan overtakes StrongSwan at medium packet sizes.
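The crossover between a synchronous, polling stack and an asynchronous, interrupt-driven one can be illustrated with a toy timing model. All names and timing constants below are hypothetical and do not describe any of the stacks measured here. The model assumes a synchronous stack pays CPU time plus crypto time for every packet, while an asynchronous stack overlaps the two but pays an extra interrupt cost per packet.

```python
def sec_time_us(packet_bytes, setup_us=2.0, us_per_byte=0.03):
    """Hypothetical crypto-engine time for one packet, in microseconds."""
    return setup_us + packet_bytes * us_per_byte


def sync_pps(cpu_us, sec_us):
    """Synchronous stack: the CPU waits for the engine on every packet."""
    return 1e6 / (cpu_us + sec_us)


def async_pps(cpu_us, sec_us, irq_us):
    """Asynchronous stack: CPU and engine overlap, so the slower one sets
    the pace, but each packet also costs the CPU an interrupt."""
    return 1e6 / max(cpu_us + irq_us, sec_us)


CPU_US = 20.0  # hypothetical per-packet stack processing time
IRQ_US = 5.0   # hypothetical per-packet interrupt overhead

for size in (64, 128, 512, 1024):
    s = sec_time_us(size)
    print(f"{size:5d} bytes: sync {sync_pps(CPU_US, s):8.0f} pps, "
          f"async {async_pps(CPU_US, s, IRQ_US):8.0f} pps")
```

With these constants the synchronous model is slightly ahead at 64 bytes, because the crypto time it waits for is shorter than the interrupt cost it avoids. The asynchronous model pulls ahead as packets grow and the wait for the engine lengthens.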

Figure 4. MPC8313E IPsec Performance

Measurement Configuration #2. Figure 5 below shows the security performance of a measurement platform containing the MPC8323E, based on the 32-bit e300 Power Architecture core and SEC 2.2, with the following parameters:

MPC8323E RDB
e300 core at 333 MHz, DDR at 266 MHz data rate, and SEC at 133 MHz
OS: Linux 2.6.20.6
IPsec stacks: NetKey, StrongSwan, OpenSwan, Mocana NanoSec, all running 3DES-HMAC-SHA-1

Figure 5. MPC8323E IPsec Performance

The chart provides a complete comparison of the most popular open source IPsec stacks and the Mocana stack running on the same Linux kernel version. NetKey performance with both hardware and software encryption is included in the comparison.

As on the MPC8313E in Configuration #1, the Mocana NanoSec IPsec stack has the highest throughput at all packet sizes. Also as before, StrongSwan slightly outperforms OpenSwan at the smallest packet sizes, but then falls behind as the synchronous API to the SEC blocks the processor from doing other work.

The Mocana advantage over OpenSwan is approximately 1.9x at small-medium packet sizes. However, at larger packet sizes, Mocana hits the 200 Mbps link limit (bidirectional testing with two 10/100 fast Ethernet interfaces), allowing OpenSwan to appear to close the gap.
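The effect of the link limit on the apparent advantage can be shown with a little arithmetic. The sketch below uses the 200 Mbps limit and the 1.9x ratio from the text, but the uncapped throughput figures are hypothetical and are not read from Figure 5. It assumes the measured rate is simply the lower of what the stack could deliver and what the links can carry.

```python
LINK_LIMIT_MBPS = 200.0  # aggregate bidirectional limit stated in the text


def apparent_ratio(fast_mbps, slow_mbps, link=LINK_LIMIT_MBPS):
    """Ratio of measured throughputs when both are capped by the link."""
    return min(fast_mbps, link) / min(slow_mbps, link)


TRUE_ADVANTAGE = 1.9  # ratio from the text at small-medium packet sizes

# Hypothetical uncapped throughput of the slower stack at growing packet sizes.
for slow in (50.0, 100.0, 150.0, 180.0):
    fast = TRUE_ADVANTAGE * slow
    print(f"slow {slow:5.1f} Mbps, fast uncapped {fast:5.1f} Mbps, "
          f"apparent advantage {apparent_ratio(fast, slow):.2f}x")
```

Once the faster stack saturates the links, the measured ratio shrinks even though the underlying difference in efficiency has not changed.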

NetKey is slightly more efficient than OpenSwan when performing software encryption. However, the native Linux Crypto API's dual-pass, synchronous interface to the SEC creates such high overhead that NetKey with hardware acceleration is slower than the other stacks, and at the smaller packet sizes it is even slower than NetKey with software encryption.
