Using the application modeling and mapping methodology for system-level performance analysis
The performance analysis methodology is based on Application Task Mapping (ATM), a new solution for system-level performance analysis that Synopsys has added to its Platform Architect product line (formerly from CoWare). ATM enables the rapid creation of an executable system model that collects performance metrics such as throughput, latency, and resource utilization. This lets the system architect evaluate the feasibility of the SoC architecture and the application partitioning.
On top of the ATM capabilities, we at NXP have developed a methodology called Application Modeling and Mapping (AMM) that applies the Synopsys Platform Architect environment to signal processing applications. This methodology comprises libraries for modeling signal processing tasks and customized analysis views.
Fig 1: AMM methodology overview.
As depicted in figure 1, the AMM methodology covers the following aspects:
- Creation of executable SystemC task graphs representing the Application Use-Cases (AUC);
- Mapping of the AUC models on the HW platform;
- Monitoring of performance metrics of processing elements and communication fabrics;
- Visualizing performance metrics at an appropriate level of detail to identify bottlenecks;
- Correlating the obtained metrics with the processes/channels of the AUC model;
- All in an environment that allows short and fast iterations.
The Application Use-Case (AUC) describes only the subset of the functionality that is needed for accurate modeling of processing time and traffic generation. It is essentially an executable Data Flow Diagram (DFD) inspired by the Hatley & Pirbhai design method. AUC models are based on the AMM library written in SystemC, which consists of the following primitives:
- A Task contains the functional code for the data flow, control flow, computation time and latency constraints. A task can also be hierarchical to manage the complexity of real-world application models;
- A Data channel models memory based communication between tasks. The channel contains a user defined buffer, varying from a single word to multiple video frames;
- A Control channel models event based synchronization between the tasks. The event messages can carry (small) parts of control information.
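To make the three primitives concrete, here is a minimal sketch in plain C++ rather than SystemC. It is not the AMM library: the class names `DataChannel`, `ControlChannel`, `Event` and `Task`, their members, and the per-word cycle annotation are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>

// Hypothetical data channel: memory-based communication through a bounded
// buffer whose capacity is chosen by the user.
class DataChannel {
public:
    explicit DataChannel(std::size_t capacity_words) : capacity_(capacity_words) {
        assert(capacity_words > 0);
    }

    bool empty() const { return buffer_.empty(); }
    bool full() const { return buffer_.size() == capacity_; }
    std::size_t fill_level() const { return buffer_.size(); }

    // Returns false when the buffer is full, so the producer has to stall.
    bool write(std::uint32_t word) {
        if (full()) return false;
        buffer_.push_back(word);
        return true;
    }

    // Returns an empty optional when no data is available.
    std::optional<std::uint32_t> read() {
        if (empty()) return std::nullopt;
        const std::uint32_t word = buffer_.front();
        buffer_.pop_front();
        return word;
    }

private:
    std::size_t capacity_;
    std::deque<std::uint32_t> buffer_;
};

// Hypothetical control channel: event-based synchronization where each
// event carries a small piece of control information.
struct Event {
    std::uint32_t payload;
};

class ControlChannel {
public:
    void notify(Event e) { pending_.push_back(e); }

    std::optional<Event> poll() {
        if (pending_.empty()) return std::nullopt;
        const Event e = pending_.front();
        pending_.pop_front();
        return e;
    }

private:
    std::deque<Event> pending_;
};

// Hypothetical task: forwards words from 'in' to 'out' and is annotated
// with a computation time per word instead of the real signal processing.
struct Task {
    const char* name;
    std::uint64_t cycles_per_word;

    // Forwards words while input is available and output has room, then
    // signals completion with the number of forwarded words as payload.
    // Returns the modeled computation time in cycles.
    std::uint64_t run(DataChannel& in, DataChannel& out, ControlChannel& done) const {
        std::uint64_t cycles = 0;
        std::uint32_t forwarded = 0;
        while (!in.empty() && !out.full()) {
            out.write(*in.read());
            cycles += cycles_per_word;
            ++forwarded;
        }
        done.notify(Event{forwarded});
        return cycles;
    }
};
```

The point of the sketch is that a task only needs enough behavior to produce the right amount of data movement and the right computation time; the buffer capacity of the data channel decides when a producer stalls.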
Hardware platform and virtual processing unit
The HW platform consists of a mix of approximately timed and cycle-accurate SystemC models. It comprises memory hierarchy, interconnect and processing element models, all from the Synopsys SystemC model library. The key element here is the Virtual Processing Unit (VPU), a generic, configurable processing element that executes the SystemC task graph. Any instance of a VPU can act as a shared processor executing multiple application tasks or as a dedicated hardware block (e.g. a Viterbi accelerator).
The VPU models all performance-related aspects of executing the mapped tasks: the processing delay, the scheduling overhead and the generation of realistic bus transactions.
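The three contributions named above can be sketched as a simple accounting model. This is only an illustration in plain C++, not the Synopsys VPU: the types `Job`, `VpuConfig` and `VpuResult`, the function `execute`, the fixed per-job scheduling overhead and the fixed burst size are all assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One activation of a mapped task (hypothetical type).
struct Job {
    std::uint64_t compute_cycles;  // annotated processing delay
    std::uint64_t bytes_accessed;  // data moved over the bus by this job
};

// Hypothetical VPU parameters.
struct VpuConfig {
    std::uint64_t switch_overhead_cycles;  // scheduling cost per dispatched job
    std::uint64_t burst_bytes;             // payload of one bus transaction
};

struct VpuResult {
    std::uint64_t busy_cycles = 0;
    std::uint64_t bus_transactions = 0;
};

// Runs the jobs back to back (non-preemptive, in the given order) and
// accumulates processing delay, scheduling overhead and bus transactions.
VpuResult execute(const VpuConfig& cfg, const std::vector<Job>& jobs) {
    assert(cfg.burst_bytes > 0);
    VpuResult result;
    for (const Job& job : jobs) {
        result.busy_cycles += cfg.switch_overhead_cycles + job.compute_cycles;
        // Round up: a partially filled burst still costs one transaction.
        result.bus_transactions +=
            (job.bytes_accessed + cfg.burst_bytes - 1) / cfg.burst_bytes;
    }
    return result;
}
```

In this sketch a shared processor would be configured with a non-zero `switch_overhead_cycles`, while a dedicated hardware block could be approximated by setting it to zero.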
Mapping
Mapping the AUC model onto a HW platform is realized by assigning the AUC tasks to the respective VPUs and the channels to the respective memories. Channels between tasks assigned to different VPUs need to be mapped to a suitable SoC-level communication and synchronization mechanism. This is realized by Drivers that convert the task-level communication of the AUC model into platform-specific operations (e.g. an interrupt signal for control channels or a DMA transfer for data channels).
Separating the application model from the architecture provides the flexibility required for exploration, e.g. in the case of DAB the flexibility to map one or two concurrent DAB streams onto one platform.
Monitoring and viewing
The SystemC models in the HW platform (VPU, interconnect, memories) are instrumented with analysis monitors to measure the performance of mapped application use cases. The monitors allow:
- Viewing dynamic aspects by recording detailed traces and aggregating statistical performance information over a configurable analysis interval.
- Recording relevant metrics, e.g. system-level end-to-end latencies and the corresponding constraints.
- Recording at the right granularity and level of detail to speed up the analysis process.
- Correlating hardware performance data with the AUC processes and channels. This enables the system architect to relate a hardware performance bottleneck (e.g. bus contention) to its root cause in the application use-case.
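Aggregating a trace over an analysis interval while keeping the link to the application task can be sketched as follows. This is plain C++ for illustration and not the Platform Architect monitor interface: `TraceRecord`, `IntervalStats`, `aggregate` and the task names used with them are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One busy period recorded by a monitor, tagged with the task that caused it.
struct TraceRecord {
    std::uint64_t start;  // first busy cycle
    std::uint64_t end;    // one past the last busy cycle
    std::string task;     // application task responsible for the activity
};

// Busy cycles per task within one analysis interval.
using IntervalStats = std::map<std::string, std::uint64_t>;

// Splits every record at interval boundaries and sums busy cycles per task
// for each interval in [0, horizon). Activity beyond the horizon is ignored.
std::vector<IntervalStats> aggregate(const std::vector<TraceRecord>& trace,
                                     std::uint64_t interval,
                                     std::uint64_t horizon) {
    assert(interval > 0);
    const std::size_t bins =
        static_cast<std::size_t>((horizon + interval - 1) / interval);
    std::vector<IntervalStats> busy(bins);
    for (const TraceRecord& record : trace) {
        const std::uint64_t stop = std::min(record.end, horizon);
        std::uint64_t t = record.start;
        while (t < stop) {
            const std::uint64_t bin = t / interval;
            const std::uint64_t bin_end = std::min((bin + 1) * interval, stop);
            busy[static_cast<std::size_t>(bin)][record.task] += bin_end - t;
            t = bin_end;
        }
    }
    return busy;
}
```

A short interval gives a view close to the raw trace, while a long interval gives a compact utilization summary; in both cases the per-task keys preserve the correlation between hardware activity and the application model.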
In the remainder of this article we describe the application of the AMM methodology to the performance analysis of a multi-mode DAB radio.

