CMP EMBEDDED.COM

Login | Register     Welcome Guest  
HOME DESIGN PRODUCTS COLUMNS E-LEARNING CONFERENCES CODE FORUMS/BLOGS NEWSLETTERS CONTACT FEATURES RSS RSS

Providing memory system and compiler support for MPSoc designs: Part 3
Compiler Support



Embedded.com

Optimizing Memory Space Utilization
Another important issue in an MPSoC-based environment is the management of limited memory space. It should be observed that optimizing for data locality does not mean that the optimized application will execute using fewer data items. To reduce data space requirements of a given application, one may need to consider lifetimes of program data structures/variables. In particular, if the lifetimes of two different data items do not overlap, they can potentially share the same memory location, thereby reducing overall memory space demand.

For array-based applications, in a given loop nest, one can define the lifetime of an array element as the difference (in loop iterations) between the time it is first assigned (written) and the time it is last used (read). For a given array index a (which might be multidimensional), the start of its lifetime is referred to as S(a), whereas the end of its lifetime is denoted using E(a) - both in terms of loop iterations. Using these definitions, the lifetime vector for this array element can be given as s = E(a) - S(a), where "-" denotes vector subtraction.

Note that the lifetime of a is expressed as a vector, as in general there might be multiple loops in the next, and expressing lifetime as a vector allows the compiler to measure the impact of loop transformations on it. As an example, if an array element (that is accessed in a nest with two loops) is produced in iteration (1 3)T and consumed (i.e., last-read) in iteration (4 5)T, its lifetime vector is s = (4 5)T - (1 3)T = (3 2)T.

It should be noted that before S(a) and after E(a), the memory location allocated to this array element could be used for storing another array element (which belongs to the same array or to a different array). Obviously, the shorter the difference between E(a) and S(a) the better, as it leaves more room for other array elements.

However, it is to be noted that reducing lifetimes of array elements does not guarantee memory location reuse. This is because in order to have memory location reuse, the lifetimes should be shortened such that the different elements can share the same memory location. As a result, it is important for the compiler to make sure that this happens (when lifetimes are reduced). That is, the compiler can check whether the lifetime reduction will really be beneficial; if not, it may choose not to apply it.

Along these lines, Lefebvre and Feautrier propose a method for automatic storage management for loop-based parallel programs with affine references. They show that parallelization by total data expansion can be replaced by partial data expansion without distorting the degree of parallelism.

 Grun et al.  have presented memory size estimation techniques for multimedia applications. Strout et al.  introduce the universal occupancy vector that allows schedule-independent storage space optimization. Their objective is to reduce data space requirements of a given code without preventing subsequent locality optimizations from being performed. Wilde and Rajopadhye have investigated memory reuse by performing static lifetime computations of variables using a polyhedral model. Lim et al. have defined an optimization method called the array contraction.

The work done by Song et al.  and Catthoor and his colleagues shows how loop fusion can be used for minimizing data space requirements. Along similar lines, Fraboulet et al. has also presented an algorithm to reduce the use of temporary arrays by loop fusion. They demonstrate that although the complexity of their algorithm is not polynomial, it is highly efficient in practice. Thies et al. has described a unified general framework for scheduling and storage optimization.

One of the problems that they address, namely, "choosing a store for a given schedule" tries to assign storage locations to array elements with storage size reduction in mind. It should also be observed that reducing memory space requirements alone might help improve data locality (cache behavior). This is because in array-intensive applications a significant portion of data cache misses (called capacity misses) occurs because large datasets are accessed in a short period of time (e.g., within a loop). By reducing the number of array elements, one can also improve cache behavior.

We also need to point out an important tradeoff that needs to be made for an MPSoC-based execution environment. In general, more memory space means more parallelism, as discussed in studies such as those of Tu and Padua and Barthou et al. This is because a large percentage of data dependences in a given application are anti-dependences and output-dependences, which occur mainly due to reuse of the same memory location for different data items.

By allocating a separate memory location for each data item, one can eliminate these dependences. However, as we have discussed above, reusing the same memory location for different data items is an important optimization as far as memory space utilization is concerned. Therefore, an optimizing compiler should be able to resolve this tradeoff considering the performance and memory space constraints at the same time.

1 | 2 | 3 | 4

Rate this article: Low High
Current rating
  • .
Embedded.com Career Center
Looking for a new job?
SEARCH JOBS

Browse all jobs

SPONSOR
RECENT JOB POSTINGS





 :