Taking a multicore DSP approach to medical ultrasound beamforming
The basics of beamforming

Beamforming is the signal-processing heart of any medical ultrasound machine. It is usually implemented in FPGAs and ASICs due to the large bandwidth and computational requirements. Recent advances in digital signal processors (DSPs) have opened the door to beamforming on general-purpose processing chips.
Medical ultrasound images are formed by first creating focused ultrasound beams on transmit and directivity patterns (steered in the same direction as on transmit) on receive. The area of interest is covered by adjacent beams with a spacing defined by the minimum required resolution. The ultrasound beams are formed by adding delays to the electrical pulses sent to each element in order to control the radiation pattern.
On the receive side the approach is similar: the signals from each element are delayed by the same amounts as on transmit and summed together. The delays ensure that the waves/signals are all in phase and that no destructive addition takes place (Figure 5 below).

Figure 5 – Rx beamforming high-level diagram
The beamforming process can be elegantly put into the following formula:

b(t) = Σ(i=1..N) Ai · si(t − τi)

where N is the number of elements, Ai are the apodization coefficients (weights), si are the received echoes and τi are the delays applied to each channel.
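The delay-and-sum operation described above can be sketched in plain Python. This is a minimal illustration only: the function name is ours, delays are rounded to the nearest sample, and a real beamformer would interpolate between samples and run on optimized hardware.

```python
def delay_and_sum(signals, delays, apod, fs):
    """Delay-and-sum beamforming of multichannel echo data.

    signals : list of per-channel sample lists (the received echoes si)
    delays  : per-channel delays in seconds (the τi)
    apod    : per-channel apodization weights (the Ai)
    fs      : sampling frequency in Hz
    """
    n_samples = len(signals[0])
    out = []
    for n in range(n_samples):
        acc = 0.0
        for s, tau, a in zip(signals, delays, apod):
            # Convert the delay to an integer sample shift (nearest
            # neighbour; a real system would interpolate between samples).
            k = n - int(round(tau * fs))
            if 0 <= k < len(s):
                acc += a * s[k]
        out.append(acc)
    return out
```

With the delays chosen to compensate the per-channel arrival times, the echoes add coherently and the summed output peaks where all channels align.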
The first step when analyzing beamforming is to define the physical parameters of the system and the features to be included. We will consider a 30-channel system with ADCs running at 32 MHz at 16-bit resolution for a central frequency of 2 MHz. For the use case presented, dynamic focusing and apodization with coefficient updates every 0.96 mm and interpolation at every cycle are considered. The depth of scan and view angle are depicted in Figure 6 below.
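The 0.96 mm coefficient-update spacing can be translated into a per-sample schedule. Assuming a speed of sound of 1540 m/s in tissue (an assumption, not stated in the text) and accounting for the round trip, the update interval works out to roughly 40 samples at 32 MHz:

```python
c = 1540.0    # assumed speed of sound in tissue, m/s
fs = 32e6     # ADC sampling rate from the use case, Hz
dz = 0.96e-3  # coefficient update spacing from the use case, m

# Round-trip travel: each sample advances the imaging depth by c / (2 * fs).
depth_per_sample = c / (2 * fs)            # ~24.1 um per sample
samples_per_update = dz / depth_per_sample
print(round(samples_per_update))           # -> 40
```

So the dynamic-focusing delays and apodization weights need refreshing roughly every 40 input samples per channel.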

Figure 6 – Geometry parameters
Based on the Fraunhofer approximation, the required number of lines to sample the sector can be calculated as:

N = θ (At + Ar) / λ

where θ is the sector angle, At and Ar are the transmit and receive apertures and λ is the wavelength. Because the detection process usually implies squaring of the echoes and logarithmic compression, the actual number of lines needed to sample the sector θ is twice the value of N. For the use case presented, a sufficient number of lines is 61.2, but since we are dealing with an approximation, 64 lines will be considered in the following exposition.
To avoid a drop in frames per second, the acquired data has to be processed before the next data transfer completes, as schematically pictured in Figure 7 below.

Figure 7 – Schematic of the pipelined signal-processing approach; time-frame allocation
A first step to consider is the time-frame allocation for the beamforming and B-Mode library, based on the number of frames per second to be supported. For this use case, 45 frames per second is considered adequate. Dividing the frame period by the number of scan lines gives the time frame available for the beamforming module and the B-Mode libraries, as presented in Figure 7 above.
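The per-line time budget follows directly from the figures above (45 frames per second, 64 scan lines per frame):

```python
fps = 45    # target frame rate from the use case
lines = 64  # scan lines per frame

frame_time = 1.0 / fps           # ~22.2 ms available per frame
line_time = frame_time / lines   # budget per scan line, shared by
                                 # beamforming and the B-Mode libraries
print(round(line_time * 1e6, 1)) # -> 347.2 (microseconds)
```

Everything the processor does for one scan line, including the B-Mode processing, must therefore fit inside roughly 347 µs.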
When estimating whether certain algorithms are feasible on a given architecture, three factors are most critical: IO bandwidth (where applicable), memory bandwidth requirements and the cycle count needed to execute the algorithm.
The IO bandwidth requirement can be calculated from the time-frame allocation and ADC parameters as 8.07 Gbps, well within the bounds of one Rapid IO port (measured speeds of up to 9.11 Gbps).
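The order of magnitude of that figure can be checked from the ADC parameters. The scan depth is not stated in the text; assuming roughly 14 cm and a speed of sound of 1540 m/s (both assumptions for illustration), spreading one line's worth of samples over the 347 µs line budget lands close to the quoted value:

```python
c = 1540.0                   # assumed speed of sound, m/s
depth = 0.14                 # assumed scan depth, m (not stated in the text)
fs = 32e6                    # ADC sampling rate, Hz
bits = 16                    # ADC resolution
channels = 30
line_time = 1.0 / (45 * 64)  # per-line budget from the frame-rate analysis

acq_time = 2 * depth / c                      # round-trip acquisition window
bits_per_line = fs * acq_time * bits * channels
gbps = bits_per_line / line_time / 1e9
print(round(gbps, 2))                         # -> 8.04, close to 8.07 Gbps
```

The small gap to 8.07 Gbps comes from the assumed depth; the point is that the requirement sits comfortably under the measured 9.11 Gbps Rapid IO throughput.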
The memory where the input data is stored will be M3, since it is faster than DDR3 and offers sufficient space to support even a double-buffering approach if needed. Since this transfer is done via DMA, none of the modules described in the following will use a DMA approach, so that the Rapid IO transfer is not interrupted.
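The double-buffering idea mentioned above can be sketched as a ping-pong scheme: while the DMA fills one line buffer, the processor beamforms the other. The function and callable names below are illustrative, not from the text, and a real implementation would run the DMA fill and the processing concurrently.

```python
def run_lines(n_lines, fill_dma, process):
    """Ping-pong (double) buffering over two line buffers.

    fill_dma(buf, line) : starts filling buf with samples for `line`
    process(buf, line)  : beamforms the samples held in buf
    """
    buffers = [[None], [None]]  # two line buffers resident in fast memory
    fill_dma(buffers[0], 0)     # prime the first buffer
    for line in range(n_lines):
        cur = buffers[line % 2]
        nxt = buffers[(line + 1) % 2]
        if line + 1 < n_lines:
            fill_dma(nxt, line + 1)  # DMA fills the idle buffer...
        process(cur, line)           # ...while the CPU beamforms this one
```

Each line is thus processed from a buffer that was filled during the previous line's processing, which is what keeps the Rapid IO transfer and the computation from stalling each other.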

