# A Reduced Multiplier Beamformer Architecture for Ultrasound Imaging Systems

David P. Magee, Member, IEEE and Murtaza Ali, Senior Member, IEEE

*Abstract*— This paper presents a new ultrasound beamforming architecture that greatly reduces the number of multiplications in a DAS (Delay And Sum) implementation as MLAs (Multiple Line Acquisitions) and data channels increase in the system. A mathematical derivation is provided for the new DAS-DPC (Data Path Combined) beamformer architecture along with multiplier analysis that compares the new architecture to a standard DAS implementation. Simulation results using a kidney image from a well-known simulation tool called Field II are given to demonstrate the effectiveness of the new beamforming architecture as compared to a standard DAS architecture.

## I. INTRODUCTION

ULTRASOUND is one of the most widely accepted forms of medical imaging technology because of its ability to perform real-time imaging, operate without any known side effects and cover a wide range of diagnostic as well as therapeutic applications. It is commonly used in cardiology to study heart functionality and diagnose heart disease. In endocrinology, ultrasound is used to study glands such as the thyroid for possible tumors or cysts. Gastroenterology ultrasound applications involve the digestive system where the esophagus, stomach or pancreas can be examined. Musculoskeletal analysis involves the analysis of tendons, nerves, muscles and bone surfaces. Intravascular studies allow doctors to guide injections of fluid or perform delicate surgical procedures. Probably the most widely identifiable ultrasound application is obstetrics, which involves the visualization of a fetus in its mother's womb. All of these applications, as well as many others, continue to demand better imaging and diagnostic capabilities.

Ultrasound was first used for medical diagnosis of the human body by Dr. George Ludwig in the 1940s for detecting and locating gallstones in soft tissue [6]. Since that time, ultrasound systems have incorporated state-of-the-art components to achieve their design goals. As more components in the system converted from analog to digital circuitry, the need for commercialized devices grew. Today, with the demand for higher channel counts, more frames per second, better image resolution, portability and lower power, the requirements on these systems have never been higher. DSPs (Digital Signal Processors) have been widely used in various commercial and medical applications to enable these types of features. DSPs have historically provided good MIPS (Millions of Instructions Per Second)/mW performance while providing the needed flexibility in systems where system life cycles can be five to ten years.

With respect to ultrasound beamforming, this functionality has traditionally been performed in custom ASICs (Application Specific Integrated Circuits) due to insufficient MAC (Multiply-ACcumulate) capability in DSPs. However, new chips on the market are beginning to show promise. But even with these new devices, conventional beamforming architectures are not well suited to handle the demands of future ultrasound systems in terms of power, channel count and flexibility. As a result, this paper addresses some of these concerns by developing a new beamforming architecture that can scale better with increased data channel count and MLAs.

This paper is organized as follows: Section II discusses DAS beamforming and some of the limitations that arise as the MLAs and channel counts increase. Section III derives the DAS-DPC beamformer and discusses some of its benefits. Section IV provides results from a simulated kidney image and compares the multiplier requirements for the beamformer methods. Section V provides conclusions and discusses opportunities for future work.

## II. DAS BEAMFORMING

A common ultrasound receive beamformer implementation is to delay and sum the signals from the transducer elements in such a way as to produce a signal that has been focused at each sample point along a desired scanline. In some cases, multiple receive scanlines are generated for each set of transmitted sound waves that are sent by the transmit beamformer. This mode of operation is commonly called MLA (Multiple Line Acquisition) mode [5]. Figure 1 shows a block diagram for a typical DAS receive beamformer. The analog signals from each transducer element pass through the AFE (Analog Front End), which includes an LNA (Low Noise Amplifier) and VCA (Voltage Controlled Attenuator), before being converted to digital samples. The data path also contains analog filtering, which is not shown in the figure.

Each of the analog signal paths serves as a channel of data to the receive beamformer. After analog-to-digital conversion, the digital samples are delayed and filtered to provide the desired integer and fractional sample offsets needed to perform the coherent summation at the last step of the beamformer. In the most general sense, the integer and fractional delay values are a function of sample number to enable dynamic receive focusing [3].

D.P. Magee and M. Ali are in the DSP Solutions R&D Center at Texas Instruments, Dallas, TX, 75243. (e-mail: magee@ti.com).

Before the data values are summed together, they are scaled by the apodization gains. These gains can be a function of sample number to enable dynamic aperture control as a function of depth [4]. Once the samples have been scaled, they are summed together to give a sequence of beamformed samples. This DAS beamformer design will serve as the baseline architecture for comparisons made later in this paper.



Figure 1 DAS Beamformer Block Diagram

Several observations can be made about this baseline receive beamforming architecture. First, each channel of data has its own set of filters whose number is a function of the desired timing resolution of the beamformer. The filters have to be implemented so that at each sample instant the coefficients can be changed to achieve the desired fractional delay for that sample. Also, to support multiple line acquisition, either the non-beamformed samples must be buffered and processed for each set of fractional delay, integer delay and apodization gain values or each data path must be able to accommodate L parallel data paths, where L is the number of MLAs. As a result, the data control logic becomes more complicated and the amount of hardware required to implement the solution begins to grow.

To represent the complexity of this solution, consider the mathematical representation of a DAS beamformer given by the following equation

$$z[n] = \sum_{m=0}^{M-1} a_m [n] \sum_{k=-\infty}^{\infty} h_{m(n)} [k] x_m [n-k-d_m[n]] \tag{1}$$

where $x_i[n]$ is the signal from the *i*<sup>th</sup> receive channel at time sample *n*, $a_i[n]$ is the apodization gain for the *i*<sup>th</sup> receive channel at time sample *n*, $h_{i(n)}[k]$ is the *k*<sup>th</sup> coefficient of the interpolation filter for the *i*<sup>th</sup> receive channel at time sample *n*, $d_i[n]$ is the integer delay for the *i*<sup>th</sup> receive channel at time sample *n*, $z[n]$ is the beamformed signal at time sample *n* and *M* is the number of receive data channels. The number of multiplies, *numMults*, for this architecture can be written as

$$numMults_{DAS} = L \cdot M \cdot (K+1) \cdot N \tag{2}$$

where K is the number of interpolation filter coefficients, L is the number of MLAs, M is the number of receive data channels and N is the number of output samples. Notice that the number of multiplies is directly proportional to each one of these terms. So as the number of data channels increases in the system, so does the number of multipliers required to implement the solution. The DAS-DPC architecture described in the next section attempts to reduce the number of multipliers in the architecture to enable higher channel count beamformer implementations in future generation ultrasound systems.
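As a minimal sketch of Equation (1) for a single scanline, the loop below filters each channel, applies the apodization gain and sums across channels while counting multiplies. The function name `das_beamform`, the finite causal filters of K taps (Equation (1) sums over all k), the zero padding outside the record and the random toy data are assumptions made for illustration, not part of the paper.

```python
import random

def das_beamform(x, a, d, p, h):
    """Delay-and-sum per Equation (1), assuming K-tap causal filters.

    x[m][n] : samples of receive channel m
    a[m][n] : apodization gain for channel m at output sample n
    d[m][n] : integer delay for channel m at output sample n
    p[m][n] : index of the interpolation filter used by channel m at sample n
    h[q][k] : k-th coefficient of interpolation filter q
    Returns the beamformed samples and the number of multiplies performed.
    """
    M, N, K = len(x), len(x[0]), len(h[0])
    z, mults = [0.0] * N, 0
    for n in range(N):
        for m in range(M):
            taps = h[p[m][n]]
            acc = 0.0
            for k in range(K):
                i = n - k - d[m][n]
                value = x[m][i] if 0 <= i < N else 0.0  # zero outside the record
                acc += taps[k] * value
                mults += 1                               # one multiply per filter tap
            z[n] += a[m][n] * acc
            mults += 1                                   # one multiply for the gain
    return z, mults

random.seed(0)
M, N, K, P = 4, 32, 3, 5  # toy sizes, not the configurations used in the paper
x = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
a = [[random.uniform(0, 1) for _ in range(N)] for _ in range(M)]
d = [[random.randrange(0, 4) for _ in range(N)] for _ in range(M)]
p = [[random.randrange(P) for _ in range(N)] for _ in range(M)]
h = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(P)]

z, mults = das_beamform(x, a, d, p, h)
print(mults, M * (K + 1) * N)  # 512 512
```

Each channel costs K filter multiplies plus one apodization multiply per output sample, which is where the (K+1) factor in Equation (2) comes from; repeating the loop for L scanlines gives the factor L.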



Figure 2 DAS-DPC Beamformer Block Diagram

## III. DAS-DPC BEAMFORMING

To improve upon the DAS beamformer architecture, one must recognize that the number of filtering operations cannot continue to scale linearly with the number of data channels. But more fundamentally, one must also recognize there are only P unique interpolation filters in the architecture, where P is the ratio of  $T_s$  to  $T_{res}$ , where  $T_s$  is the sampling period of the ADCs and  $T_{res}$  is the desired time resolution of the beamformer.
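To make the role of P concrete, the sketch below splits a focusing delay into an integer sample delay and the index of one of the P interpolation filters. The function name `split_delay`, the round-to-nearest quantization and the use of integer picoseconds are assumptions for illustration. The example values assume a 10 ns sampling period (100 MHz) and a 1 ns time resolution, which gives ten filters.

```python
def split_delay(delay_ps, t_s_ps, t_res_ps):
    """Split a delay into (integer sample delay, filter index, filter count).

    Times are integers in picoseconds so that the ceiling is computed exactly.
    """
    P = -(-t_s_ps // t_res_ps)             # ceil(T_s / T_res) using integers
    steps = round(delay_ps * P / t_s_ps)   # delay quantized to P steps per sample
    d, p = divmod(steps, P)                # whole samples and fractional index
    return d, p, P

print(split_delay(123_400, 10_000, 1_000))  # (12, 3, 10)
```

Because every channel's fractional delay falls on one of only P grid points, at most P distinct filters are ever needed at a given sample instant, regardless of the number of channels.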

If the summations in the mathematical representation of the DAS beamformer in Equation (1) are swapped, the equation can be expressed as

$$z[n] = \sum_{k=-\infty}^{\infty} \sum_{m=0}^{M-1} a_m[n] h_{m(n)}[k] x_m[n-k-d_m[n]] \tag{3}$$

By providing a mapping for the  $m^{th}$  receive signal to the  $p^{th}$  interpolation filter, the equation becomes

$$z[n] = \sum_{k=-\infty}^{\infty} \sum_{m=0}^{M-1} a_m[n]h_{p(m,n)}[k]x_m[n-k-d_m[n]] \tag{4}$$

where $p(m,n) \in \{0,1,...,P-1\}$ and $P = \left\lceil T_s / T_{res} \right\rceil$. A mapping, $I(p,s)$, can be used to convert from the $p^{th}$ group of receive signals to the original $m^{th}$ receive signal so that the equation becomes

$$z[n] = \sum_{k=-\infty}^{\infty} \sum_{p=0}^{P-1} h_p[k] \sum_{s=0}^{S(p,n)-1} a_{I(p,s)}[n] x_{I(p,s)} \Big[ n-k - d_{I(p,s)}[n] \Big] \tag{5}$$

where S(p,n) is the number of receive signals using the  $p^{th}$  interpolation filter at time sample *n*. This representation can be expressed in a more common filtering form as

$$z[n] = \sum_{k=-\infty}^{\infty} \sum_{p=0}^{P-1} h_p[k] z_p[n,k,d] \tag{6}$$

where

$$z_{p}[n,k,d] = \sum_{s=0}^{S(p,n)-1} a_{I(p,s)}[n] x_{I(p,s)}[n-k-d_{I(p,s)}[n]] \tag{7}$$

and *P* is the number of interpolation filters needed to achieve the desired beamforming timing resolution.
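The regrouping in Equations (6) and (7) can be checked numerically with a short sketch for a single scanline. The names `das`, `das_dpc` and `sample`, the K-tap causal filters, the zero padding and the random toy data are assumptions for illustration. The multiply counter reflects this particular loop structure rather than any specific hardware design.

```python
import random

def sample(x, m, i):
    """Channel m at index i, or zero outside the recorded window."""
    return x[m][i] if 0 <= i < len(x[m]) else 0.0

def das(x, a, d, p, h):
    """Equation (1): filter every channel, then apodize and sum."""
    M, N, K = len(x), len(x[0]), len(h[0])
    z = []
    for n in range(N):
        total = 0.0
        for m in range(M):
            filt = sum(h[p[m][n]][k] * sample(x, m, n - k - d[m][n]) for k in range(K))
            total += a[m][n] * filt
        z.append(total)
    return z

def das_dpc(x, a, d, p, h):
    """Equations (6) and (7): combine the data paths, then filter P times."""
    M, N, K, P = len(x), len(x[0]), len(h[0]), len(h)
    z, mults = [], 0
    for n in range(N):
        # zp[q][k] accumulates Equation (7) for filter q and tap k
        zp = [[0.0] * K for _ in range(P)]
        for m in range(M):
            for k in range(K):
                zp[p[m][n]][k] += a[m][n] * sample(x, m, n - k - d[m][n])
                mults += 1
        # Equation (6): one K-tap filter per group instead of one per channel
        total = 0.0
        for q in range(P):
            for k in range(K):
                total += h[q][k] * zp[q][k]
                mults += 1
        z.append(total)
    return z, mults

random.seed(1)
M, N, K, P = 32, 64, 4, 4  # toy sizes chosen for illustration only
x = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(M)]
a = [[random.uniform(0, 1) for _ in range(N)] for _ in range(M)]
d = [[random.randrange(0, 8) for _ in range(N)] for _ in range(M)]
p = [[random.randrange(P) for _ in range(N)] for _ in range(M)]
h = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(P)]

z_ref = das(x, a, d, p, h)
z_dpc, mults = das_dpc(x, a, d, p, h)
max_diff = max(abs(u - v) for u, v in zip(z_ref, z_dpc))
print(max_diff, mults)
```

In this sketch the two outputs differ only by floating-point round-off, and the counter equals (M + P)·K·N: M·K multiplies per sample to apodize the delayed inputs and P·K multiplies per sample for the shared filters.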

Equation (6) can be expressed in block diagram form as shown in Figure 2. For each sample instant, the appropriate filter inputs are summed together before the filtering operation occurs. This implementation is therefore called a DPC (Data Path Combined) beamformer, because the input data paths for each interpolation filter are combined before the filtering operation is performed at each sample instant. As a result, the number of multipliers is drastically reduced when compared to the baseline architecture. The total number of multiplies for this architecture can be represented by the following equation

$$numMults_{DAS-DPC} = (M + L \cdot P) \cdot K \cdot N \tag{8}$$

If the ratio of Equation (8) to Equation (2) is formed, the relative multiplier performance of the DAS-DPC beamformer architecture can be compared to the DAS beamformer architecture. This relationship can be approximated as

$$\frac{numMults_{DAS-DPC}}{numMults_{DAS}} \approx \frac{1}{L} + \frac{P}{M} \tag{9}$$

to establish some multiplier trends for the DAS-DPC architecture. The first trend is that for a given number of interpolation filters (P) and a given number of data channels (M), the DAS-DPC architecture requires fewer multiplies as the number of MLAs (L) increases beyond one. The second trend is that for a given number of data channels (M), the DAS-DPC architecture requires fewer multiplies as long as the number of interpolation filters (P) is less than the number of data channels (M). The final trend is that for a given number of interpolation filters (P) (i.e. for a given beamformer time resolution), the DAS-DPC architecture again requires fewer multiplies as the number of data channels (M) increases.
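As a worked step, dividing Equation (8) by Equation (2) gives an exact ratio that differs from the approximation only by the factor K/(K+1), which approaches one as the filter length grows:

$$\frac{(M + L \cdot P) \cdot K \cdot N}{L \cdot M \cdot (K+1) \cdot N} = \frac{K}{K+1} \left( \frac{1}{L} + \frac{P}{M} \right) \approx \frac{1}{L} + \frac{P}{M}$$

For example, with L = 4, P = 10, M = 128 and K = 16, the exact ratio is (16/17)(0.25 + 0.078) ≈ 0.309, while the approximation gives about 0.328. For L = 1 the approximation is slightly above one, so any saving in that case comes from the K/(K+1) factor.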

TABLE I Multiplier Performance Comparison

| Parameters              | Configuration 1 | Configuration 2 | Configuration 3 |
|-------------------------|-----------------|-----------------|-----------------|
| MLAs (L)                | 1               | 4               | 8               |
| Number of Filters (P)   | 10              | 10              | 10              |
| Filter Coefficients (K) | 8               | 16              | 32              |
| Number of Channels (M)  | 128             | 128             | 128             |
| Number of Samples (N)   | 8,000           | 8,000           | 8,000           |
|                         |                 |                 |                 |
| DAS-DPC (MMults)        | 8.832           | 21.504          | 53.248          |
| DAS (MMults)            | 9.216           | 69.632          | 270.336         |
| % Improvement           | 4.2%            | 69.1%           | 80.3%           |

Table I shows the multiplier comparison between the two architectures for three different, high performance ultrasound system configurations. The table shows how significantly the DAS-DPC architecture can reduce the number of multiplies required to perform beamforming when compared to the conventional beamformer architecture. Notice that as the number of MLAs goes from one to eight (with the filter length growing from 8 to 32 coefficients), the reduction in multiplies relative to DAS grows from 4.2% to 80.3%, a gain of about 76 percentage points. Furthermore, these results are promising for future ultrasound applications where the desired channel counts will be in the thousands of data channels.
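The entries of Table I follow directly from Equations (2) and (8). The short script below recomputes them; the function and variable names are illustrative only.

```python
def mults_das(L, M, K, N):
    """Equation (2): multiplies for the standard DAS beamformer."""
    return L * M * (K + 1) * N

def mults_das_dpc(L, M, K, N, P):
    """Equation (8): multiplies for the DAS-DPC beamformer."""
    return (M + L * P) * K * N

# (L, P, K, M, N) for Configurations 1 to 3 of Table I
configs = [(1, 10, 8, 128, 8000), (4, 10, 16, 128, 8000), (8, 10, 32, 128, 8000)]

improvements = []
for L, P, K, M, N in configs:
    das = mults_das(L, M, K, N)
    dpc = mults_das_dpc(L, M, K, N, P)
    improvement = 100.0 * (1.0 - dpc / das)
    improvements.append(improvement)
    print(f"L={L}: DAS-DPC {dpc / 1e6:.3f} M, DAS {das / 1e6:.3f} M, {improvement:.1f}% fewer")
# L=1: DAS-DPC 8.832 M, DAS 9.216 M, 4.2% fewer
# L=4: DAS-DPC 21.504 M, DAS 69.632 M, 69.1% fewer
# L=8: DAS-DPC 53.248 M, DAS 270.336 M, 80.3% fewer
```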







Figure 4 Simulated Kidney Image using DAS-DPC Beamformer

## IV. SIMULATION RESULTS

To demonstrate the effectiveness of the new receive beamformer architecture, the Field II Matlab ultrasound simulation program [1,2] was used to generate kidney images using the DAS beamformer (Figure 3) and the DAS-DPC beamformer (Figure 4). The simulation parameters were a sampling rate of 100 MHz, 128 scanlines and a 90° scan width; the beamformer parameters were MLAs = 1, number of filters = 10 and number of filter coefficients = 8. The signals from each element in the receive aperture were obtained from the Field II simulation program using the calc\_scat\_multi() procedure. The ele\_delay() procedure was also used to remove delay associated with the elements. To produce an image from the beamformed data, conventional envelope detection, data compression and interpolation were implemented before scan converting the data to the figure window dimensions as shown in Figure 4. The two images look very similar to the eye, so numerical performance results are also provided.



Figure 5 Beamformed Data Comparison: Scanline Number 30

Figure 5 shows a comparison of the beamformed data from the DAS beamformer and the DAS-DPC beamformer for an arbitrary scanline (number 30) of the simulated kidney image. The two traces cannot be told apart by eye, so an error analysis is provided in Figure 6.



Figure 6 Beamformer Error Performance: Scanline Number 30

Figure 6 shows the error between the DAS beamformed signal and the DAS-DPC beamformed signal for scanline number 30 of the simulated kidney image. The maximum error for this scanline is roughly 2.49e-14. As a result, the signals are identical for all practical purposes (as they should be given the mathematical derivation in Section III). It is worth noting that the fixed point implementation differs by one or two LSBs (Least Significant Bits) due to the differences in the fixed point filtering operation. In the DAS implementation, only one signal is filtered at a time. In the DAS-DPC implementation, multiple signals are added together before filtering. Thus, the output will be slightly different in the fixed point domain.

The error performance of the DAS-DPC beamformer signal as compared to the DAS beamformed signal is provided in Table II. The NMSE (Normalized Mean Squared Error), the MAE (Maximum Absolute Error) and the NMAE (Normalized Maximum Absolute Error) values are well within the acceptable limits for this type of signal processing operation.

TABLE II ERROR METRICS: SCANLINE NUMBER 30

| NMSE     | MAE      | NMAE     |
|----------|----------|----------|
| 1.97E-31 | 2.49E-14 | 1.48E-15 |
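The paper does not spell out how the three metrics are computed. The sketch below uses common definitions, which are assumptions here: NMSE as the squared error energy divided by the reference energy, MAE as the largest absolute difference, and NMAE as the MAE divided by the largest absolute reference value. The function name `error_metrics` and the sample data are illustrative only.

```python
def error_metrics(ref, test):
    """Return (NMSE, MAE, NMAE) of `test` against the reference signal `ref`.

    Assumed definitions (not stated in the paper):
      NMSE = sum((ref - test)^2) / sum(ref^2)
      MAE  = max |ref - test|
      NMAE = MAE / max |ref|
    """
    err = [r - t for r, t in zip(ref, test)]
    nmse = sum(e * e for e in err) / sum(r * r for r in ref)
    mae = max(abs(e) for e in err)
    nmae = mae / max(abs(r) for r in ref)
    return nmse, mae, nmae

ref = [0.0, 2.0, -4.0, 1.0]    # stands in for the DAS beamformed scanline
test = [0.0, 2.0, -3.5, 1.0]   # stands in for the DAS-DPC beamformed scanline
nmse, mae, nmae = error_metrics(ref, test)
print(mae, nmae)  # 0.5 0.125
```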

## V. CONCLUSION

This paper presented a new beamforming architecture that greatly reduces the multiplier count in a DAS beamformer implementation as the number of MLAs and data channels increase in an ultrasound system. A mathematical derivation was provided for the new DAS-DPC beamformer architecture and analysis was given that compared the new architecture to the standard DAS implementation. Simulation results using a kidney image from a well-known simulation tool were provided to demonstrate the effectiveness of the new beamforming architecture. Future work will be to demonstrate this beamforming architecture on a programmable processor such as a DSP.

## REFERENCES

- [1] Jensen, J.A., "Field: A Program for Simulating Ultrasound Systems," Proceedings of the 10<sup>th</sup> Nordic-Baltic Conference on Biomedical Imaging, published in *Medical & Biological Engineering & Computing*, Vol. 32, Supplement 1, Part 1, 1996, pp.351-353.
- [2] Jensen, J.A. and N.B. Svendsen, "Calculation of Pressure Fields from Arbitrarily Shaped, Apodized, and Excited Ultrasound Transducers," *IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control*, Vol.39, No.2, March 1992, pp.262-267.
- [3] McKeighen, R.E. and M.P. Buchi, "New Techniques for Dynamically Variable Electronic Delays for Real Time Ultrasonic Imaging," *Proceedings of the 1977 Ultrasonics Symposium*, Toronto, Ontario, October 5-8, 1997, pp.250-254.
- [4] Thomas, C.E., "Dynamic Array Aperture and Focus Control for Ultrasonic Imaging Systems," United States Patent No. 4180790, December 25, 1979.
- [5] Thomenius, K.E., "Evolution of Ultrasound Beamformers," *Proceedings of the 1996 IEEE Ultrasonics Symposium*, San Antonio, TX, November 3-6, 1996, pp.1615-1622.
- [6] "Device with Radar Principle Detects Foreign Objects in Body Tissue," Department of Defense, Office of Public Information, No. 220-49, October 9, 1949.