Introduction
While microphone arrays have been around for more than 50 years [1], the landscape of microphone array sensors and their technology has advanced tremendously in the last two decades with the rise of MEMS (micro-electro-mechanical system) technology. Furthermore, the last decade has given rise to many novel 3D in-air ultrasound sensors which allow the formation of acoustic images in 3D. These sensors hold great promise for robotic applications in harsh environments, as ultrasound signals are minimally affected by medium distortions such as dust, fog and water spray. However, the sensors developed in the past typically use a reduced aperture due to cost and complexity limitations, with microphone counts typically ranging from 1 to 64. These reduced apertures inevitably cause either artifacts in the resulting 3D images, or images with a limited dynamic range and spatial resolution.
Ultrasound signals often exhibit a large Helmholtz number relative to the environments in which they are applied, implying that the reflected energy impinging on the sensor is mainly specular in nature. On the other hand, acoustic theory predicts that diffraction echoes should also arise [2], but so far these echoes have been mostly neglected due to their low intensity. In order to assess the relative importance of these echoes in real-world environments, as well as to investigate the virtual upper limit of ultrasound sensing in such environments, a sensor with a high spatial resolution and high dynamic range is necessary.
This paper addresses the need for a sensor with high spatial resolution and dynamic range, and introduces a dense, large-aperture in-air ultrasound microphone array which should provide this high spatial resolution, dynamic range and signal-to-noise ratio. The system consists of 1024 synchronously sampled microphones, increasing the number of microphones of our previously developed systems by a factor of 32 [3], [4], [5]. This sharp rise in microphone channel count is achieved by leveraging a distributed hardware architecture, which is built upon a decade of ultrasound sensor development. In this paper, we present a successful implementation of this novel acoustic sensor, and demonstrate its functionality through both simulation and real-world measurements.
In order for the readers to accurately follow the developments, we encourage them to get familiar with our previous work in which we describe in detail the development of a single 32-channel microphone array [3], [6], as the sensor in this paper consists of a distributed version of that single 32-channel module. However, this paper still stands on its own, allowing the reader to follow the development of the data-acquisition methods and signal processing techniques and understand the performance analysis of the system where we compare it with our previously developed 32-channel microphone array [3], [6].
As we have argued before [7], for real-time systems using acoustic sensing it is crucial that the sensor samples the full wave field using a single measurement. This approach is distinctly different from Synthetic Aperture Sonar (SAS) techniques such as the ones described in [8] and [9]. In SAS, an array with an increased aperture is created by using platform motion. Indeed, by moving the sensor, the individual array elements sample different positions in the wavefield, and the aggregated data can then be used as if sampled from a single contiguous array. However, this requires the wave-field to be stationary over the integration time, which is, as we argued before [7], not the case in many real-world applications such as robotics and predictive maintenance.
In the pages that follow, we will touch on the design choices that were made to achieve the hardware architecture of the developed ultrasound sensor unit that we named the High Resolution Imaging Sonar (HiRIS) sensor, together with a more detailed description of the implementation. In the subsequent section, the data acquisition and signal processing are described followed by a Section on the experimental setup and its results. In the final section, we will present the conclusions of the proposed system and its envisioned applications as future work.
Hardware Architecture
Achieving the envisioned objective of constructing a synchronized ultrasound sensor array featuring 1024 microphone channels, coupled with a versatile yet timely data transfer interface, poses a nontrivial challenge. It requires considering several trade-offs in different design aspects, such as the choice of components and their associated costs, design time influenced by familiarity with a platform, as well as the time allocated for implementation and testing. This Section aims to delve into the deliberations behind these design choices, exploring considerations related to component types, cost implications, design familiarity impact on time, and the overall implementation process. Additionally, we will introduce the selected implementation and provide an overview of the proposed system.
The hardware design of the HiRIS sensor is highly complex and warrants its own extensive description, as the devil is in the details. Indeed, the overall system consists of 1024 microphones, 33 microcontrollers and 4963 electronic components, distributed over 2 PCBs with over 127 m of PCB traces. The road to a successful implementation of such a system is riddled with pitfalls, which we aim to clarify in the subsequent sections.
A. Design Choices
Over the last decade, the embedded products market has witnessed a notable surge in diversity, propelled by swift technological advancements. The integration of highly capable and feature-rich (ARM) microcontrollers, along with Field-Programmable Gate Arrays (FPGAs) and System on Chips (SoCs), has been instrumental in enhancing the capabilities of embedded sensor systems. The emergence and widespread adoption of Single Board Computers (SBCs), coupled with the proliferation of Internet of Things (IoT) devices, have been spurred by the demand for technological progress in the era of Industry 4.0. Concurrently, the growing community of online hobbyists in the domain of embedded electronics hardware has contributed to the development of tools and libraries, facilitating the rapid creation of embedded platforms.
While the three aforementioned types of embedded devices, being FPGAs [3], [4], [10], [11], [12], SoCs [13], [14], [15] and ARM microcontrollers [5], [16], [17], [18], [19], have all been used for the construction of high-resolution ultrasound sensing arrays, each of these device types has its distinct advantages and disadvantages.
Field-Programmable Gate Arrays (FPGAs) offer notable advantages in terms of flexibility and customization. Their field-configurable nature, in combination with very high GPIO pin counts, allows for their rapid adaptation to diverse tasks, especially in real-time applications with very tight timing constraints. However, the complexity of FPGA design, coupled with the relatively higher power consumption and component cost can be considered drawbacks. Indeed, developing complex hardware designs in FPGAs is complicated due to the need for tight timing closures, in order to yield stable data-acquisition systems.
Systems-on-a-Chip (SoCs) integrate multiple components, processing cores and peripherals onto a single chip, streamlining the design and reducing the need for external components. This leads to space and power efficiency. While the large number of diverse peripherals on the SoC is attractive, the customization options are significantly more limited than those of FPGAs. Therefore, complexity in the envisioned design may lead to significant challenges during development. On the other hand, timing closure is guaranteed by design, leading to far less potential for race conditions compared to FPGA-based designs.
ARM microcontrollers excel in power efficiency and simplicity, due to their standardized architecture and interface design. Their low cost of ownership, coupled with wide industry adoption, makes them accessible for a wide range of applications, from automotive, over consumer goods, and indeed, to high-speed data-acquisition. This high degree of standardisation leads to a reduced customization potential when compared to FPGAs, or the integrated capabilities of SoCs, limiting their suitability for certain high-performance or specialized tasks.
Despite the apparent drawbacks of ARM-based microcontroller systems, we deemed this to be the most promising candidate for the development of the hardware architecture for HiRIS. While the other two options (SoCs and FPGAs) are certainly viable for implementing such a system, we chose an ARM-based architecture because a) the peripherals on the chosen ARM platform are ideally suited for our intended application, b) a distributed architecture is more robust to errors than a single monolithic implementation, and c) our group is well versed in the development of ARM-based systems, which is a non-negligible reason for choosing a particular approach.
B. Distributed Architecture
When considering the design of complex hardware systems, it often pays off to approach the implementation using a distributed architecture. Indeed, when using a distributed architecture, robustness increases due to the lack of single points of failure. In the case of the embedded hardware design of HiRIS, it can be beneficial to split up the hardware over multiple printed circuit boards (PCBs) that are tied together using one or multiple appropriate connectors. The hardware components can be grouped by functionality and can hence be isolated in the design process, which in turn can have advantages during the implementation and testing phase. This is especially important for testing individual boards with high component counts, as this distributed approach allows them to be tested without exposing other parts of the device to dangerous voltages or currents. Furthermore, sections of the design can easily be redesigned if deemed necessary after testing (i.e., it facilitates an iterative design approach), without having to reassemble the non-faulty parts of the system.
This modular approach also has the added benefit of being able to make use of an extra spatial dimension in the hardware design by stacking multiple PCBs on top of each other, reducing the footprint of the total design by extending it into the third dimension. When designing acoustic array sensors of the complexity encountered in HiRIS, we often separate the microphones and some of their essential peripherals onto a front-end PCB, and place the rest of the electronic components on a so-called back-end PCB. As a beneficial side effect, this creates the potential for leaving the front-facing side of the front-end PCB component-less, which is essential for eliminating distorting multipath effects in the acoustic reception pathways.
Another advantage of the distributed architecture can be found in reusing known, verified and tested schematic and component layouts, used extensively in previous designs (i.e. the so-called battle-tested designs). By reusing parts of both the schematics and component layouts from previously built hardware, these parts can be distilled into design blocks, which then can be combined in the larger overarching design. These design blocks allowed us to quickly create a distributed hardware architecture of 32 microphone nodes by 33 microcontroller nodes, where every 32-element microphone node on the front end PCB is connected with an ARM microcontroller node with its peripherals on the back-end PCB. To orchestrate the 32 microcontroller nodes, an additional primary node was added on the back end. Employing this distributed design method of reusing existing design blocks has proved to be a highly productive and cost-efficient design methodology.
C. Front end
The front-end board mainly incorporates 1024 Knowles SPH0641LU4H-1 MEMS microphones, sixteen AP2112K-3.3V linear voltage regulators that convert +5
The aforementioned MEMS microphones are configured in a 32-by-32 uniform rectangular array with regular grid spacing of 3.9 \begin{equation*} d = \frac {\lambda }{2} = \frac {v}{2f} \quad \Leftrightarrow \quad f = \frac {v}{2 d} \tag{1}\end{equation*}
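As a quick numerical reading of Eq. (1), the short sketch below evaluates f = v / (2d) for the stated grid spacing. Two inputs are assumptions rather than values confirmed by the text: the spacing of 3.9 is taken to be in millimetres, and the speed of sound is taken as 343 m/s (dry air at roughly 20 °C).

```python
# Worked example of Eq. (1): f = v / (2 d).
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C
GRID_SPACING_M = 3.9e-3     # assumption: the quoted spacing of 3.9 is in mm


def max_unaliased_frequency(spacing_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Highest frequency for which the spacing still equals half a wavelength."""
    return speed_m_s / (2.0 * spacing_m)


f_max_hz = max_unaliased_frequency(GRID_SPACING_M)
print(f"Half-wavelength frequency: {f_max_hz / 1e3:.1f} kHz")
```

Under these assumptions the half-wavelength condition is met at roughly 44 kHz; above that frequency the array spatially undersamples the wave field.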
Besides the small form factor, low power consumption and a frequency response curve reaching far into the ultrasonic spectrum [4], the key advantage of the SPH0641LU4H-1 microphones is their built-in
Using microphones with a built-in 1-bit ADC has the major advantage of significantly reducing board complexity. Indeed, if microphones with an analog voltage output were to be used, each of these microphone signals would need amplification and a dedicated ADC chip, which adds a significant amount of complexity (as demonstrated in our earlier designs [7], [25]). Interfacing with 1-bit signals can be easily done using a wide GPIO register on a microcontroller. The number of necessary GPIO lines can be reduced further by utilizing the stereo capability of the PDM microphones. With this feature, one microphone delivers its data on the rising edge of the clock signal whereas the other delivers it on the falling edge. The latching of the data occurs on the opposite edges of the clock signal of the microphones. This stereo setup halves the number of required data lines from 1024 (without the stereo feature) to 512 (with the stereo feature). It should be noted that while the induced phase difference of sampling on both the rising and falling edge of the clock is 180°, this equates to 10
In addition to the aforementioned voltage regulators, microphones and various passive components e.g. resistors and decoupling capacitors, eight high density FX10A-120P connectors can also be found on this PCB to connect the power, multiple synchronous clock lines and data lines to the back-end PCB of this design. These connectors ensure a high-fidelity link of the digital signals from the front end to the back end, ensuring robust operation of the sensor during field-trials.
D. Back end
The back end of the HiRIS sensor can be split into a primary node and 32 subordinate nodes. The latter are 32 identical design blocks, each with an STM32F429 ARM Cortex M4 microcontroller at its core, in combination with an IS42S16320D external SDRAM memory of 64
The primary node, which also has an STM32F429 ARM Cortex M4 microcontroller at its core, uses a single timer peripheral that generates 4 synchronous square wave signal outputs at 4.5
Besides a USB connection to the primary node for initiating measurements, an external TTL-input can be used to trigger measurements, which allows for easy integration of the HiRIS sensor in measurement pipelines. For increased robustness, e.g. long cable lengths or noisy environments, the option of using differential signaling for external I/O was chosen by incorporating a SN65HVD77DR RS-485 driver into the design. Since this is a full-duplex interface, a pulse can also be generated to trigger external devices along with the subordinate nodes.
While the HiRIS is designed as a passive measurement device, a BNC connector was also fitted to the back end that is connected to the analog DAC-output of the primary node. The DAC peripheral can be triggered simultaneously with the subordinate nodes where it will generate an analog signal on its output based on a sequence that was either pre-defined in the firmware or uploaded to the primary node through its USB interface. This enables us to use this sensor as a high-channel pulse-echo sonar device when combined with an external amplifier and ultrasound transducer, similar to the sensors described in our earlier work [3], [4], [5], [7], [17], [25].
E. Physical Realisation of the HiRIS Sensor
The proposed 1024-microphone ultrasound array sensor, referred to as the HiRIS sensor (High-Resolution Imaging Sonar), comprising the front-end and back-end PCBs, measures 180
As mentioned in the previous subsection, the HiRIS sensor can be expanded by connecting external devices through its external I/O or BNC connections but could also be further expanded with an additional PCB that stacks on the backside of the back end. This envisioned additional PCB would incorporate multiple USB3 hub ICs in order to reduce the amount of cable clutter. Another feature that would be integrated is a JTAG SWD programmer in combination with a multiplexer to alleviate the tedious work of plugging and unplugging the programmer when pushing firmware changes to the microcontrollers on the back end.
Data Acquisition and Processing Chain
In this section, we will detail the process of initiating and capturing a set of waveform data from the microphone array, and the subsequent processing using a bank of adaptive spatial filters (MVDR beamforming) for 3D image generation.
A. DAQ
The HiRIS sensor comprises 33 microcontrollers (1 primary node and 32 subordinate nodes), each connected over a USB2.0 connection to a host PC. A chain of USB3 hubs aggregates all these USB connections into a single USB3.0 connection. The USB protocol is a CDC Virtual Com Port (VCP) emulation [26], with each node initializing a virtual serial port on the host PC. A custom Python script using the Multiprocessing API [27] looks for serial ports with specific Vendor ID and Product ID combinations and opens all of them, which allows for bidirectional communication with all the sensor nodes.
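The port-discovery step described above can be sketched as a simple filter over the enumerated serial ports. This is an illustrative sketch and not the authors' script: the Vendor ID and Product ID values are placeholders (the text does not state the actual identifiers), and the `FakePort` records stand in for real enumeration results so that the example runs without hardware. With the pyserial package, `serial.tools.list_ports.comports()` yields objects exposing the same `device`, `vid` and `pid` attributes and could be passed in directly.

```python
from collections import namedtuple

# Placeholder identifiers: the actual VID/PID pairs of the nodes are not
# given in the text.
TARGET_IDS = {(0x0483, 0x5740)}


def select_node_ports(ports, target_ids=TARGET_IDS):
    """Return the device names of all ports whose (VID, PID) pair matches."""
    return sorted(p.device for p in ports if (p.vid, p.pid) in target_ids)


# Stand-in for an enumeration result; only the attributes used above exist.
FakePort = namedtuple("FakePort", "device vid pid")
enumerated = [
    FakePort("/dev/ttyACM1", 0x0483, 0x5740),
    FakePort("/dev/ttyUSB0", 0x1A86, 0x7523),  # unrelated USB-serial adapter
    FakePort("/dev/ttyACM0", 0x0483, 0x5740),
    FakePort("/dev/ttyS0", None, None),        # non-USB port without IDs
]
node_ports = select_node_ports(enumerated)
print(node_ports)
```

In a full acquisition script, each returned device name would then be opened in its own worker process, so that the 33 ports can be serviced concurrently.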
As stated before, the primary-subordinate architecture of our sensor implies that the single primary node of the HiRIS sensor listens to a command originating from a controlling PC over the VCP. In turn, the primary node asserts a trigger pulse to the 32 subordinate nodes, which each perform a measurement of a set duration (typically 70
B. Image Formation using MVDR
To process the massive amount of microphone data into a spatial spectrum, we follow an approach similar to the one outlined in [17]. The microphone signals are PDM modulated using single-bit \begin{equation*} s_{M,i}(t) = h_{ \mathit {PDM}} * s_{ \mathit {PDM},i}(t) \tag{2}\end{equation*}
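The demodulation step of Eq. (2), a convolution of the 1-bit PDM stream with a low-pass filter, can be illustrated with a minimal sketch. The 8-tap boxcar filter used here is an assumption chosen for readability; the text does not specify the actual filter h_PDM, and a practical design would use a sharper low-pass filter followed by decimation.

```python
def pdm_to_pcm(bits, taps):
    """Convolve a PDM bit stream with FIR taps, keeping fully overlapped samples.

    The bits {0, 1} are first mapped to the bipolar values {-1, +1}.
    """
    bipolar = [2 * b - 1 for b in bits]
    n_taps = len(taps)
    return [
        sum(taps[k] * bipolar[n - k] for k in range(n_taps))
        for n in range(n_taps - 1, len(bipolar))
    ]


# Assumed stand-in for h_PDM: an 8-tap moving average with unit DC gain.
boxcar = [1.0 / 8] * 8

full_scale = pdm_to_pcm([1] * 32, boxcar)     # constant ones: full-scale input
mid_scale = pdm_to_pcm([0, 1] * 16, boxcar)   # 50 % pulse density: zero input
print(full_scale[0], mid_scale[0])
```

The pulse density of the bit stream encodes the signal amplitude, so the constant-ones stream demodulates to +1 and the alternating stream to 0.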
\begin{equation*} s_{ \mathit {MF},i}(t) = \mathcal {F}^{-1} \bigg [\mathcal {F}[s_{b}(t)]^{\ast} \cdot \mathcal {F}[s_{M,i}(t)]\bigg] \tag{3}\end{equation*}
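The frequency-domain matched filter of Eq. (3) can be sketched as follows. To keep the example free of dependencies, a naive O(N²) DFT is used where an FFT routine would normally be called, and a short four-sample code stands in for the emitted base signal s_b(t); both choices are assumptions made for illustration only.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT routine)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def matched_filter(signal, template):
    """Circular cross-correlation computed as F^-1[ F[template]* . F[signal] ]."""
    padded = list(template) + [0.0] * (len(signal) - len(template))
    spec_signal = dft(signal)
    spec_template = dft(padded)
    product = [b.conjugate() * s for b, s in zip(spec_template, spec_signal)]
    return [y.real for y in dft(product, inverse=True)]


# Hypothetical base signal and a noise-free echo delayed by 5 samples.
template = [1.0, -1.0, 1.0, 1.0]
delay = 5
received = [0.0] * 16
for offset, value in enumerate(template):
    received[delay + offset] = value

filtered = matched_filter(received, template)
peak_index = max(range(len(filtered)), key=lambda i: filtered[i])
print(peak_index)
```

The filter output peaks at the sample index of the echo delay, with a peak value equal to the energy of the template.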
\begin{equation*} X(f,t) = \begin{bmatrix} x_{1}(f,t) &\quad x_{2}(f,t) &\quad \ldots &\quad x_{k}(f,t) \end{bmatrix} \tag{4}\end{equation*}
\begin{equation*} w_{ \mathit {MVDR}}(\psi)=\frac { R_{b}^{-1} \cdot A(\psi)}{A(\psi)^{H} \cdot R_{b}^{-1} \cdot A(\psi)} \tag{5}\end{equation*}
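The MVDR weights of Eq. (5) can be checked numerically on a toy scene. The sketch below makes several simplifying assumptions that are not taken from the text: an 8-element uniform linear array instead of the 32-by-32 planar array, a single narrowband frequency, and a synthetic covariance matrix (standing in for R_b) built from one interferer plus white noise. A small Gaussian-elimination solver keeps the example dependency-free; a numerical library would normally be used instead.

```python
import cmath
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def steering_vector(n_mics, spacing_wavelengths, angle_rad):
    """Far-field steering vector A(psi) of a uniform linear array."""
    step = 2 * math.pi * spacing_wavelengths * math.sin(angle_rad)
    return [cmath.exp(-1j * step * m) for m in range(n_mics)]


def mvdr_weights(cov, steering):
    """w = R^-1 A / (A^H R^-1 A), computed without forming R^-1 explicitly."""
    r_inv_a = solve(cov, steering)
    denom = sum(a.conjugate() * x for a, x in zip(steering, r_inv_a))
    return [x / denom for x in r_inv_a]


def response(weights, steering):
    """Beamformer output w^H A for a unit-amplitude plane wave."""
    return sum(w.conjugate() * a for w, a in zip(weights, steering))


# Toy scene (all values are illustrative assumptions): 8 microphones at
# half-wavelength spacing, look direction 0 deg, strong interferer at 20 deg.
N_MICS = 8
a_look = steering_vector(N_MICS, 0.5, math.radians(0.0))
a_intf = steering_vector(N_MICS, 0.5, math.radians(20.0))
INTERFERER_POWER, NOISE_POWER = 100.0, 1.0
cov = [
    [INTERFERER_POWER * a_intf[m] * a_intf[n].conjugate()
     + (NOISE_POWER if m == n else 0.0)
     for n in range(N_MICS)]
    for m in range(N_MICS)
]
weights = mvdr_weights(cov, a_look)
print(abs(response(weights, a_look)), abs(response(weights, a_intf)))
```

The two printed magnitudes illustrate the defining property of the MVDR beamformer: the response in the look direction is constrained to unity (the distortionless constraint), while the adaptive weights place a deep null on the interferer.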
Verification of HiRIS
A. Simulation of Array Responses
In order to verify the operation of the HiRIS sensor, a simulation model of the sensor was built, following the equations derived in [28]. We calculate the so-called array responses (sometimes referred to as Point Spread Functions) [5], [31] of the sensor system, which describe the image obtained by the sensor in response to a Dirac-like point source in space. We placed the point source in three spatial locations, defined by their azimuth angle (
Overview of the HiRIS hardware architecture. Panel a) shows a schematic representation of the system architecture, distinguishing between the front-end and back-end PCBs. On the front-end, there are 32 groups of 32 microphones (each arranged in an
Coordinate system in relationship to the HiRIS sensor, showing the X, Y and Z axis, and the azimuth angle
Array responses of a scene with a single point source placed at different spatial locations, and a spatially uniform noise field: panel a & d)
The realized prototype of HiRIS. Panel a) shows the front-view of the sensor with the microphone port-holes, and the 33 USB cables used to connect the nodes to the USB hubs. Panel b) shows the backside of the back-end PCB, with the USB cables connecting all the nodes to the USB hubs, and shows the copper cooling solution provisioned for heat management. The four USB hubs are then connected to an aggregate USB hub, which is connected to the host computer. Panel c) shows the front-view of the HiRIS sensor, where the component-less front-side is visible, with the exception of the holes for the bottom-mounted MEMS microphones. Panel d) shows the cluttered office space which has been ensonified during the active measurement experiment.
B. Real-world validation: Setup
The realized prototype of HiRIS can be seen in Figure 4. Panel a) shows the front-view of the sensor with the microphone port-holes and the copper slabs used for cooling. Panel b) shows the backside of the back-end PCB, with the USB cables connecting all the nodes to the USB hubs. These four USB hubs are then connected to an aggregate USB hub, which is connected to the host computer.
C. Real-world validation: Passive measurement
In order to validate the Point-Spread Function of the realized prototype, we performed a passive acoustic measurement using a 40-
Point-Spread functions for reduced aperture arrays. Panel a) shows the response of a regular
Figure 5 shows the effect of aperture size on the point-spread function. As expected from Fourier theory, the main lobe of the point-spread function becomes wider as the aperture size is reduced. Also, when the spatial sampling does not adhere to the spatial Nyquist theorem (which is the case in panel b), grating lobes occur (which is again expected behavior from Fourier theory). Finally, the dynamic range increases substantially with microphone count, as shown in panel c), which shows the HiRIS response. We calculated the −3 dB opening angles for the various arrays. For the smallest array in panel a), the opening angle is approximately 20
D. Active Measurements
As a final experiment, we performed an active measurement. In this case, the HiRIS sensor uses a Senscomp 7000 transducer [3], [5], [7], [25], [31], [35] to emit a broadband hyperbolic chirp. This chirp is generated by the DAC of the primary node and amplified using a custom high-voltage amplifier to a signal with an amplitude of 200
Experimental results of the HiRIS sensor. Panels a-c show the response of a 40-
Discussion and Future Work
In this paper, we have presented HiRIS, the High Resolution Imaging Sonar, a sonar sensor with 1024 microphones. This microphone array is, to the best of the authors' knowledge, the largest microphone array developed for ultrasound imaging in air at the time of writing. We detailed the hardware architecture of the HiRIS sensor, indicating design choices and potential pitfalls when reproducing the hardware system. We explained the reasoning behind these design choices, which can be used to inform future decisions when building similar hardware systems. Furthermore, we detailed the data-acquisition pipeline and signal processing approach, and tried to develop an intuition about the scales involved when dealing with a sensor of this complexity. We validated the operation of the system by first simulating the array responses of the HiRIS sensor and then comparing these simulated responses to real-life measurements. Furthermore, we performed an ensonification experiment of a cluttered office environment, and generated B-mode images of the resulting data streams.
With HiRIS, we have developed a novel sensor system which is a step change in the imaging capabilities of in-air sonar sensors, and which will allow virtually artifact-free imaging of real-world scenes. Therefore, we see HiRIS as a virtual upper limit of in-air sonar imaging: more complex sensors could indeed be implemented, but the industrial relevance of systems of this complexity can be debated. Evidently, the approach we have taken during the development of HiRIS is in stark contrast to the development of our eRTIS line of sensors [3], [17], for which component cost reduction was the major driving force. These sensors have been utilized in real-world applications under industrial constraints [38], [39], [40], [41], [42], which has led us to the ultimate question: what is the upper limit of ultrasound sensing that can be achieved, given the specular reflection model [2] under which the majority of ultrasound sensors operate? With HiRIS, we take the opposite approach and ask what upper limit can be achieved with in-air ultrasonic imaging given unrestricted sensing capabilities. The answer should in turn shed light on the validity of the specular reflection model, the relative importance of diffraction echoes, and how semantic information about the environment is translated into the ultrasonic sensing domain.
To conclude, we believe that the HiRIS sensor will allow us to uncover the underlying mechanics of in-air ultrasound sensing at a previously unattainable level of detail, which will then inform the development of future generations of 3D ultrasound sensors for industrial applications.
In future work, we aim to further quantify the performance of the HiRIS sensor, both in laboratory settings and in real-world measurements. We will produce high-resolution datasets, which will be made open-source for the sensing community to evaluate and use. Using these datasets, it should become possible to quantify how information-rich real-world ultrasound measurements really are, and how this information can be leveraged to provide robots with a rich understanding of their environments using ultrasound as a primary sensing modality. From these measurements, the effect of applying reduced-aperture microphone arrays instead of the large 1024-element array can be accurately calculated, as virtually any reduced aperture can be adequately simulated using the HiRIS array.
ACKNOWLEDGMENT
The authors would like to thank Christoph Haugwitz from TU Darmstadt for his insightful comments on technical aspects of this article.