+++
title = 'USM / BlueIQ - AI Sonar Buoys'
date = 2025-03-17T14:00:00-05:00
draft = false
categories = ['USM', 'BlueIQ', "Acoustics"]
tags = ['data science', 'KarstTech', 'sonar', 'TDOA']
+++

{{< katex >}}

# Summary of "Enhancing Maritime Domain Awareness Through AI-Enabled Acoustic Buoys"

This paper demonstrates how AI-enabled acoustic buoys can significantly enhance maritime domain awareness through real-time detection and tracking of vessels, particularly fast-moving boats. The research was conducted by a team from the University of Southern Mississippi's Roger F. Wicker Center for Ocean Enterprise in collaboration with BLUEiQ. I was the lead author on the paper, though it was very much a team effort; my focus was primarily on the development of the off-buoy data processing and target localization techniques. This work is covered in more detail in the published paper, which can be found [here](https://www.mdpi.com/3232664). The paper also includes a link to [source code](https://github.com/jeremykarst/acoustic_multilateration) for the TDOA localization algorithms discussed.

## Solving a Maritime Domain Awareness Problem

There are endless reasons why you might want to detect and track vessels near the shoreline, from collision avoidance in poor weather to monitoring for illegal activity, or even just keeping track of commercial shipping traffic. In many countries vessels are required to carry a transponder, which helps to locate them, and radios to send and receive communications, but just because these are required doesn't mean they are always used. Tracking vessels by independent means allows correlation with transponder data, which can make vessels with disabled transponders trivial to find. Locating them acoustically allows cost-effective sensing and provides advantages over traditional sensors like cameras or radar.


The research team deployed three OpenEar™ sensors in a triangular formation off the coast of Gulfport, MS. These low-power, AI-enabled acoustic sensing platforms were designed for autonomous maritime surveillance with several distinguishing features:

- Hydrophones with sensitivity of -160 dB re 1 V/µPa and frequency response range of 10 Hz to 200 kHz
- Low power consumption supported by solar and battery for extended deployment
- Onboard AI processing that consumes only 2 watts of power
- Edge-based machine learning for vessel detection and classification
- Very low deployment cost

## The Challenge

To keep the sensor hardware very low cost and low power, the team had to develop new algorithms for vessel detection and localization. The goal was to create a system that could detect vessels in real time, even in difficult acoustic environments like those near the shoreline, where multipath propagation and biological noise sources like snapping shrimp cause problems for traditional sonar systems. These systems need to filter through a vast amount of data and transmit only the most important information back to a central processor, to conserve power and limit bandwidth costs.



### AI Model Development

I will only briefly cover the AI model development here, since my deep involvement was in the off-buoy localization portion of the project. The AI model was crucial to the project's success because it performed the initial filtering of the acoustic data, recognizing vessel sound signatures and triggering transmission of the recorded data to a central location for further processing.

The vessel detection AI model was developed using:
- A lightweight convolutional neural network optimized for low-power edge hardware
- Edge Impulse platform for training and optimization
- Mel-Frequency Cepstral Coefficients (MFCCs) as input features
- Nearly six hours of recorded data, including various high-speed boats

The model achieved impressive performance metrics:
- 98% precision, recall, and F1 score
- Real-time inference capability with just 4 ms latency
- Memory footprint of less than 130 KB of RAM

## Localization

Once data was received from the buoys, Time Difference of Arrival (TDOA) was used to localize the vessel. TDOA refers to a family of techniques that use the difference in arrival times of a signal at two or more sensors to determine the location of the source. The trick to getting good results is measuring the time difference in a way that is robust to noise and works for a wide range of sources.

I will discuss how the filtering and TDOA measurement work in a moment, but first let's look at the basic idea of TDOA. If we assume that our sensors and our target are confined to a flat plane, then we can use the following equations to describe the relationship between the target and the sensors:

$$
D_0 = \sqrt{(x-M_{0.x})^2 + (y-M_{0.y})^2}
$$
$$
D_1 = \sqrt{(x-M_{1.x})^2 + (y-M_{1.y})^2}
$$
$$
D_2 = \sqrt{(x-M_{2.x})^2 + (y-M_{2.y})^2}
$$
$$
D_0 - D_1 = c \cdot \Delta t_{01}
$$
$$
D_1 - D_2 = c \cdot \Delta t_{12}
$$

- \\( x, y \\): Estimated target location in local coordinates
- \\( M_{i.x}, M_{i.y} \\): Position of microphone \\( i \\)
- \\( \Delta t_{ij} \\): Time difference of arrival between microphones \\( i \\) and \\( j \\)
- \\( c \\): Speed of sound in water

These equations imply a set of hyperbolas, one for each pair of microphones, and in theory our target lies at the intersection of these hyperbolas. There are a few problems with this. For one, the system of equations is overdetermined: there are more equations than unknowns, so we can't solve for the target location directly, and visually the hyperbolas don't necessarily intersect at a single point. Additionally, some measured TDOAs correspond to targets that are infinitely far away and would never have been detected in the first place.
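
To make the geometry concrete, here is a small sketch of the forward model in NumPy. The sensor coordinates and sound speed below are made-up illustrative values, not the actual Gulfport deployment geometry:

```python
import numpy as np

# Hypothetical sensor layout in local coordinates (metres) -- NOT the
# actual deployment geometry from the paper.
mics = np.array([[0.0, 0.0], [500.0, 0.0], [250.0, 433.0]])
c = 1500.0  # nominal speed of sound in seawater, m/s

def predicted_tdoas(pos):
    """Forward model: pairwise range differences converted to seconds."""
    d = np.linalg.norm(mics - pos, axis=1)  # D_0, D_1, D_2
    return np.array([d[0] - d[1], d[1] - d[2], d[0] - d[2]]) / c

# Three equations (one per microphone pair) but only two unknowns (x, y):
# with noisy measurements there is generally no exact solution.
dt = predicted_tdoas(np.array([800.0, 300.0]))
```

Note that only two of the three pairwise TDOAs are independent: by construction, \\( \Delta t_{01} + \Delta t_{12} = \Delta t_{02} \\), which is part of why the noisy system is overdetermined rather than simply solvable.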

### Solving a Messy Problem

Before we can start addressing the problem of overdetermined equations, we first need to actually measure our time differences of arrival! This involves picking out the target signal in the recorded data and lining it up with a similar signal in the other recordings. This is a bit of an art, and there are a few different ways to do it.

Here is a spectrogram of one test run with a vessel passing by. Each row of subplots corresponds to a different microphone. The first column is the waveform of the signal, which is only included to show how different the same noise source appears in each recording. The middle column is a full bandwidth spectrogram, and the central green / yellow region on this spectrogram corresponds to the vessel passing close to the microphone array. The right column is a "zoomed-in" low frequency spectrogram of the same signals, and the horizontal red lines show the characteristic engine noise of the vessel.

It is important to line these signals up as accurately as possible, because even a small misalignment of a few samples can cause huge positional errors. If our signals were narrowband at a well-defined frequency, we could use tools like Doppler shift and bandpass filters to fight the noise and get more accurate measurements, but our signals are broadband, at unknown frequencies that change over time as engine RPM changes. If our microphones were very close together, we could use techniques like the MUSIC algorithm, or measure phase differences between microphones, to get an accurate TDOA; but since our microphones are far apart, they are well beyond the distance at which our audio signals are coherent (where the crests of the lower-frequency waves would arrive nearly in phase at each microphone). One of the few techniques we do have at our disposal is the Generalized Cross-Correlation (GCC) family of algorithms, which are robust to noise and can operate on broadband signals.

The GCC-PHAT algorithm transforms our time-domain signals into the frequency domain, normalizes the cross-spectrum so that only phase information remains, and then uses the inverse transform to find the maximum-likelihood time delay. In essence, this measures how similar the signals are in each frequency bin over time, making even frequencies that are too high to be coherent useful to us, by correlating their long time-scale changes, which happen over periods much longer than the equivalent time separation of the microphones at the speed of sound.
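
As a rough illustration (not the project's exact implementation), a minimal GCC-PHAT in NumPy looks like this. The `max_tau` parameter restricts the search to physically possible delays:

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` (seconds) via GCC-PHAT."""
    n = len(sig) + len(ref)                 # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-15                  # phase transform: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:                 # limit to physically possible lags
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs, cc
```

The whitening step is what makes the correlation peak sharp for broadband sources: every frequency bin contributes equally, regardless of the source's (unknown, changing) spectrum.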

In the plot below, we have three correlation subplots, one for each microphone pair. The orange dashed vertical lines mark the maximum possible time delay between the two signals, based on the distance between the microphones and the speed of sound; we only search for a correlation maximum within this range. The red X marks the most likely TDOA. We also estimate the noise floor of the correlation and the height of the second-highest peak to estimate the uncertainty of our TDOA: the higher the main peak, and the lower the second-highest peak relative to the noise floor, the more confident we are that our TDOA is accurate.
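
The exact confidence metric lives in the source code; a simplified sketch of the idea (main peak height versus the runner-up peak and the noise floor) might look like:

```python
import numpy as np

def tdoa_confidence(cc):
    """Heuristic confidence in [0, 1] for a correlation trace.
    Illustrative only; the project's actual metric may differ."""
    cc = np.abs(cc)
    i = int(np.argmax(cc))
    noise = np.median(cc)                  # rough noise-floor estimate
    masked = cc.copy()
    lo, hi = max(0, i - 5), min(len(cc), i + 6)
    masked[lo:hi] = 0.0                    # suppress the main peak's lobe
    second = masked.max()                  # runner-up peak
    return (cc[i] - second) / (cc[i] - noise + 1e-15)
```

A single dominant peak scores near 1; two comparable peaks (an ambiguous correlation) score near 0.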

### Filtering the Results

Now that we have some measured TDOAs, we can use them to try to locate targets! Let's have a look at the TDOAs calculated from one of the test runs as the vessel moves over time, using half-second windows with a 0.5 second overlap for each point. In the plot below, the solid colored lines are the known TDOAs calculated from GPS ground truth, and the scatter points show the TDOAs we calculated from the recorded audio. Early in the run, when the vessel is far away, the measured TDOAs are very noisy and not near the ground truth. As the vessel gets closer, the TDOAs converge to the ground truth, though they remain fuzzy. The point where the ground-truth curves suddenly change is where the vessel passes between the buoys, and here the TDOAs once again become noisy, due both to the sudden change and, more importantly, because the broadband noise of the vessel becomes very strong, random, and hard to correlate.

We need a strategy to filter out these noisy TDOAs and keep only the good ones. We also need to decide when the TDOAs are good enough to use for localization and when they are too random or low confidence to use.

For filtering, we use a combination of confidence and statistical outlier metrics, called "RobustCovariance" in the source code. This uses the [Mahalanobis distance](https://en.wikipedia.org/wiki/Mahalanobis_distance) to determine whether a point is an outlier; it measures how far a point is from the mean of the distribution in units of standard deviations.
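
The real filter uses a robust covariance estimate (hence the name); as a simplified sketch of the underlying idea, using a plain sample covariance:

```python
import numpy as np

def mahalanobis_filter(points, threshold=3.0):
    """Flag points far from the bulk of the distribution as outliers.
    Sketch only: the repo's "RobustCovariance" filter uses a robust
    covariance estimate rather than this plain sample covariance."""
    mu = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    inv = np.linalg.inv(cov)
    diff = points - mu
    d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)  # squared Mahalanobis
    return np.sqrt(d2) < threshold                  # True = keep (inlier)
```

The advantage of Mahalanobis distance over a plain Euclidean cutoff is that it accounts for the scale and correlation of the TDOA cloud, so "3 standard deviations" means the same thing in every direction.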

After filtering, the TDOAs look like this:

### Solving the Overdetermined Problem

I won't go into great detail here about how the TDOAs are turned into locations, especially because the [source code](https://github.com/jeremykarst/acoustic_multilateration) is shared! We use a least-squares method with an out-of-the-box optimizer from the scipy library, augmented with several possible initial guesses to help it converge.
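
A sketch of that approach, with made-up array geometry rather than the deployed one:

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical array geometry (metres) and sound speed -- illustrative only.
mics = np.array([[0.0, 0.0], [500.0, 0.0], [250.0, 433.0]])
c = 1500.0
pairs = [(0, 1), (1, 2), (0, 2)]

def residuals(pos, measured):
    """Modelled minus measured TDOAs for a candidate position (x, y)."""
    d = np.linalg.norm(mics - pos, axis=1)
    model = np.array([d[i] - d[j] for i, j in pairs]) / c
    return model - measured

# Synthesize noiseless measurements for a known target, then recover it.
target = np.array([800.0, 300.0])
d = np.linalg.norm(mics - target, axis=1)
measured = np.array([d[i] - d[j] for i, j in pairs]) / c

# Several initial guesses guard against local minima; keep the best fit.
guesses = [mics.mean(axis=0), np.array([600.0, 400.0]), np.array([-300.0, 100.0])]
best = min((least_squares(residuals, g, args=(measured,)) for g in guesses),
           key=lambda r: r.cost)
```

Least squares handles the overdetermination naturally: instead of demanding that all hyperbolas intersect, it finds the point that minimizes the total squared TDOA mismatch.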

Here is the initial localization result:

From the localization implied by the previous TDOA plot, we can see the same behavior: noisy data early on when the vessel is far away, converging toward the ground truth as the vessel gets closer. By instead representing the localization as angle and distance predictions, we can see that the angle predictions are grouped much more closely around the ground truth, while the distance predictions are less accurate, especially at longer ranges. This is an important observation, and it should be intuitive from the geometry of the implied hyperbolas: as a vessel gets farther away, its position subtends a smaller and smaller angle at the microphone array, so a small angular error can correspond to a HUGE distance error.
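
Converting a fix to bearing and range from the array centre is a one-liner, and makes that error structure explicit (function and variable names here are illustrative, not from the repo):

```python
import numpy as np

def to_bearing_range(pos, origin):
    """Convert an (x, y) fix to bearing (degrees CCW from +x) and range."""
    v = np.asarray(pos, dtype=float) - np.asarray(origin, dtype=float)
    return np.degrees(np.arctan2(v[1], v[0])), float(np.hypot(*v))

# Cross-range uncertainty grows with range: arc length = range * angular
# error (radians), so a fixed bearing error means a growing position error.
```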

We can also see that our TDOA confidence metric is useful, if noisier than we would like. By smoothing and thresholding it, we can identify regions of higher-confidence TDOAs, which lets us determine if and where the data is trustworthy.

### More Filtering!

We can use a custom [Kalman Filter](https://en.wikipedia.org/wiki/Kalman_filter) to exploit knowledge of how the vessel can physically move (inertia and speed limits), and the fact that the angular data is more trustworthy than the distance data, to filter through the noise and get a much better estimate of the target trajectory.
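
The project's filter is custom, but the core machinery is the standard predict/update cycle. A minimal constant-velocity sketch (all matrices below are assumed values, not the paper's tuning); the trust-bearing-over-range idea can be encoded by giving `R` an ellipse elongated along the range direction:

```python
import numpy as np

dt = 0.5  # update interval (s); matches the half-second TDOA windows
# Constant-velocity model: state = [x, y, vx, vy]
F = np.eye(4); F[0, 2] = F[1, 3] = dt            # state transition
Q = np.diag([0.1, 0.1, 1.0, 1.0])                # process noise (assumed)
H = np.hstack([np.eye(2), np.zeros((2, 2))])     # we observe position only

def kf_step(x, P, z, R):
    """One predict/update cycle. R is the per-measurement covariance, so a
    bearing-accurate, range-noisy fix gets an elongated R ellipse."""
    x = F @ x                        # predict state forward
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x = x + K @ (z - H @ x)          # correct with the measurement
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

Because the state carries velocity, the filter rejects fixes that would require physically implausible accelerations, which is exactly the failure mode of the raw long-range localizations.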

## Conclusion

While it is still far from perfect, this is a very usable data source and method for acoustically tracking vessels from a low-cost and low-power platform!

There are a number of improvements that could still be made to the filtering stages, but since BlueIQ is working on a new generation of buoys, that work will wait until our next round of testing.