Example of the system working with three microphones and one signal in a noise-free environment. Real DOA is shown with a green circle, at around 60 degrees.

Histogram of predicted DOAs. Most predictions fall close to the real value.

I built a system that estimates the direction of arrival (DOA) of audio signals in real time. It combines the JACK Audio Connection Kit with the Numba JIT compiler to achieve real-time audio processing in Python. This implementation matches the performance of JACK clients written in C or C++ with libraries such as Eigen and FFTW3, while keeping Python's ecosystem, interoperability and readability.

The system implements a modified version of the MUSIC (MUltiple SIgnal Classification) DOA estimation algorithm. Instead of using the traditional implementation, I scan several frequencies (those with the highest signal-to-noise ratio) and reduce the per-frequency results to find the most likely DOA for a series of complex audio signals.
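
The idea of scanning several frequencies and reducing the results can be illustrated with a minimal NumPy sketch. It is not the project's actual code: the function names, the triangular microphone layout, the speed of sound, the synthetic test data, and the choice of reduction (summing peak-normalized pseudo-spectra) are all assumptions made for illustration. The caller is assumed to have already picked the frequencies with the highest signal-to-noise ratio.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vectors(mic_xy, freq_hz, angles_deg):
    """Far-field steering vectors for a planar array, shape (n_mics, n_angles)."""
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    directions = np.stack([np.cos(theta), np.sin(theta)])  # (2, n_angles)
    delays = -(mic_xy @ directions) / SPEED_OF_SOUND        # (n_mics, n_angles)
    return np.exp(-2j * np.pi * freq_hz * delays)


def music_spectrum(snapshots, mic_xy, freq_hz, angles_deg, n_sources=1):
    """MUSIC pseudo-spectrum at one frequency.

    snapshots: complex array (n_mics, n_snapshots), e.g. one FFT bin over time.
    """
    n_mics = snapshots.shape[0]
    covariance = snapshots @ snapshots.conj().T / snapshots.shape[1]
    # eigh returns eigenvalues in ascending order, so the first
    # (n_mics - n_sources) eigenvectors span the noise subspace.
    _, eigvecs = np.linalg.eigh(covariance)
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    projection = noise_subspace.conj().T @ steering_vectors(mic_xy, freq_hz, angles_deg)
    denominator = np.sum(np.abs(projection) ** 2, axis=0)
    return 1.0 / np.maximum(denominator, 1e-12)


def multi_frequency_doa(snapshots_by_freq, mic_xy, angles_deg, n_sources=1):
    """Assumed reduction: sum the peak-normalized spectra of all frequencies."""
    angles_deg = np.asarray(angles_deg)
    combined = np.zeros(len(angles_deg))
    for freq_hz, snapshots in snapshots_by_freq.items():
        spectrum = music_spectrum(snapshots, mic_xy, freq_hz, angles_deg, n_sources)
        combined += spectrum / spectrum.max()
    return angles_deg[np.argmax(combined)], combined


# Synthetic check: three microphones (10 cm equilateral triangle), one source.
rng = np.random.default_rng(0)
mic_xy = np.array([[0.0, 0.0], [0.1, 0.0], [0.05, 0.0866]])
angles = np.arange(-180, 180)
true_doa = 60.0

snapshots_by_freq = {}
for freq_hz in (500.0, 800.0, 1100.0):
    source = rng.standard_normal(200) + 1j * rng.standard_normal(200)
    noise = 0.01 * (rng.standard_normal((3, 200)) + 1j * rng.standard_normal((3, 200)))
    snapshots_by_freq[freq_hz] = steering_vectors(mic_xy, freq_hz, [true_doa]) * source + noise

estimate, _ = multi_frequency_doa(snapshots_by_freq, mic_xy, angles)
print("estimated DOA (degrees):", estimate)
```

On this nearly noise-free synthetic input the estimate should land close to the simulated 60 degrees. Other reductions (for example a product or a median across frequencies) would fit the same structure.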

While noisy environments degrade the system's performance, its accuracy can still be considered acceptable. Development and testing were based on the AIRA corpus for robotic audition from IIMAS.

Example of the system working with three microphones and one signal in a noisy and reverberant environment. Real DOA is shown with a green circle, at around 0 degrees.

Histogram of predicted DOA. The most frequent predictions are still around the real value.

The MUSIC algorithm requires prior knowledge of the number of signals of interest, and estimating this number in real time is still an active area of research in audio processing. Because of this, when two signals are present, the system usually detects the DOA of a single dominant signal.

Example of the system working with three microphones and two signals in a noise-free environment. Real DOAs are shown with green circles, at around -30 and 90 degrees. As can be seen, the system usually alternates between the two real DOAs for this recording.

Histogram of predicted DOAs. The most frequent predictions are around the two real values.

Even with its current shortcomings as a proof of concept, the system produces usable results in many different scenarios. Both the noise in its outputs and the handling of multiple signals could be addressed with methods such as particle or Kalman filters, which act as a post-filtering stage rather than a direct modification of the system.
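
As a rough illustration of that post-filtering idea, here is a minimal sketch of a scalar Kalman filter that smooths a stream of single-source DOA estimates. It is not part of the project: the class name, the random-walk motion model, and the two variance values are assumptions chosen for the example, and tracking two sources at once would need more than this (for instance one filter per source plus data association, or a particle filter).

```python
import numpy as np


def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


class DoaKalmanFilter:
    """Scalar Kalman filter with a random-walk model for one DOA (assumed model)."""

    def __init__(self, process_var=1.0, measurement_var=100.0):
        self.process_var = process_var          # how fast the source may move
        self.measurement_var = measurement_var  # how noisy each raw estimate is
        self.estimate = None
        self.variance = None

    def update(self, measured_deg):
        if self.estimate is None:
            # Initialize from the first measurement.
            self.estimate = wrap_deg(measured_deg)
            self.variance = self.measurement_var
            return self.estimate
        # Predict: the angle stays put, the uncertainty grows.
        predicted_var = self.variance + self.process_var
        # Update: wrap the innovation so 179 and -179 count as 2 degrees apart.
        innovation = wrap_deg(measured_deg - self.estimate)
        gain = predicted_var / (predicted_var + self.measurement_var)
        self.estimate = wrap_deg(self.estimate + gain * innovation)
        self.variance = (1.0 - gain) * predicted_var
        return self.estimate


# Synthetic raw estimates: a static source at 60 degrees with 10 degrees of noise.
rng = np.random.default_rng(1)
raw = 60.0 + 10.0 * rng.standard_normal(300)

kalman = DoaKalmanFilter()
smoothed = np.array([kalman.update(value) for value in raw])

print("raw spread (std):     ", np.std(raw[100:]))
print("smoothed spread (std):", np.std(smoothed[100:]))
```

A larger `measurement_var` relative to `process_var` gives smoother but slower-reacting output, so both values would need tuning against real recordings.
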
The source code for this project can be found here.
