Hackathon project for the GT Center for Music Technology x Synthux Hackathon, 2024
From the outside, the Auditory Mirror is an 8-foot tall by 4-foot wide box, draped in black fabric. It emanates an ominous ambience, beckoning the audience to step inside. Once inside, the user is enveloped in a rich soundscape and complete darkness. Although they can’t see anything, the Auditory Mirror can see them. As the user moves their hands and body, the sounds change and warp, responding to actions that the user cannot see themselves. The Auditory Mirror examines how we use our senses to interact with the world, and how that interaction informs our own perception of self.
The skeleton of the Auditory Mirror is a wooden frame with one detachable wall so the structure can be moved. Eight speakers, one mounted at each of the vertices surrounding the user's head, provide a fully immersive auditory experience. The detachable wall houses an array of 18 ultrasonic sensors – the "mirror". The sensor data is read by an ESP-32 microcontroller, which triangulates the position and velocity of the user's hand. This data is sent via UART to a Raspberry Pi running Pure Data, which handles the sound generation, responding to the user's motion and evolving over time. Using JACK, Pure Data outputs 8 individual audio streams to four Teensy microcontrollers fitted with the Teensy Audio Shield, which act as USB audio interfaces driving the speakers.
* * *
The ultrasonic sensors, which facilitate the interactive component of the mirror, work by bouncing ultrasonic pressure waves off of the user and using the delay of the returned waves to estimate distance. Each sensor receives a trigger pulse, after which it returns a pulse whose width corresponds to the distance to the nearest object within its sound beam.
To use the sensors to calculate position rather than just distance, I started by using just two sensors to triangulate the position of an object between them. Each sensor is only accurate within 30 degrees of its center axis, so the spacing between sensors had to be carefully calculated based on the predicted closest approach of the object. With two sensors this was a simple task, but generalizing the approach to three posed some challenges. Three-dimensional triangulation using the Pythagorean theorem was too computationally intensive and complicated to be feasible. Instead, I used a more numerical approach. I made an array of sensors, each with its own dynamically updated distance field and a fixed x and y position in the physical sensor array. Then, I took an average of each sensor's position, weighted by its distance, to approximate the true location of the object. This approach worked quite well, producing accurate results for an array of 4 sensors.
Generalizing the approach to 18 sensors seemed like a straightforward task, but in practice it brought about new challenges. With 18 sensors there was a great deal of interference caused by the simultaneous emission of sound waves, causing erratic and erroneous values to be reported. One possible solution would be to trigger the sensors in sequence rather than all at once, but that would sacrifice the time resolution of the array, since the sensors could no longer update position simultaneously. Instead, I implemented an averaging filter on the sensor outputs to smooth out the high-frequency jumps.
With the sensor output working consistently, the next step in the chain was to get this data to the Raspberry Pi, which was handling audio generation. This was done over a UART serial connection: the ESP-32, which interpreted the sensor data, sent formatted strings to the Raspberry Pi, where a Pure Data patch received the position data and used it to generate the soundscape for the piece.
Next, the audio from Pure Data had to be sent to the 8 speakers surrounding the user. This was accomplished by using four Teensy microcontrollers as USB audio devices, connected to the Raspberry Pi through JACK. Each output a stereo audio stream, which was amplified and sent to two speakers.
This audio solution was tenuous: we suffered from DAC timing synchronization issues, which led to popping and clicking in the audio playback. To mitigate this we swapped the Raspberry Pi for a Bela board, which has 8 analog outputs, removing the need for the Teensys. The new problem was getting the massive Pure Data patch compiled to C and running successfully on the Bela, as it initially exceeded 100% of both CPU and RAM usage.
For our most recent installation of the mirror we opted to use a Windows laptop and an audio interface to handle audio generation, which makes the system less self-contained but delivers a better audience experience overall.