A dual-shutter vibration-sensing system developed by Carnegie Mellon University (CMU) researchers uses ordinary cameras to see sound vibrations with such precision that it can reconstruct the music made by a single instrument in a band.
The researchers say that even the best and most high-powered microphones can’t eliminate nearby sounds or ambient noise that might interfere with the audio they are meant to record. To get around this problem, the researchers approached it from a different angle: sight rather than sound.
As described by CMU, the system uses two cameras and a laser to sense high-speed, low-amplitude surface vibrations. Those vibrations are then used to reconstruct sound, which lets the researchers capture isolated audio without interference and without a microphone.
“We’ve invented a new way to see sound,” said Mark Sheinin, a post-doctoral research associate at the Illumination and Imaging Laboratory (ILIM) at the CMU Robotics Institute. “It’s a new type of camera system, a new imaging device, that is able to see something invisible to the naked eye.”
The team demonstrated the system’s effectiveness by capturing isolated audio of separate guitars playing at the same time and of individual speakers playing different music simultaneously.
They then analyzed the vibrations of a tuning fork and used the vibrations of a bag of Doritos near a speaker to capture the sound. The researchers say this unusual method for analyzing sound pays tribute to the work of the researchers who developed one of the first visual microphones in 2014, which used an algorithm to recover speech from the vibrations of a bag of potato chips.
“We’ve made the optical microphone much more practical and usable,” said Srinivasa Narasimhan, a professor at the Robotics Institute (RI) and head of ILIM. “We’ve made the quality better while bringing the cost down.”
The research team says that their system dramatically improves on previous attempts at computer-vision audio capture because it uses ordinary consumer cameras that cost a fraction of what the high-speed cameras used by previous researchers cost.
The system works by analyzing differences in “speckle patterns” from images captured with a rolling shutter and a global shutter. An algorithm computes the difference in the patterns from the two streams and converts those differences into vibrations, which are then used to reconstruct sound.
“A speckle pattern refers to the way coherent light behaves in space after it is reflected off a rough surface. The team creates the speckle pattern by aiming a laser at the surface of the object producing the vibrations, like the body of a guitar. That speckle pattern changes as the surface vibrates,” CMU explains. “A rolling shutter captures an image by rapidly scanning it, usually from top to bottom, producing the image by stacking one row of pixels on top of another. A global shutter captures an image in a single instance all at once.”
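To make the rolling-versus-global idea concrete, here is a toy sketch in Python. It is not the CMU team's algorithm. It assumes a one-dimensional speckle pattern that slides sideways by a whole number of pixels as the surface vibrates, a global-shutter frame that supplies one reference row, and a rolling-shutter frame in which each row is read out at a slightly later time. The function name `estimate_row_shifts` and every number below (row rate, tone frequency, amplitude) are hypothetical values chosen for illustration.

```python
import numpy as np


def estimate_row_shifts(rolling_frame, reference_row):
    """Estimate the integer circular shift of every row relative to reference_row.

    Uses FFT-based circular cross-correlation: the lag with the highest
    correlation is taken as that row's shift.
    """
    ref = reference_row - reference_row.mean()
    rows = rolling_frame - rolling_frame.mean(axis=1, keepdims=True)
    ref_fft = np.fft.fft(ref)
    rows_fft = np.fft.fft(rows, axis=1)
    corr = np.fft.ifft(rows_fft * np.conj(ref_fft), axis=1).real
    peak = np.argmax(corr, axis=1)
    width = reference_row.size
    # Peaks in the upper half of the lag axis correspond to negative shifts.
    return np.where(peak > width // 2, peak - width, peak)


# --- Toy simulation (all values are assumptions for illustration) ---
rng = np.random.default_rng(0)
width, rows = 256, 200
row_rate_hz = 20_000.0   # assumed rate at which rolling-shutter rows are read
tone_hz = 440.0          # assumed vibration frequency of the surface
amplitude_px = 6.0       # assumed peak speckle displacement in pixels

speckle = rng.random(width)              # random 1D stand-in for a speckle pattern
t = np.arange(rows) / row_rate_hz        # readout time of each rolling-shutter row
true_shifts = np.rint(amplitude_px * np.sin(2 * np.pi * tone_hz * t)).astype(int)

# Rolling shutter: each row sees the speckle at its own instant.
rolling_frame = np.stack([np.roll(speckle, s) for s in true_shifts])
# Global shutter: one snapshot, used here as the unshifted reference.
global_row = speckle

recovered = estimate_row_shifts(rolling_frame, global_row)
```

In this sketch, `recovered` is a vibration signal sampled once per row rather than once per frame. That is the intuition behind how a rolling shutter on an ordinary camera can stand in for a far faster one. A real system would also have to handle sub-pixel motion, noise, and two-dimensional speckle.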
The full research paper, titled “Dual-Shutter Optical Vibration Sensing,” can be read online, and CMU has set up a web page that plays sounds reconstructed by the system.
“If your car starts to make a weird sound, you know it’s time to have it looked at,” Sheinin says. “Now imagine a factory floor full of machines. Our system allows you to monitor the health of each one by sensing their vibrations with a single stationary camera.”
Image credits: Featured image licensed via Depositphotos.