In this talk, we explore how far interfering music, background noise, and the speech of other speakers can be reduced or removed from contemporaneous recordings made in the same acoustic environment, so as to bring the voice of the main speaker to the forefront. We have previously presented the application of acoustic fingerprinting and music signal cancellation to the forensic enhancement of audio containing speech masked by background music [Alexander and Forth, 2011]. In this work, we extend that approach, using landmark-based audio fingerprinting to align and subtract recordings made by two independent audio recorders. This has potentially significant applications in covert audio deployments and the subsequent enhancement of surveillance audio recordings.
We will thus present a novel application of audio fingerprinting and reference cancellation to forensic audio enhancement. In covertly acquired audio surveillance recordings, it is common to find the speech of interest masked by the speech of other non-target speakers in the room, or obscured by interfering music or television noise, as well as by other noises in the acoustic environment such as banging, clanging or slamming. These noises can drown out the speech of interest, making it difficult to understand and to transcribe accurately. When two recordings of the same acoustic event are available, two-channel adaptive filtering, or reference cancellation, is a highly effective tool for enhancement. However, the two recordings must first be aligned, and doing this manually, by eye or ear, is painstaking and difficult. We present an approach using landmark-based acoustic fingerprinting to identify, automatically align, and subtract reference sounds and bring the speech of interest to the forefront.
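The align-then-subtract idea can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes that landmark hashes have already been extracted from both recordings (for example with a landmark fingerprinting tool) as (hash, frame) pairs, and it uses a normalised LMS (NLMS) filter as one common choice of adaptive filter, since the abstract does not name a specific algorithm. The function names `estimate_offset`, `shift` and `nlms_cancel`, and the parameters `taps` and `mu`, are hypothetical.

```python
from collections import Counter


def estimate_offset(landmarks_a, landmarks_b):
    """Estimate the frame offset between two recordings by voting.

    Each landmark is a (hash, frame) pair. Every hash that occurs in both
    lists votes for the difference frame_b - frame_a; the most common
    difference is returned, or None if no hash matches.
    """
    index = {}
    for h, t_a in landmarks_a:
        index.setdefault(h, []).append(t_a)
    votes = Counter()
    for h, t_b in landmarks_b:
        for t_a in index.get(h, ()):
            votes[t_b - t_a] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def shift(signal, offset):
    """Delay `signal` by `offset` samples (advance if negative), same length."""
    n = len(signal)
    if offset >= 0:
        return ([0.0] * offset + list(signal))[:n]
    return (list(signal)[-offset:] + [0.0] * (-offset))[:n]


def nlms_cancel(primary, reference, taps=8, mu=0.5, eps=1e-8):
    """Remove from `primary` the component predictable from `reference`.

    Assumption: the two signals are already coarsely aligned, so the
    remaining delay fits within `taps` samples. Returns the error signal,
    which is the enhanced output.
    """
    w = [0.0] * taps
    out = []
    for n in range(len(primary)):
        # Latest `taps` reference samples, newest first, zero-padded at start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))  # estimate of interference
        e = primary[n] - y                        # residual: speech estimate
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        out.append(e)
    return out
```

In such a sketch, the frame offset returned by `estimate_offset` would be converted to samples (multiplying by the analysis hop size, itself an assumption of the fingerprinting front end) and passed to `shift` to align the reference coarsely, after which `nlms_cancel` absorbs the residual delay and level difference between the two recorders.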
A. Wang (2003). "An Industrial-Strength Audio Search Algorithm", Proc. ISMIR International Symposium on Music Information Retrieval, Baltimore, MD, Oct. 2003.
J. Benesty, D. Morgan and M. Sondhi (1997). "A better understanding and an improved solution to the problems of stereophonic acoustic echo cancellation", Proc. ICASSP '97, p. 303.
D. P. W. Ellis (2009). Robust Landmark-Based Audio Fingerprinting. http://labrosa.ee.columbia.edu/matlab/fingerprint/
A. Alexander and O. Forth (2011). "'No, thank you, for the music': An application of audio fingerprinting and automatic music signal cancellation for forensic audio enhancement", International Association of Forensic Phonetics and Acoustics Conference 2011, Vienna, Austria, July 2011.