SciELO - Scientific Electronic Library Online

 


Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

MALDONADO, Alejandro; RASCON, Caleb and VELEZ, Ivette. Lightweight Online Separation of the Sound Source of Interest through BLSTM-Based Binary Masking. Comp. y Sist. [online]. 2020, vol.24, n.3, pp.1257-1270. Epub 09-Jun-2021. ISSN 2007-9737. https://doi.org/10.13053/cys-24-3-3485.

Online audio source separation has been an important part of auditory scene analysis and robot audition. Because it can run online, spatial filtering (or beamforming) has been the main technique for this task; it assumes that the location (mainly the direction of arrival, DOA) of the source of interest (SOI) is known. However, these techniques suffer from considerable interference leakage in the final result. In this paper, we propose a two-step technique: 1) a phase-based beamformer that provides, in addition to an estimate of the SOI, an estimate of the cumulative environmental interference; and 2) a BLSTM-based time-frequency (TF) binary masking stage that computes a binary mask intended to separate the SOI from the cumulative environmental interference. In our tests, this technique provides a signal-to-interference ratio (SIR) above 20 dB with simulated data. Because of the nature of the beamformer outputs, the label permutation problem is handled from the beginning. This makes the proposed solution a lightweight alternative that requires considerably fewer computational resources (almost an order of magnitude fewer) than current deep-learning-based techniques, while providing comparable SIR performance.

Keywords: Beamforming; BLSTM; permutation problem; binary mask.
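
As a rough illustration of the TF binary masking idea described in the abstract, the following Python sketch builds a binary mask from two magnitude estimates and measures the SIR before and after masking. It is a minimal sketch under stated assumptions and not the authors' method: the paper estimates the mask with a BLSTM fed by the beamformer outputs, whereas here the mask is an oracle magnitude comparison on synthetic TF coefficients. The helper names (`binary_mask`, `sir_db`, `sparse_source`) and all parameter values are hypothetical.

```python
import numpy as np


def binary_mask(soi_mag, interf_mag):
    """Return 1.0 in TF bins where the SOI magnitude exceeds the interference, else 0.0."""
    return (soi_mag > interf_mag).astype(float)


def sir_db(target, interference, eps=1e-12):
    """Signal-to-interference ratio in dB, computed from total energies."""
    target_energy = np.sum(np.abs(target) ** 2)
    interf_energy = np.sum(np.abs(interference) ** 2)
    return 10.0 * np.log10((target_energy + eps) / (interf_energy + eps))


def sparse_source(rng, shape, activity):
    """Synthetic complex TF coefficients, nonzero in a random fraction of bins."""
    active = rng.random(shape) < activity
    coeffs = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return active * coeffs


# Assumption: 64 frequency bins x 100 frames of synthetic data, not real audio.
rng = np.random.default_rng(0)
shape = (64, 100)
soi = sparse_source(rng, shape, activity=0.3)
interf = sparse_source(rng, shape, activity=0.3)

# Oracle mask (stand-in for the BLSTM output): keep bins where the SOI dominates.
mask = binary_mask(np.abs(soi), np.abs(interf))

# Masking is linear, so it can be applied to each component separately
# to see how much of each survives.
sir_before = sir_db(soi, interf)
sir_after = sir_db(mask * soi, mask * interf)

print(f"SIR before masking: {sir_before:.1f} dB")
print(f"SIR after masking:  {sir_after:.1f} dB")
```

Because the mask only keeps bins in which the SOI energy exceeds the interference energy, the SIR after masking cannot be lower than before in this oracle setting. With an estimated mask, as in the paper, the improvement depends on how accurately the mask is predicted.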
