An Event-Based Cochlear Filter Temporal Encoding Scheme for Speech Signals

Title:
An Event-Based Cochlear Filter Temporal Encoding Scheme for Speech Signals
Other Titles:
The International Joint Conference on Neural Networks (IJCNN)
Publication Date:
08 July 2018
Abstract:
Spiking Neural Networks (SNNs), the third generation of neural networks, have been shown to perform well in pattern recognition tasks involving temporal information, such as speech recognition and motion detection. However, most neural networks for speech recognition, including SNNs, rely on short-time frequency analysis, such as mel-frequency cepstral coefficients (MFCC), for low-level feature extraction. MFCC feature extraction analyzes a window of the time signal in multiple frequency bands, one window at a time, in a synchronous fashion. This contrasts with the event-based principle of the SNN, whereby electrical impulses are emitted and processed asynchronously. Just as speech signals arrive at the human cochlear filterbank concurrently, yet spikes encoding the power in each frequency band are emitted asynchronously, we propose an event-based cochlear filter encoding scheme in which the power in each frequency band is extracted directly in the time domain and spikes encoded with a latency code are emitted asynchronously to represent that power. This replaces the traditional MFCC frontend used in most speech recognition models and makes possible an end-to-end event-based SNN implementation for a speech recognition task. The proposed event-based neural encoding is not only biologically plausible but also outperforms the MFCC as an encoding frontend for an SNN classifier in a speech recognition task, achieving higher classification accuracy and lower latency. Such an end-to-end SNN model could be implemented on a neuromorphic chip to fully realize the advantages of event-based processing.
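The encoding pipeline the abstract describes (a cochlear-style filterbank, band power measured directly in the time domain, and a latency code in which stronger bands spike earlier) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Butterworth filter design, the log-power normalization, the `t_max` window, and all parameter names are assumptions introduced for this sketch.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def latency_encode(signal, fs, bands, t_max=0.010, eps=1e-12):
    """Sketch of an event-based cochlear filter encoding:
    split the signal into frequency bands, measure each band's power
    in the time domain, and map power to a time-to-first-spike
    latency, so that stronger bands emit spikes earlier.

    bands : list of (low_hz, high_hz) passbands (illustrative stand-in
            for a cochlear filterbank)
    t_max : latest allowed spike time in seconds (assumed parameter)
    """
    powers = []
    for lo, hi in bands:
        # Band-pass filter standing in for one cochlear channel.
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        y = sosfilt(sos, signal)
        powers.append(np.mean(y ** 2))  # time-domain band power
    log_p = np.log(np.asarray(powers) + eps)
    # Normalize log-power to [0, 1]; strongest band maps to 1.
    norm = (log_p - log_p.min()) / (np.ptp(log_p) + eps)
    # Latency code: higher power -> earlier spike.
    return t_max * (1.0 - norm)  # one spike time per band
```

Each band then contributes a single asynchronous event at its computed latency, which is what allows the frontend to feed an SNN classifier without frame-synchronous windowing.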
Funding Info:
This research is supported by Programmatic grant no. A1687b0033 from the Singapore government's Research, Innovation and Enterprise 2020 plan (Advanced Manufacturing and Engineering domain).
