wav2vec2 for Automatic Speech Recognition In Plain English

Too Long; Didn't Read

wav2vec2 is a leading machine-learning model for building automatic speech recognition (ASR) systems. It has three main components: a Feature Encoder, a Quantization Module, and a Transformer. The model is first pretrained on audio-only (unlabeled) data to learn basic speech units. It is then fine-tuned on labeled data, where those speech units are mapped to text.
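To make the last step a little more concrete, here is a minimal sketch of how per-frame predictions could be turned into text. It assumes the fine-tuned model is trained with a CTC objective, as in the published wav2vec 2.0 setup, and uses greedy decoding, which is the simplest option rather than the only one. The function name `greedy_ctc_decode`, the `_` blank symbol, and the scores are all made up for illustration; none of them come from a real library.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real vocabularies define their own


def greedy_ctc_decode(frame_scores, blank=BLANK):
    """Turn per-frame label scores into text with greedy CTC decoding."""
    # 1. Pick the highest-scoring label for every audio frame.
    best = [max(scores, key=scores.get) for scores in frame_scores]
    # 2. Merge consecutive repeats of the same label.
    merged = [label for label, _ in groupby(best)]
    # 3. Drop blanks, which only mark "no new character here".
    return "".join(label for label in merged if label != blank)


# Made-up scores for six audio frames over a tiny three-symbol vocabulary.
frames = [
    {"h": 0.9, "i": 0.05, "_": 0.05},
    {"h": 0.8, "i": 0.1, "_": 0.1},
    {"h": 0.1, "i": 0.2, "_": 0.7},
    {"h": 0.1, "i": 0.8, "_": 0.1},
    {"h": 0.0, "i": 0.9, "_": 0.1},
    {"h": 0.1, "i": 0.1, "_": 0.8},
]
print(greedy_ctc_decode(frames))  # prints "hi"
```

The blank symbol is what lets the decoder tell a genuinely doubled letter (two runs separated by a blank) from one letter that simply spans several frames.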

@pictureinthenoise

Picture in the Noise

Speech and language processing. At the end of the beginning.
