Semi-supervised end-to-end speech recognition using text-to-speech and autoencoders

Case ID:

C15791

Unmet Need

The speech and voice recognition market is projected to reach $25 billion, growing at a 20% compound annual growth rate (CAGR) through 2025. Software that converts between speech and text quickly and with a low error rate is therefore of great interest. Specifically, there is a need for speech and text encoders that can be trained on speech-only or text-only datasets, without requiring parallel speech and text data.

Technology Overview

The inventors have developed speech and text autoencoders that share encoders and decoders with an automatic speech recognition (ASR) model, so that large speech-only and text-only training datasets can be used to improve ASR performance. In their experiments on the LibriSpeech corpus, a model was first trained on a small paired subset and then retrained with a large unpaired subset. This semi-supervised end-to-end ASR training reduced the character error rate from 10.4% to 8.4% and the word error rate from 20.6% to 18.0%.
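
For readers unfamiliar with the two metrics, the following is a minimal sketch of how character and word error rates are commonly computed, using the standard edit-distance definition. It is not the inventors' evaluation code; the function names and the sample transcripts are assumptions made for illustration. The last two lines only restate the reported figures as relative reductions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are characters, spaces included."""
    return error_rate(list(reference), list(hypothesis))


def relative_reduction(before, after):
    """Fraction of the original error removed."""
    return (before - after) / before


# Hypothetical transcripts, not taken from the experiments.
print(wer("the cat sat on the mat", "the cat sat on a mat"))
print(cer("the cat sat on the mat", "the cat sat on a mat"))

# Reported figures expressed as relative reductions, in percent.
print(round(100 * relative_reduction(10.4, 8.4), 1))   # 19.2 (character error rate)
print(round(100 * relative_reduction(20.6, 18.0), 1))  # 12.6 (word error rate)
```

Under these definitions, the reported improvements correspond to relative reductions of roughly 19% in character error rate and 13% in word error rate.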

Stage of Development

Proof of concept testing has been completed.

Patent Information:

Title	App Type	Country	Serial No.	Patent No.	File Date	Issued Date	Expire Date	Patent Status
TRAINING APPARATUS, TRAINING METHOD, AND TRAINING PROGRAM	ORD: Ordinary Utility	Japan	2019-159952		9/2/2019			Pending

Direct Link:

https://jhu.technologypublisher.com/technology/38587

Inventors:

Category(s):

Technology Classifications > Computers, Electronics & Software > Algorithms
Technology Classifications > Computers, Electronics & Software > Artificial Intelligence
Technology Classifications > Computers, Electronics & Software > Machine Learning

For Information, Contact:

Lisa Schwier

lschwie2@jhu.edu

410-614-0300
