Nonlinear Statistical Language Modeling

CIS Colloquium, Sep 02, 2009, 01:00PM – 02:00PM, Wachman 447

Joseph Picone, Chair, Temple ECE

Statistical and machine-learning techniques, such as hidden Markov models and Gaussian mixture models, have dominated the signal processing and pattern recognition literature for the past 25 years. However, such approaches are prone to overfitting and often generalize poorly. For example, delivering high performance on previously unseen noise conditions remains an elusive goal.

In this presentation, we will review our recent work on applying principles of nonlinear statistical modeling to acoustic modeling in speech recognition. Our goal is to improve recognition performance in noisy environments. We will discuss the use of an extended feature vector containing features based on correlation dimension, correlation entropy and Lyapunov exponents. We will also introduce a new acoustic model based on a probabilistic mixture of autoregressive models.

Experimental results are presented on the Aurora IV large vocabulary speech recognition task, in which audio data from a variety of actual noise conditions were digitally added to the standard Wall Street Journal 5K closed-vocabulary task. We will show that modest gains in performance can be achieved under matched conditions, but that performance degrades under mismatched training conditions.

Joseph Picone received his Ph.D. in Electrical Engineering in 1983 from the Illinois Institute of Technology. He is currently a Professor and Chair of the Department of Electrical and Computer Engineering at Temple University. His research currently focuses on machine learning approaches to acoustic modeling in speech recognition. For over 25 years he has conducted research on many aspects of digital speech and signal processing. He has also been a long-term advocate of open source technology, delivering one of the first state-of-the-art open source speech recognition systems, and maintaining one of the more comprehensive web sites related to signal processing.