Speaker-Adaptive Speech Recognition


However, one indirect application of identification is speaker-adaptive speech recognition, in which speech from an unknown speaker is matched to the most similar-sounding speaker already trained on the speech recognizer. Other potential identification applications include intelligent answering machines with personalized caller greetings and automatic speaker labeling of recorded meetings for speaker-dependent audio indexing. Verification forms the basis for most speaker-recognition applications. Current applications such as computer log-in, telephone banking, calling cards, and cellular-telephone fraud prevention substitute …

In identification, the goal is to determine which voice in a known group of voices best matches the speaker. In speaker verification, the goal is to determine whether the speaker is who he or she claims to be. The aim of this paper is to use a Gaussian mixture speaker model (GMM) to improve the performance of automatic speaker recognition by reducing the error rate (ERR) between the real claimant and an imposter.

1.4 LIMITATION

Many challenging problems and limitations remain to be overcome in recognizing a speaker or in distinguishing the true claimant from an imposter.

i. In speaker identification, the unknown voice is assumed to come from a predefined set of known speakers.
ii. The difficulty of identification generally increases as the speaker set (or speaker population) grows.

Applications of pure identification are generally unlikely in real situations because they involve only speakers known to the system, called enrolled speakers.

2 LITERATURE …

In 1976, Texas Instruments built a prototype system that was tested by the U.S. Air Force and The MITRE Corporation. In the mid-1980s, the National Institute of Standards and Technology (NIST) developed the NIST Speech Group to study and promote the use of speech processing techniques. Since 1996, under funding from the National Security Agency, the NIST Speech Group has hosted yearly evaluations, the NIST Speaker Recognition Evaluation Workshop, to foster the continued advancement of the speaker recognition community.

3 METHODOLOGY

Most automatic speaker-recognition systems rely upon spectral differences to discriminate speakers. Natural speech is not simply a concatenation of sounds. Instead, it is a blending of different sounds, often with no distinct boundaries between transitions.

Fig 2 Schematic diagram of vector feature extraction

To obtain steady-state measurements of the spectra from continuous speech, we perform short-time spectral analysis, which involves several processing steps, as shown in Figure 2:

1. The speech is segmented into frames by a 20-msec window progressing at a 10-msec frame rate.
2. A speech activity detector is then used to discard silence and noise frames.
3. For text-independent speaker recognition, removing silence and noise frames from the training and testing signals is important in order to
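
The framing and speech-activity steps described above can be illustrated with a minimal Python sketch. It is not the essay's own implementation: the function names `frame_signal` and `drop_low_energy_frames`, the Hamming window, and the simple energy threshold (`threshold_db`) are assumptions chosen for illustration. Only the 20-msec window and the 10-msec frame step come from the text.

```python
import numpy as np


def frame_signal(signal, sample_rate, win_ms=20.0, hop_ms=10.0):
    """Cut a 1-D signal into overlapping frames.

    win_ms and hop_ms follow the 20-msec window and 10-msec step in the text.
    The Hamming taper is an assumption; the text does not name a window shape.
    """
    signal = np.asarray(signal, dtype=float)
    win = int(round(sample_rate * win_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    if len(signal) < win:
        return np.empty((0, win))
    n_frames = 1 + (len(signal) - win) // hop
    # Row i holds the sample indices of frame i: [i*hop, i*hop + win).
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(win)


def drop_low_energy_frames(frames, threshold_db=-40.0):
    """Toy speech activity detector (assumed, energy-based).

    Keeps frames whose energy is within threshold_db of the loudest frame.
    """
    if frames.shape[0] == 0:
        return frames
    energy = np.sum(frames ** 2, axis=1)
    peak = energy.max()
    if peak == 0.0:
        return frames[:0]  # everything is silent
    # Floor the ratio so that log10 never sees zero.
    energy_db = 10.0 * np.log10(np.maximum(energy / peak, 1e-12))
    return frames[energy_db > threshold_db]
```

Called on a signal sampled at 8 kHz, `frame_signal` yields 160-sample frames spaced 80 samples apart, and `drop_low_energy_frames` then discards the frames that are silent relative to the loudest one. A practical detector would usually be more robust to background noise than this fixed threshold.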
