Types of voice recognition systems
Automatic speech recognition (ASR) is just one example of voice recognition; other related types of voice recognition systems are listed below.
• Speaker dependent system - the system is trained by the individual who will be using it. The drawback of this approach is that the system responds accurately only to the individual who trained it. This is the most common approach used in software for personal computers.
• Speaker independent system - the system is trained to respond to a word regardless of who speaks it.
Demerits
• Even the best speech recognition applications sometimes make errors. If there is noise or some other interfering sound, the number of errors will increase.
• Speech recognition works best if the microphone is close to the user. More distant microphones, such as on a table or wall, will tend to increase the number of errors.
• Speaker
It will also implement signs that can be converted to speech. Thus the application will solve the communication problem by letting speech-impaired people make signs in front of a web cam, producing a resultant voice output. Sign recognition is a typical application of image understanding, as it involves capturing, detecting and recognizing hand signs. A functioning sign language recognition system could provide an opportunity for the speech impaired to communicate with non-signing people without the need for an interpreter. It could be used to generate speech or text, making the speech impaired more independent.
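A minimal sketch of such a capture-recognize-speak loop is given below. It assumes OpenCV for webcam capture and pyttsx3 for offline text-to-speech; the classify_sign function is a hypothetical placeholder for whatever hand-sign detector and recognizer the application actually uses.

```python
# Hypothetical sketch of a webcam sign-to-speech loop (assumes OpenCV and pyttsx3).
import cv2
import pyttsx3

def classify_sign(frame):
    """Placeholder for the actual hand-sign classifier; returns a word or None."""
    return None  # a real system would run detection + recognition here

engine = pyttsx3.init()        # offline text-to-speech engine
cap = cv2.VideoCapture(0)      # open the default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    word = classify_sign(frame)  # capture -> detect -> recognize
    if word:
        engine.say(word)         # speak the recognized sign
        engine.runAndWait()
    cv2.imshow("sign input", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```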
This fact has motivated researchers to think of speech as a fast and efficient method of interaction between human and machine. However, this requires that the machine have sufficient intelligence to recognize human voices. Since the late fifties, there has been tremendous research on speech recognition, which refers to the process of converting human speech into a sequence of words. However, despite the great progress made in speech recognition, we are still far from having natural interaction between man and machine, because the machine does not understand the emotional state of the speaker. This has introduced a relatively recent research field, namely speech emotion recognition, which is defined as extracting the emotional state of a speaker from his or her speech.
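One common (though by no means the only) way to operationalize "extracting the emotional state" from speech is to compute acoustic features such as MFCCs per utterance and feed them to a classifier. The sketch below assumes librosa is available; "utterance.wav" is a hypothetical input file, and the emotion classifier itself is not shown.

```python
# Sketch of acoustic feature extraction for speech emotion recognition (assumes librosa).
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)        # load the speech waveform
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # 13 MFCCs per analysis frame

# Summarize frame-level features into one fixed-length vector per utterance;
# such vectors would then be fed to a classifier trained on emotion labels.
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(features.shape)   # (26,)
```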
Hence this system converts the sign language into a form understandable by non-signing people and detects the hand motions in a few seconds. It achieves an accuracy of 99%.
Fig. 3: Hardware setup of the system.
V. FUTURE ENHANCEMENT
The overall system recognises only a partial sign language in which the gestures are based on finger movements. Facial expressions are also a part of communication. Future work can be extended to recognise more words and to capture facial expressions by employing more flex sensors and MEMS.
These models include both linguistic and non-linguistic components of L2 listening ability, and attempt to synthesize different sub-skills and sub-components of L2 listening into a single construct. To apply this knowledge, Rost (2013), Goh (2000), Wilson (2003), Vandergrift (2007) and Prince (2012) divide listening comprehension into bottom-up and top-down processes. Moreover, Rost (2013) describes a model of L2 listening ability that should consist of the following components: phonological knowledge, syntactic knowledge, semantic knowledge, pragmatic knowledge and general knowledge. These kinds of knowledge refer to components of listening ability. According to Rost, the use of syntactic, semantic and pragmatic knowledge quickly covers the bottom-level (skill-specific) ability of L2 listening comprehension.
The concept of context awareness and micro-environment sensing can be used to develop many applications based on inbuilt sensors, which will be able to simulate the higher-level applications of smartphones.
Automatic call acceptance
This comes under the phone interaction detection category. There are some situations in our daily routine when we are not able to answer the phone because we need to swipe to pick up a call, e.g. when stuck in traffic, at a railway station, or in a market. In such situations it is possible to accept the call automatically, using the proximity sensor to determine the position of the phone with respect to the user, as sketched below.
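The following is a platform-agnostic sketch of the decision logic only: accept the call if it is ringing while the proximity reading stays "near" (phone raised to the ear) for a short sustained period. The helper functions phone_is_ringing() and read_proximity_cm() are hypothetical stand-ins for the platform's telephony and sensor APIs (on Android, for example, the proximity value would come from the proximity sensor via a sensor-event listener).

```python
# Hypothetical decision logic for proximity-based automatic call acceptance.
# phone_is_ringing() and read_proximity_cm() are placeholders for platform APIs.
import time

NEAR_THRESHOLD_CM = 3.0   # proximity reading when the phone is raised to the ear
HOLD_SECONDS = 1.0        # how long the "near" state must persist before answering

def should_auto_accept(read_proximity_cm, phone_is_ringing):
    """Return True only if the phone rings while held 'near' for a sustained period."""
    near_since = None
    while phone_is_ringing():
        if read_proximity_cm() < NEAR_THRESHOLD_CM:
            if near_since is None:
                near_since = time.monotonic()
            if time.monotonic() - near_since >= HOLD_SECONDS:
                return True
        else:
            near_since = None   # reset if the phone moves away from the ear
        time.sleep(0.1)         # poll the sensor at roughly 10 Hz
    return False
```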
Speaker identification is the process of determining which registered speaker provides a given utterance. Each block of the speaker recognition system can be described as below:
• Input Speech: Input speech is the signal given by the speaker to the above system. Human speech is normally a pure analogue signal, so in order to process it further the analogue signal has to be converted to a digital signal. This conversion can be done using techniques such as sampling and quantization, which is known as digital signal processing. For the above system, however, the input speech signal is already given in digital form, obtained by recording the voices of the speakers.
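To illustrate what sampling and quantization mean in practice, the short NumPy sketch below samples a synthetic tone (standing in for the continuous speech signal) at a fixed rate and applies uniform 16-bit quantization; the 8 kHz rate and the 440 Hz test tone are arbitrary choices for illustration, not part of the system described above.

```python
# Illustrative sampling and uniform 16-bit quantization of an "analogue" test tone.
import numpy as np

fs = 8000                                      # sampling rate in Hz (arbitrary choice)
t = np.arange(0, 1.0, 1.0 / fs)                # 1 second of sample instants
analogue = 0.5 * np.sin(2 * np.pi * 440 * t)   # stand-in for the continuous speech signal

bits = 16
half_levels = 2 ** (bits - 1)
# Uniform quantization: map [-1, 1) onto integer codes, then back to amplitudes.
codes = np.round(analogue * (half_levels - 1)).astype(np.int16)
quantized = codes / (half_levels - 1)

print("max quantization error:", np.max(np.abs(quantized - analogue)))
```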
Chapter 2 Human Speech Production and Perception
2.1 Human Speech Production
Speech signals are composed of a sequence of sounds. These sounds and the transitions between them serve as a symbolic representation of information. The arrangement of sounds (symbols) is governed by the rules of the language. The study of these rules and the classification of speech sounds is called phonetics. The purpose of processing speech signals is to enhance and extract information that provides as much knowledge as possible about the signal's structure, i.e., about the way in which information is encoded in the signal.