Speech Synthesis Abstract


Abstract

In today's world, many people face difficulties caused by sensory impairments such as low vision and hearing loss. There are about 161 million visually impaired and 37 million blind people worldwide. They are often uncomfortable in a new environment because of problems with communication and access to information. Visual impairment makes it tedious for blind people to read, and to find out which shops are around them while walking. Hence the objective of the proposed system is to detect, extract and recognize text in images and convert that text to speech, assisting visually impaired or blind persons in their day-to-day activities and making them more independent. An assistive …

The methodology used in TTS is to exploit acoustic representations of speech for synthesis, together with linguistic analysis of the text to extract correct pronunciations ("content": what is being said) and prosody in context (the "melody" of a sentence: how it is being said). A speech synthesis system can be divided into two parts: i) the front end, also called the Natural Language Processing (NLP) module [12], which analyzes the text, and ii) the back end, also called the signal-processing module, which generates the speech waveform from the information supplied by the front end. The front end contains a text processor (normalization and letter-to-sound), prosody control, and unit selection [13]. It is therefore mainly concerned with grapheme-to-phoneme conversion, also called "letter-to-sound" conversion. The back end is concerned with the technique used for synthesis. There are two techniques [14]: formant synthesis [15, 16, 17] and concatenative synthesis [18, 19, 20]. Formant synthesis relies on acoustic models to produce parametrically driven speech, while concatenative synthesis joins segments of recorded speech. Formant synthesis can be highly intelligible, but because obtaining sufficiently good speech models is difficult and complex, the quality of the synthesized speech has so far been somewhat degraded. Concatenative synthesis, by contrast, can sound very natural, with a speech quality close to human speech, but it may suffer from audible discontinuities at concatenation points.
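The front end / back end split described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the proposed system: the lexicon, the digit table, the sample rate, the unit length and the overlap are all hypothetical, and sine tones stand in for recorded speech units. The front end normalizes text and performs a dictionary-based letter-to-sound lookup; the back end concatenates one unit per phoneme and applies a linear crossfade at each join, which is one simple way to soften the discontinuities mentioned above. Prosody control and unit selection are not modelled.

```python
import math
import re

SAMPLE_RATE = 8000   # assumed sample rate in Hz
UNIT_SAMPLES = 400   # assumed length of every unit (50 ms at 8 kHz)

# Hypothetical tables for illustration only (not from the source).
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}
LEXICON = {"shop": ["SH", "AA", "P"], "two": ["T", "UW"]}


def normalize(text):
    """Front end, step 1: lowercase, spell out digits, drop punctuation."""
    text = text.lower()
    text = re.sub(r"[0-9]", lambda m: " " + DIGITS[m.group()] + " ", text)
    return re.findall(r"[a-z]+", text)


def letter_to_sound(words):
    """Front end, step 2: grapheme-to-phoneme via lexicon lookup.

    Unknown words fall back to one symbol per letter (a crude placeholder).
    """
    phonemes = []
    for word in words:
        phonemes.extend(LEXICON.get(word, [ch.upper() for ch in word]))
    return phonemes


def unit_waveform(phoneme):
    """Stand-in for a recorded unit: a short sine tone per phoneme."""
    freq = 200 + 20 * (sum(map(ord, phoneme)) % 20)
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(UNIT_SAMPLES)]


def concatenate(units, overlap=40):
    """Back end: join units, crossfading `overlap` samples at each boundary."""
    out = []
    for unit in units:
        if not out:
            out = list(unit)
            continue
        k = min(overlap, len(out), len(unit))
        start = len(out) - k
        for i in range(k):
            w = (i + 1) / (k + 1)  # fade-in weight for the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[k:])
    return out


def synthesize(text):
    """Run the front end, then the back end; return phonemes and samples."""
    phonemes = letter_to_sound(normalize(text))
    return phonemes, concatenate([unit_waveform(p) for p in phonemes])


if __name__ == "__main__":
    phonemes, wave = synthesize("Shop 2!")
    print(phonemes, len(wave))
```

A formant synthesizer would replace `unit_waveform` and `concatenate` with a parametric model of the vocal tract, while a real concatenative system would select recorded segments from a speech database instead of generating tones.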
