Devanagari Complex Characters

Devanagari script is complex in several ways. It has a two-dimensional composition of symbols: core characters occupy the middle strip, with optional modifiers above and/or below them. Most of the characters are formed from curves, holes, and strokes. Apart from the 13 vowels and 36 consonants, which are called basic characters, Devanagari script has compound (conjunct) characters, formed by combining two or more basic characters. Theoretically there could be 46656 (i.e. 36 × 36 × 36) triconsonantal conjunct characters. The shape of a compound character is usually more complex than the shapes of its constituent basic characters, and it changes drastically from font to font. Separating conjunct characters into their constituent symbols leads to segmentation errors.

10.1 How to Manage Complexities

In a nutshell, the complexities inherent in the middle zone of Devanagari script can be managed by
• Reducing the character set to the frequently used characters instead of the theoretical 46656 conjunct characters.
• Identifying the peculiarities or patterns of character occurrences.
• Classifying these characters into small, manageable classes based on properties that are invariant across fonts.

With this proposed approach, the complexity of recognizing an unknown character symbol is…
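The size of the theoretical conjunct set follows directly from the consonant count quoted above. The short Python sketch below reproduces that arithmetic; the function name `count_conjuncts` is hypothetical, and the calculation assumes that every ordered sequence of consonants counts as a distinct conjunct, which is the counting the 36 × 36 × 36 figure implies.

```python
NUM_CONSONANTS = 36  # consonant count quoted in the text


def count_conjuncts(n_consonants: int, length: int) -> int:
    """Number of ordered consonant sequences of the given length.

    Assumption: repetition is allowed and order matters, so the
    count is simply n_consonants ** length.
    """
    return n_consonants ** length


biconsonantal = count_conjuncts(NUM_CONSONANTS, 2)
triconsonantal = count_conjuncts(NUM_CONSONANTS, 3)

print(f"two-consonant conjuncts:   {biconsonantal}")
print(f"three-consonant conjuncts: {triconsonantal}")
```

Under the same assumption, two-consonant sequences alone would number 1296, which is why restricting recognition to the conjuncts that actually occur in text shrinks the problem so sharply.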
The frequency analysis of two documents with different contents and sizes shows that single characters cover 97% of the text, while conjunct characters account for only about 3%, far fewer occurrences than single characters. This is despite the fact that the number of possible conjunct symbols is much higher than the number of possible single characters. The overall coverage by the 345 identified symbols is found to be 99.97% on 13113 words in 5
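A frequency analysis of this kind could be approximated on Unicode text roughly as follows. This is only a sketch, not the procedure used in the study: it assumes a conjunct can be detected as consonants joined by the virama sign (U+094D), it ignores modifiers such as vowel signs because they lie outside the core characters, and the helper names `split_symbols`, `conjunct_share`, and `top_k_coverage` are hypothetical.

```python
import unicodedata
from collections import Counter

VIRAMA = "\u094D"  # Devanagari sign virama (halant)


def is_consonant(ch: str) -> bool:
    """True for Devanagari consonant letters (U+0915..U+0939, U+0958..U+095F)."""
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def split_symbols(word: str) -> list:
    """Split a word into core symbols.

    Consonants chained by a virama are grouped into one conjunct symbol.
    Combining marks (vowel signs, nukta, stray virama) are skipped.
    """
    symbols = []
    i = 0
    while i < len(word):
        ch = word[i]
        if not is_consonant(ch):
            # Skip modifiers; keep other letters (e.g. vowels) as single symbols.
            if not unicodedata.category(ch).startswith("M"):
                symbols.append(ch)
            i += 1
            continue
        j = i + 1
        # Extend the symbol while a virama is followed by another consonant.
        while j + 1 < len(word) and word[j] == VIRAMA and is_consonant(word[j + 1]):
            j += 2
        symbols.append(word[i:j])
        i = j
    return symbols


def symbol_counts(text: str) -> Counter:
    """Count every core symbol in whitespace-separated text."""
    counts = Counter()
    for word in text.split():
        counts.update(split_symbols(word))
    return counts


def conjunct_share(counts: Counter) -> float:
    """Fraction of symbol occurrences that are conjuncts."""
    total = sum(counts.values())
    conjuncts = sum(c for s, c in counts.items() if VIRAMA in s)
    return conjuncts / total if total else 0.0


def top_k_coverage(counts: Counter, k: int) -> float:
    """Fraction of occurrences covered by the k most frequent symbols."""
    total = sum(counts.values())
    covered = sum(c for _, c in counts.most_common(k))
    return covered / total if total else 0.0
```

Running `symbol_counts` on a corpus and then calling `top_k_coverage(counts, 345)` would give a figure comparable in spirit to the coverage reported above, although the exact numbers depend on how symbols are segmented.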
