Stemming Algorithm Essay

A Literature Review: Stemming Algorithms for Hindi Language
Lalit Kumar, M.Tech, Department of C.S.E., B.T.K.I.T Dwarahat, Almora, India
Abstract - Stemming is a technique for extracting the root word from a given inflected word. Stemming algorithms belong to the preprocessing step of text mining applications and play a significant role in numerous applications of Natural Language Processing (NLP). Web search engines also use stemming to remove prefixes and suffixes from derived words. Stemming also makes it possible to group similar documents together. The main purpose of stemming is to reduce the different grammatical forms of a word (noun, adjective, verb, adverb, etc.) to its root form. This expository paper presents a survey of some stemming algorithms for the Hindi language, …
Related Work
The performance of a stemming algorithm depends on two things: first, the results the stemmer produces, as with light-weight stemmers [ ] and rule-based stemmers [ ]; and second, the resource the algorithm uses, as with corpus-based [1] and dictionary-based [ ] algorithms. The first published paper on a stemmer, which took a rule-based approach, was by Julie Beth Lovins in 1968. After this, researchers began investigating a variety of techniques for extracting the root word from a given word. In the same line of work, Ananthakrishnan Ramanathan and Durgesh D. Rao published a paper on a light-weight stemmer for Hindi [ ]; their approach is based on a predefined set of suffixes, which the authors also developed. Another stemmer, developed by Vishal Gupta, was published as ‘A rule based stemmer for nouns’ [ ]. This stemmer uses a set of rules for stemming. Upendra Mishra and Chandra Prakash used a hybrid approach in their stemmer, named MAULIK [ ]. This hybrid approach combines a brute-force approach with a suffix-removal approach.
2. Stemming
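To make the suffix-removal and hybrid ideas from the related work concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from any of the cited papers: the suffix list, the lookup table, the minimum stem length, and the function names `strip_suffix` and `hybrid_stem` are all hypothetical and chosen only for this example. The brute-force part is modeled as a lookup table of known word forms, and the suffix-removal part as longest-match stripping.

```python
# Illustrative suffix list only; NOT the list used in any cited paper.
SUFFIXES = ["ियों", "ों", "ें", "ा", "ी", "े"]

# Hypothetical brute-force table mapping irregular forms to their roots.
LOOKUP = {"गया": "जा"}

# Assumed guard: never strip a word down to fewer than 2 code points.
MIN_STEM_LEN = 2


def strip_suffix(word: str) -> str:
    """Remove the longest matching suffix, if enough of the word remains."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


def hybrid_stem(word: str) -> str:
    """Try the lookup table first, then fall back to suffix removal."""
    if word in LOOKUP:
        return LOOKUP[word]
    return strip_suffix(word)


if __name__ == "__main__":
    for w in ["किताबें", "किताबों", "घर", "गया"]:
        print(w, "->", hybrid_stem(w))
```

In this sketch the two plural forms of "किताब" (book) reduce to the same stem, while the irregular verb form "गया" is resolved by the lookup table because suffix removal alone cannot recover its root.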
