Summarization In Text Preparation


A summary can be defined as a text that is produced from one or more texts, that contains the important information in the original text(s), and that is no longer than half of the original text(s).

3.1 TEXT SUMMARIZATION: Text summarization, or more precisely automatic text summarization, is the process in which a computer creates a shorter version of an original text (or a collection of texts) while still preserving most of the information present in the original. This process can be seen as compression, and it necessarily suffers from information loss. Thus a text summarization (TS) system must identify the important parts and preserve them. What counts as important can depend on the user's needs or the purpose of the summary.

3.1.1 APPLICATION OF TEXT SUMMARIZATION: …

from the original document and concatenating them into a shorter form. The importance of sentences is decided based on their statistical and linguistic features. Extraction techniques merely copy the information deemed most important by the system into the summary (for example, key clauses, sentences, or paragraphs). An abstractive summarization method consists of understanding the original text and retelling it in fewer words. It uses linguistic methods to examine and interpret the text, and then finds new concepts and expressions that best describe it, generating a new, shorter text that conveys the most important information from the original document. Abstraction can condense a text more strongly than extraction, but programs that can do this are harder to develop because they require natural language generation technology.

3.2.2 SINGLE AND MULTI DOCUMENT TEXT SUMMARIZATION: If summarization is performed on a single text document, it is called single-document text summarization. Single-document summarization techniques have the potential to simplify information consumption on mobile phones by presenting only the most relevant information contained in the document. If the summary is to be created for multiple text documents, it is called multi-document text summarization. Multi-document summarization creates information reports …
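Returning to the extractive approach described above, the idea of scoring sentences by a statistical feature and concatenating the best ones can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names (`split_sentences`, `tokenize`, `extractive_summary`), the tiny stopword list, the naive sentence splitter, and the choice of average word frequency as the single scoring feature are all hypothetical and are not taken from the essay, which also mentions linguistic features that are not modeled here.

```python
import re
from collections import Counter

# Assumption: a deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "it"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, k):
    """Copy the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Statistical feature (an assumption): word frequency over the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words)

    # Rank by descending score; ties go to the earlier sentence.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:k])  # restore document order before concatenating
    return " ".join(sentences[i] for i in chosen)
```

Because the selected sentences are copied verbatim and only reordered back into document order, the output never contains wording that is absent from the source, which is the defining property of extraction as opposed to abstraction.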

So it can be reasonable to choose n% of the sentences from the beginning of the text, e.g. by selecting the first sentence of each document, then the second sentence of each, and so on, until the desired summary is constructed. This method is called the LEAD-based method for summarization. In this technique, we assign a score of 1/n to each sentence, where n is the sentence number in the corresponding document file. This means that the first sentences of all documents have the same score, the second sentences of all documents have the same score, and so on. The method also provides a threshold value for sentence length: sentences shorter than the specified value are thrown out.

3.3.3 MEAD BASED TECHNIQUE: MEAD [7] is a centroid-based extractive summarizer that scores sentences based on sentence-level and inter-sentence features that indicate the quality of the sentence as a summary sentence. It then chooses the top-ranked sentences for inclusion in the output summary. MEAD scores sentences for its extractive summaries according to certain sentence features: Centroid, Position, and Length. In this technique, the score of a sentence is calculated using the following formula: …
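As an illustration of the LEAD-based method described earlier in this section (the 1/n position score and the length threshold), a minimal sketch might look like the following. It covers only the LEAD-based method, not MEAD's scoring formula. The names `split_sentences`, `lead_summary`, `max_sentences`, and `min_words` are hypothetical; measuring length in words, breaking ties by document order, and counting n before short sentences are discarded are assumptions, since the text does not specify them.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_summary(documents, max_sentences, min_words=0):
    """Pick sentences by the LEAD score 1/n across several documents.

    n is the 1-based position of a sentence within its own document, so the
    first sentences of all documents tie at 1.0, the second at 0.5, etc.
    """
    scored = []
    for doc_idx, doc in enumerate(documents):
        for n, sentence in enumerate(split_sentences(doc), start=1):
            # Length threshold (assumed to be in words): drop short sentences.
            # n still reflects the original position in the document.
            if len(sentence.split()) < min_words:
                continue
            scored.append((1.0 / n, doc_idx, sentence))

    # Highest score first; ties are broken by document order (an assumption).
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:max_sentences]]


docs = [
    "Alpha one is here. Alpha two follows next. Alpha three ends.",
    "Beta one is here. Short. Beta three ends now.",
]
summary = lead_summary(docs, max_sentences=3, min_words=2)
```

With these two sample documents, the sort places both first sentences ahead of any second sentence, matching the "first sentence of each document, then the second sentence of each" order described in the text, and the one-word sentence is removed by the length threshold.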
