Data Mining In Computer Science


CHAPTER 2 DATA MINING TECHNIQUE OVERVIEW

2.1 Introduction

In the 21st century, as more and more systems move online, databases have grown to terabytes in size. Within this huge volume of data, the information of importance needs to be identified. People have been discovering patterns throughout human history: a farmer recognizes patterns of growth in the field, a bank recognizes the earning and spending patterns of a customer, and politicians seek patterns in voter opinion. This huge amount of data needs to be put to use, whether for business growth or for scientific discovery. The process of discovering patterns and relationships in data using analysis tools is called data mining. The simplest form of data mining is as follows: 1. Describing …

Data mining helps us take appropriate decisions at the appropriate time in order to increase business profit. It is closely related to another important area of research in computer science, namely machine learning. Machine learning is the field of research in which a machine learns from past data and makes informed and efficient decisions about the future. In a number of applications, for example optical character recognition, one needs to assemble the past data in the form of training patterns. These training patterns are usually chosen so that the machine can make an appropriate decision when a previously unknown pattern presents itself. The training patterns are generally represented as features extracted from the data. In data mining, creating these patterns is generally not required, since we already have the data from which knowledge is to be discovered. We do, however, have to be able to extract effective features from this data, so a decision can be …
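The idea of deciding on a previously unknown pattern from stored training patterns can be sketched with a nearest-neighbour rule. This is only an illustrative sketch, not a method named in the text: the two numeric features, the labels, and the function name `classify` are all hypothetical.

```python
import math

# Hypothetical training patterns: (feature vector, label).
# The two features are invented for illustration; in character
# recognition they might be measurements taken from a glyph image.
TRAINING_PATTERNS = [
    ((1.0, 3.0), "I"),
    ((1.0, 2.8), "I"),
    ((3.0, 1.0), "E"),
    ((3.2, 1.1), "E"),
]


def classify(features, training_patterns=TRAINING_PATTERNS):
    """Return the label of the training pattern closest to `features`.

    Closeness is Euclidean distance between feature vectors.
    """
    nearest = min(training_patterns, key=lambda p: math.dist(p[0], features))
    return nearest[1]
```

Calling `classify((1.1, 2.9))` compares the unknown pattern with every stored pattern and returns the label of the closest one. The quality of the decision depends entirely on how well the extracted features separate the classes.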

In data mining, the technique used to solve a problem depends on the type of problem. Some techniques are more suitable than others in terms of search cost and prediction error. A classification tree is not suited to problems with true decision boundaries between the classes. Michalski and Kaufman describe the applicability of machine learning and a multistrategy methodology to data mining. The multistrategy approach is used for conceptual data exploration, that is, finding high-level concepts and descriptions in data. Noise in the data is one of the challenges [53]. The other challenges are:

1. The learning dataset may or may not represent the actual distribution pattern.
2. The learning data may be incomplete, with the values of some attributes unknown or missing.
3. The learning set may be in distributed form. This means that the learning database is a collection of datasets that are brought together, and the patterns within them need to be identified.
4. Learning from a continuously evolving concept. Some datasets, particularly those related to human beings, such as a user's interest in choosing books, change over a period of …
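Challenges 2 and 3 in the list above can be illustrated with a minimal sketch that brings datasets from several sources together and then fills in missing attribute values. Mean imputation is one simple choice made here for illustration, not a technique prescribed by the text; the function names, the record layout (a list of dictionaries), and the use of `None` for a missing value are assumptions.

```python
from statistics import mean


def merge_datasets(*datasets):
    """Combine several lists of records into one list (copies each record)."""
    merged = []
    for dataset in datasets:
        merged.extend(dict(record) for record in dataset)
    return merged


def impute_mean(records, attribute):
    """Return new records with missing `attribute` values replaced by the mean.

    A value counts as missing when the key is absent or set to None
    (an assumption of this sketch). If no value is known, the records
    are returned unchanged.
    """
    known = [r[attribute] for r in records if r.get(attribute) is not None]
    if not known:
        return [dict(r) for r in records]
    fill = mean(known)
    return [
        {**r, attribute: fill if r.get(attribute) is None else r[attribute]}
        for r in records
    ]
```

For example, merging `[{"age": 30}, {"age": None}]` with `[{"age": 50}]` and calling `impute_mean(merged, "age")` replaces the missing age with the mean of the known ages. Mean imputation keeps every record but can distort the distribution of an attribute, which ties back to challenge 1.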
