Abstract The advent of social networking and the Internet of Things has resulted in an exponential increase in the volume of data. Simultaneously, the need to process and analyze large volumes of data for business decision making has also increased. Many business and scientific applications need to process petabytes of data efficiently on a daily basis. This data is categorized as "Big Data" due to its sheer Volume, Variety and Velocity, and it has created a problem for the industry because conventional database systems and software tools cannot manage and process such data sets within tolerable time limits. The scale, diversity, and complexity of Big Data require new architecture, techniques and algorithms to manage
Today's data comes from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems. However, it’s necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.
Why Is Big Data Important?
The importance of big data doesn’t revolve around how much data you have, but what you do with it. You can take data from any source and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smart decision making.
Efficient database management systems have been crucial assets for managing huge amounts of data, especially for the effective and efficient retrieval of particular information from a large collection whenever needed. The proliferation of database management systems has also contributed to the recent massive gathering of all sorts of information. Today, we have far more information than we can handle, from business transactions and scientific data to satellite pictures, text reports and military intelligence. Information retrieval alone is no longer enough for decision-making. Faced with huge collections of data, we now have new needs that must be met to make better managerial choices.
Complex data and knowledge associations: Multistructure, multisource data is complex data. Examples of complex data types are bills of materials, word processing documents, maps, time series, images and video. Such combined characteristics suggest that Big Data requires a "big mind" to consolidate data for maximum
The amount of data an organization collects determines how it can use Big Data concepts. The full functionality of Big Data is mainly achieved in organizations whose data is measured in exabytes. A conventional database mainly stores data of one variety, but with Big Data a variety of data is stored in the repositories; moreover, search algorithms work equally well across all varieties of data in real-time scenarios. Velocity is one of the characteristics that make Big Data the future of the enterprise. Quick results are a requirement today, but the precision of the data is an equally important requirement.
Since it encompasses a wide range of activities, which most of the time transcend factory or national boundaries, complex interdependencies are built into it. As the power base continues to shift from companies towards customers, customer demands have become more complex. Companies are looking to Big Data analytics to revamp their supply chains, thereby using Big Data Analytics as a strategic lever. Companies are collecting vast amounts of supply-chain-related data with the help of technologies such as sensors, barcodes and GPS (Jacob House, 2014). Big Data Analytics offers companies the ability to leverage the enormous amounts of information driving their global supply chains (Harvard Business Review, 2013).
In addition to analyzing huge amounts of data, Big Data Analytics poses other unique challenges for machine learning and data analysis, including format variation of the raw data, trustworthiness of the data analysis, fast-moving streaming data, noisy and poor-quality data, highly distributed input sources, high dimensionality, scalability of algorithms, unsupervised and un-categorized data, limited supervised/labeled data, imbalanced input data, etc. Adequate data storage, data indexing or tagging, and fast information retrieval are other key problems in Big Data Analytics. Innovative data analysis and data management solutions are warranted when working with Big Data. For example, in a recent work we examined the high dimensionality of bioinformatics domain data and investigated feature selection techniques to address the problem. A more detailed overview of Big Data Analytics is presented in “Big data
We can say that Data Mining does not depend on Big Data, since it can be performed on small or large amounts of data, but Big Data surely depends on Data Mining: if we cannot find the value or importance of a large amount of data, then that data is of no use.
4. Conclusion
As we saw, Big Data refers only to large amounts of data, and all Big Data solutions depend on the availability of data. It can be considered as the combination of Business Intelligence and Data
Abstract- Outlier detection is an active area of research in the data mining community. Finding outliers in a collection of patterns is a well-known problem in data mining. As a branch of data mining, outlier detection has many applications in data stream analysis and requires more attention. An outlier is a pattern that is dissimilar to the rest of the patterns in the data set. Detecting outliers and analyzing large data sets can lead to the discovery of unexpected knowledge in areas such as fraud detection, telecommunications, web logs, and web documents.
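As an illustration of the definition above (a pattern that is dissimilar to the rest of the data set), the following is a minimal sketch of one common univariate technique, the modified z-score based on the median absolute deviation (MAD). It is not the method of the work summarized here; the function name `mad_outliers`, the threshold of 3.5, and the sample transaction amounts are all assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration; the threshold is an assumed
    rule-of-thumb value, not something taken from the text above.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that a few extreme
    # values barely influence, unlike the standard deviation.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No measurable spread; this simple sketch flags nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under a normal distribution.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up transaction amounts, e.g. for a fraud-detection setting.
transactions = [12.0, 11.5, 12.3, 11.9, 12.1, 95.0, 12.2]
print(mad_outliers(transactions))
```

With the made-up data above, only the amount 95.0 is flagged. A median-based score is used here because a mean-based z-score can be masked by the very outliers it is meant to detect; data streams or multivariate patterns would need a different technique.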
The present era has witnessed unparalleled growth in the number, availability and importance of images in all walks of life. As the diversity and size of digital image collections have grown exponentially, efficient image retrieval methods are becoming increasingly important. It is difficult to search and retrieve images from large image databases with traditional text searches, because user-based annotation is tedious and time consuming and the text often fails to convey the rich structure of images. A content-based retrieval system addresses this problem: retrieval is based on automatically matching the features of the query image with those of the images in the database through an image-to-image similarity evaluation. Therefore, images are indexed by their own visual content, such as texture, shape and color or any other feature or a combination of set of visual
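The content-based matching described above can be sketched with a single visual feature, color. The example below is a minimal sketch under stated assumptions: images are represented as non-empty lists of (r, g, b) tuples, color is summarized as a coarse joint RGB histogram, and similarity is measured by histogram intersection. The function names, the bin count, and the tiny image database are hypothetical; a real system would also use texture and shape features.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram.

    `pixels` is assumed to be a non-empty list of (r, g, b) tuples
    with channel values in 0..255.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        # Map each channel to a bin and combine into one flat index.
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [count / total for count in hist]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query_pixels, database, top_k=1):
    """Rank database images by color similarity to the query image."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Made-up "images": each is just a short list of pixels.
database = {
    "sunset": [(250, 120, 30)] * 8 + [(200, 60, 20)] * 2,
    "forest": [(20, 160, 40)] * 9 + [(30, 90, 20)],
    "beach": [(240, 200, 120)] * 6 + [(40, 120, 220)] * 4,
}
query = [(245, 110, 25)] * 7 + [(210, 70, 30)] * 3
print(retrieve(query, database, top_k=1))
```

Here the orange-toned query retrieves the "sunset" entry, because its pixels fall into the same coarse color bin as most of that image's pixels. No text annotation is involved; the index is built from the pixel content alone.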