It comprises data cleaning, for instance handling missing values and eliminating noise or outliers. It may involve complex statistical techniques or a data mining algorithm. Data Integration: Integration is one of the most significant features of a data warehouse. Here, multiple data sources may be integrated: data is fed from multiple dissimilar sources into the data warehouse.
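As a rough illustration of the data cleaning step mentioned above, the following sketch fills missing values with the median and then drops outliers using the common 1.5 x IQR rule. The function name `clean`, the sample readings, and the choice of median imputation and the IQR rule are assumptions made for the example; the text does not prescribe a particular technique.

```python
from statistics import median, quantiles


def clean(values):
    """Fill missing readings with the median, then drop IQR outliers.

    Both choices (median imputation, 1.5 * IQR fence) are illustrative
    assumptions, not methods named in the text.
    """
    present = [v for v in values if v is not None]
    fill = median(present)
    filled = [fill if v is None else v for v in values]

    # Quartiles of the imputed data define the outlier fences.
    q1, _, q3 = quantiles(filled, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in filled if low <= v <= high]


# Hypothetical readings: one missing value and one obvious outlier.
readings = [10.0, 12.0, None, 11.0, 13.0, 250.0, 12.0]
cleaned = clean(readings)
print(cleaned)
```

In practice the imputation and outlier rules would depend on the data and the domain; this sketch only shows the shape of the step.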
The large data set is partitioned into a number of clusters, and association rule mining techniques are then applied to each cluster to generate more efficient rules. This can help reduce the occurrence of accidents and identify the main factors and circumstances that cause them, so that we can try to avoid
II. IN-NETWORK AGGREGATION TECHNIQUES
We define the in-network aggregation process as follows: in-network aggregation is the global process of gathering and routing information through a multi-hop network, processing data at intermediate nodes with the objective of reducing resource consumption (in particular energy), thereby increasing network lifetime. We can distinguish two approaches to in-network aggregation, as described below: 1. In-network aggregation with size reduction: refers to the process of combining and compressing data coming from different sources in order to reduce the information to be sent over the network. As an example, assume that a node receives two packets from two different sources containing the locally
o Data acquisition includes extracting data from various sources, moving the data into the data staging area, and cleaning, transforming, and preparing the data for loading into the data warehouse.
o In data acquisition, the data flow starts at the sources and stops at the data staging area.
o Data storage covers loading the data from the data staging area into the data warehouse. Here the data flow begins at the data staging area.
o Information delivery provides the information in the data warehouse to its users.
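The three stages listed above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: the source records, the `sales` table, and the function names `acquire`, `store`, and `deliver` are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source systems with inconsistent formatting.
SOURCE_A = [{"id": 1, "name": " alice ", "amount": "10.5"}]
SOURCE_B = [{"id": 2, "name": "BOB", "amount": "7"}]


def acquire(sources):
    """Data acquisition: extract, clean, and transform rows into staging."""
    staging = []
    for source in sources:
        for row in source:
            name = row["name"].strip().title()   # cleaning
            amount = float(row["amount"])        # transformation
            staging.append((row["id"], name, amount))
    return staging


def store(conn, staging):
    """Data storage: load staged rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", staging)
    conn.commit()


def deliver(conn):
    """Information delivery: query the warehouse on behalf of users."""
    return conn.execute(
        "SELECT name, amount FROM sales ORDER BY id"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
staging = acquire([SOURCE_A, SOURCE_B])
store(conn, staging)
report = deliver(conn)
print(report)
```

The staging list makes the boundary explicit: acquisition ends there, and storage begins there.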
What is Data Integration? Data integration means combining data from multiple sources into a coherent data store, as in data warehousing. These sources may include multiple databases, data cubes, or flat files. Data integration also refers to the processes and technologies used to move data from source data systems to target data systems. Along the way, the data are usually transformed to fit business requirements.
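A minimal sketch of this idea, assuming two hypothetical sources: a flat file of customer names (`CRM_CSV`) and a list of billing rows (`BILLING_ROWS`) that use different field names for the same key. The function `integrate` and the unified record layout are illustrative assumptions, not part of any particular tool.

```python
import csv
import io

# Hypothetical flat-file source.
CRM_CSV = "customer_id,full_name\n1,Ada Lovelace\n2,Alan Turing\n"

# Hypothetical database-style source with a different key name.
BILLING_ROWS = [{"cust": 1, "balance": 120.0}, {"cust": 3, "balance": 40.0}]


def integrate(crm_csv, billing_rows):
    """Combine both sources into one store keyed by customer id."""
    unified = {}
    for row in csv.DictReader(io.StringIO(crm_csv)):
        key = int(row["customer_id"])  # transform: text key to integer
        unified[key] = {
            "customer_id": key,
            "name": row["full_name"],
            "balance": None,
        }
    for row in billing_rows:
        key = row["cust"]  # schema mapping: "cust" to "customer_id"
        record = unified.setdefault(
            key, {"customer_id": key, "name": None, "balance": None}
        )
        record["balance"] = row["balance"]
    return unified


unified = integrate(CRM_CSV, BILLING_ROWS)
for record in unified.values():
    print(record)
```

Records that appear in only one source keep `None` for the missing field, which leaves the gap visible instead of hiding it.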
Security and Privacy: Data security is a major issue for both personal data and the confidential data of organizations. Users have to depend entirely on the cloud service provider for the privacy and security of their data. Technical Issues: The need for high-speed internet connectivity makes the system complex, and various technical issues arise under high load. Data lock-in: The lack of standard APIs restricts the migration of applications and services between clouds.
Security monitoring allows:
1. Effective security protection on the network
2. Control of various malicious activities on the network
3. Detailed understanding of the security infrastructure of the network
On the other hand, there are some drawbacks related to security monitoring, such as:
1. Organizations must implement a genuine, licensed security monitoring tool in order to perform complete security monitoring with all features enabled; otherwise, there is a high risk of security attacks on the network.
Consequently, the security of information has become a fundamental issue. Network security is becoming more important as the amount of data exchanged on the internet increases. Therefore, both confidentiality and data integrity must be protected against unauthorized access and use. The result is explosive growth in the field of information hiding. Steganography hides the secret information within the host data set and reliably communicates it to a
The sensor network will provide big data that need to be analysed in order to detect any rise in temperature. Each sensor is georeferenced so that the exact location where a fire may erupt is known. The acquired big data will be analysed online to provide prediction and detection of forest fires. Furthermore, the history of the data, along with the fires that erupted, will be used in the future to build an intelligent system to predict naturally erupting fires that are due to summer heatwaves (e.g., via reinforced
INTRODUCTION
In the field of data mining, it has been observed that data grow rapidly. With this rapid growth and the availability of an increasing number of electronic documents, classification has become a key task [1]. Document preprocessing is an important step, and feature selection is a common problem in preprocessing for machine learning, data mining, and pattern recognition [1][2][3]. Text categorization has always been a hot topic owing to the explosive growth in available digital documents. Because of major advances in information acquisition and storage, tens, hundreds, and even thousands of features are acquired and stored in real-world databases.