What Is Data Mining?

88 Citations • 2023

This paper focuses on data mining, which has been popularly treated as a synonym of knowledge discovery in databases, although some researchers view data mining as an essential step of knowledge discovery.

Abstract

Data mining is the process of discovering interesting knowledge, such as patterns, associations, changes, anomalies, and significant structures, from large amounts of data stored in databases, data warehouses, or other information repositories. Due to the wide availability of huge amounts of data in electronic form, and the imminent need for turning such data into useful information and knowledge for broad applications including market analysis, business management, and decision support, data mining has attracted a great deal of attention in the information industry in recent years [4,5]. Data mining has been popularly treated as a synonym of knowledge discovery in databases, although some researchers view data mining as an essential step of knowledge discovery. In general, a knowledge discovery process consists of an iterative sequence of the following steps: data cleaning, where noisy, erroneous, or missing data are handled; data integration, where multiple, heterogeneous data sources may be integrated into one; data selection, where data relevant to the analysis task are retrieved from the database; data transformation, where data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations; data mining, which is an essential process where intelligent methods are applied in order to extract data patterns; pattern evaluation, which identifies the truly interesting patterns representing knowledge based on some interestingness measures; and knowledge presentation, where visualization and knowledge representation techniques are used to present the mined knowledge to the user. With the widely available relational database systems and data warehouses, the first four processes (data cleaning, data integration, data selection, and data transformation) can be performed by constructing data warehouses and performing some OLAP operations on the constructed data
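
The knowledge discovery steps named above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two toy purchase sources, the function names, the choice of co-occurring item pairs as the "pattern", and support as the interestingness measure. Cleaning is applied right after integration here purely for brevity.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical, heterogeneous sources of purchase records (illustrative data).
STORE_A = [
    {"customer": "c1", "item": "bread"},
    {"customer": "c1", "item": "milk"},
    {"customer": "c2", "item": "bread"},
    {"customer": "c2", "item": None},  # missing value, dropped by clean()
]
STORE_B = [("c2", "milk"), ("c3", "bread"), ("c3", "milk"), ("c3", "eggs")]


def integrate(store_a, store_b):
    """Data integration: bring both sources into one common record format."""
    return store_a + [{"customer": c, "item": i} for c, i in store_b]


def clean(records):
    """Data cleaning: drop records with a missing customer or item."""
    return [r for r in records if r["customer"] and r["item"]]


def select(records, items_of_interest):
    """Data selection: keep only records relevant to the analysis task."""
    return [r for r in records if r["item"] in items_of_interest]


def transform(records):
    """Data transformation: consolidate records into one item set per customer."""
    baskets = {}
    for r in records:
        baskets.setdefault(r["customer"], set()).add(r["item"])
    return baskets


def mine(baskets):
    """Data mining: count how many baskets contain each pair of items."""
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support reaches the threshold."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }


def present(patterns):
    """Knowledge presentation: print the retained patterns in readable form."""
    for (a, b), support in sorted(patterns.items()):
        print(f"{a} + {b}: support {support:.2f}")


def run_pipeline(min_support=0.5):
    records = clean(integrate(STORE_A, STORE_B))
    records = select(records, {"bread", "milk", "eggs"})
    baskets = transform(records)
    return evaluate(mine(baskets), len(baskets), min_support)


if __name__ == "__main__":
    present(run_pipeline())
```

On this toy data, only the pair bread and milk appears in at least half of the three customer baskets, so it is the single pattern that survives evaluation. In practice the mining step would be replaced by whatever method suits the task; pair counting is used here only because it fits in a few lines.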