Data Mining

88 Citations•2011•
I. Alkadi

Abstract

Data mining has recently become more popular in the information industry, owing to the availability of huge amounts of data. Industry needs to turn such data into useful information and knowledge, which can be used in applications ranging from business management, production control, and market analysis to engineering design and science exploration.

Database and information technology have evolved systematically from primitive file processing systems to sophisticated and powerful database systems. Research and development in database systems has led to relational database systems, data modeling tools, and indexing and data organization techniques. In relational database systems, data are stored in relational tables. In addition, users get convenient and flexible access to data through query languages, optimized query processing, user interfaces, transaction management, and optimized methods for On-Line Transaction Processing (OLTP).

This abundance of data, which calls for powerful data analysis tools, has been described as a data rich but information poor situation. A fast-growing, tremendous amount of data is collected and stored in large and numerous databases. Humans cannot analyze such volumes unaided, so powerful analysis tools are needed. Without them, data collected in large databases become data tombs: data archives that are seldom visited. Important decisions are therefore often based not on the information-rich data stored in databases but on a decision maker's intuition, because the decision maker lacks the tools to extract the valuable knowledge embedded in the vast amounts of data. Data mining tools that perform data analysis may uncover important data patterns, contributing greatly to business strategies, knowledge bases, and scientific and medical research.
In this way, data mining tools can turn data tombs into golden nuggets of knowledge.

DEFINITION OF DATA MINING

Data mining refers to extracting or mining knowledge from large amounts of data. Many people treat data mining as a synonym for another popularly used term, Knowledge Discovery in Databases (KDD). Others view data mining as simply an essential step in the process of knowledge discovery in databases. Knowledge discovery as a process is depicted in the following figure and consists of an iterative sequence of the following steps:

1. Data cleaning, to remove noise or irrelevant data.
2. Data integration, where multiple data sources may be combined.
3. Data selection, where data relevant to the analysis task are retrieved from the database.
4. Data transformation, where data are transformed or consolidated into forms appropriate for mining, for instance by performing summary or aggregation operations.
5. Data mining, an essential process where intelligent methods are applied in order to extract data patterns.
6. Pattern evaluation, to identify the truly interesting patterns representing knowledge, based on some interestingness measures.
7. Knowledge presentation, where visualization and knowledge representation techniques are used to present the mined knowledge to the user.

The data mining step may interact with the user or a knowledge base. The interesting patterns are presented to the user and may be stored as new knowledge in the knowledge base. We adopt a broad view of data mining functionality: data mining is the process of discovering interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories.

Review of Business Information Systems – First Quarter 2008, Volume 12, Number 1
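The seven knowledge discovery steps listed above can be illustrated with a minimal Python sketch. It is not a method from the paper: the toy purchase records, the function names, and the choice of item-pair co-occurrence with a support threshold as the "intelligent method" and interestingness measure are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two sources of (customer, item, quantity) records.
SOURCE_A = [("c1", "bread", 1), ("c1", "milk", 2), ("c1", "pen", 1),
            ("c2", "bread", 1), ("c2", None, 1)]
SOURCE_B = [("c2", "milk", 1), ("c3", "bread", 1), ("c3", "milk", 1),
            ("c3", "pen", -5), ("c4", "bread", 1)]

def clean(records):
    """Step 1: drop noisy records (missing item or non-positive quantity)."""
    return [r for r in records if r[1] is not None and r[2] > 0]

def integrate(*sources):
    """Step 2: combine multiple data sources into one record list."""
    return [record for source in sources for record in source]

def select(records, items_of_interest):
    """Step 3: keep only the records relevant to the analysis task."""
    return [r for r in records if r[1] in items_of_interest]

def transform(records):
    """Step 4: aggregate records into one item set (basket) per customer."""
    baskets = {}
    for customer, item, _quantity in records:
        baskets.setdefault(customer, set()).add(item)
    return baskets

def mine(baskets):
    """Step 5: count how often each pair of items appears in the same basket."""
    counts = Counter()
    for items in baskets.values():
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

def evaluate(counts, n_baskets, min_support):
    """Step 6: keep pairs whose support (share of baskets) meets the threshold."""
    return {pair: count / n_baskets
            for pair, count in counts.items()
            if count / n_baskets >= min_support}

def present(patterns):
    """Step 7: render the surviving patterns as readable strings."""
    return [f"{a} & {b}: support {support:.2f}"
            for (a, b), support in sorted(patterns.items())]

def run_pipeline(min_support=0.5):
    # Each source is cleaned before the sources are combined, following the listed order.
    data = integrate(clean(SOURCE_A), clean(SOURCE_B))
    data = select(data, {"bread", "milk"})
    baskets = transform(data)
    patterns = evaluate(mine(baskets), len(baskets), min_support)
    return present(patterns)

if __name__ == "__main__":
    for line in run_pipeline():
        print(line)
```

The source describes the process as iterative. In this sketch, an iteration would mean calling `run_pipeline` again with a different threshold or a different set of selected items after inspecting the presented patterns.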
Figure: Data mining as a step in the process of knowledge discovery

MAJOR COMPONENTS OF A DATA MINING SYSTEM

Database, Data Warehouse, Or Other Information Repository

This is one or a set of databases, data warehouses, spreadsheets, or other kinds of information repositories. Data cleaning and data integration techniques may be performed on the data.

Database Or Data Warehouse Server

The database or data warehouse server is responsible for fetching the relevant data, based on the user's data mining request.
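As a rough illustration of the first two components, the sketch below uses Python's standard `sqlite3` module as a stand-in for the repository and a small function as a stand-in for the server that fetches only the data relevant to a request. The `sales` table, its columns, and the `fetch_relevant` function are hypothetical and do not come from the paper.

```python
import sqlite3

def build_repository():
    """Create a toy in-memory repository (stand-in for a database or data warehouse)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, item TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("north", "bread", 2.5), ("north", "milk", 1.2),
         ("south", "bread", 2.7), ("south", "pen", 0.9)],
    )
    return conn

def fetch_relevant(conn, region, min_amount):
    """Play the server's role: return only the rows the mining request needs."""
    cursor = conn.execute(
        "SELECT item, amount FROM sales "
        "WHERE region = ? AND amount >= ? ORDER BY item",
        (region, min_amount),  # parameters are bound, not interpolated into the SQL
    )
    return cursor.fetchall()

if __name__ == "__main__":
    repository = build_repository()
    print(fetch_relevant(repository, "north", 1.0))
```

Keeping the fetch behind one function mirrors the division of labour described above: the mining logic states what data it needs, and the server decides how to retrieve it from the repository.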