Data mining (DM) is the popular name for a complex activity that aims at extracting synthesized and previously unknown information from large databases. It also denotes a multidisciplinary field of research and development of algorithms and software environments that support this activity in real-life problems, where huge amounts of data are often available for mining. The field attracts considerable publicity, and views on its scope differ. Depending on the viewpoint, DM is considered either as just one step in a broader process called knowledge discovery in databases (KDD) or as a synonym for it. This tutorial presents the concept of data mining and aims to provide an understanding of the overall process and the tools involved: how the process unfolds, what can be done with it, what the main techniques behind it are, and what its operational aspects are. The tutorial also describes a few examples of data mining applications, in order to motivate power systems as a particularly suitable application field for data mining.