Top Research Papers on Data Mining
Looking to dive deep into data mining? This curated list of top research papers covers methodologies, surveys, and recent advances in the field, and is intended for enthusiasts, researchers, and professionals alike.
Data Mining and Machine Learning
129 Citations · 2020 · Mohammed J. Zaki, Wagner Meira
Cambridge University Press eBooks
The fundamental algorithms in data mining and machine learning form the basis of data science, utilizing automated methods to analyze patterns and models for all kinds of data in applications ranging from scientific discovery to business analytics. This textbook for senior undergraduate and graduate courses provides a comprehensive, in-depth overview of data mining, machine learning and statistics, offering solid guidance for students, researchers, and practitioners. The book lays the foundations of data analysis, pattern mining, clustering, classification and regression, with a focus on the a...
Data augmentation in microscopic images for material data mining
113 Citations · 2020 · Boyuan Ma, Xiaoyan Wei, Chuni Liu + 11 more
npj Computational Materials
A novel transfer learning strategy to address problems of small or insufficient data by fusing the images obtained from simulating the physical mechanism of grain formation and the “image style” information in real images to generate synthetic data.
Text Mining in Big Data Analytics
301 Citations · 2020 · Hossein Hassani, Christina Beneki, Stephan Unger + 2 more
Big Data and Cognitive Computing
The state-of-the-art text mining approaches and techniques used for analyzing transcripts and speeches, meeting transcripts, and academic journal articles, as well as websites, emails, blogs, and social media platforms, are investigated.
Fast Utility Mining on Sequence Data
101 Citations · 2020 · Wensheng Gan, Jerry Chun‐Wei Lin, Jiexiong Zhang + 3 more
IEEE Transactions on Cybernetics
An efficient algorithm for high-utility sequential pattern (HUSP) mining with UL-list (HUSP-ULL), which utilizes a lexicographic $q$-sequence (LQS)-tree and a utility-linked (UL)-list structure to quickly discover HUSPs.
Deep learning in mining biological data
421 Citations · 2021 · Mufti Mahmud, M. Shamim Kaiser, T.M. McGinnity + 1 more
Nottingham Trent University's Institutional Repository (Nottingham Trent Repository)
Focusing on the use of deep learning (DL) to analyse patterns in data from diverse biological domains, the applications of different DL architectures to these data are investigated, a meta-analysis is performed, and the resulting resources are critically analysed.
Big data management in the mining industry
194 Citations · 2020 · Chongchong Qi
International Journal of Minerals Metallurgy and Materials
A brief introduction to big data and big data management (BDM) is provided, and the precautions for using BDM in the mining industry are outlined. The paper envisions a future in which a global database project is established and big data is used together with other technologies, supported by government policies and following international standards.
Machine learning and data mining in manufacturing
623 Citations · 2020 · Alican Doğan, Derya Birant
Expert Systems with Applications
A comprehensive literature review provides an overview of how machine learning techniques can be applied to realize manufacturing mechanisms with intelligent actions, and points to several significant research questions that remain unanswered in the recent literature with the same target.
Big Data Analysis and Perturbation using Data Mining Algorithm
177 Citations · 2021 · Haoxiang Wang, S. Smys
Journal of Soft Computing Paradigm
Experimental analysis indicates that the proposed work is more successful in terms of attack resistance, scalability, execution speed and accuracy when compared with other algorithms that are used for privacy preservation.
Mining Facebook Data for Predictive Personality Modeling
134 Citations · 2021 · Dejan Markovikj, Sonja Gievska, Michał Kosiński + 1 more
Proceedings of the International AAAI Conference on Web and Social Media
This paper explores the feasibility of modeling user personality based on a proposed set of features extracted from the Facebook data, and explores the suitability and performance of several classification techniques.
A global-scale data set of mining areas
250 Citations · 2020 · Victor Maus, Stefan Giljum, Jakob Gutschlhofer + 6 more
Scientific Data
The area used for mineral extraction is a key indicator for understanding and mitigating the environmental impacts caused by the extractive sector. To date, worldwide data products on mineral extraction do not report the area used by mining activities. In this paper, we contribute to filling this gap by presenting a new data set of mining extents derived by visual interpretation of satellite images. We delineated mining areas within a 10 km buffer from the approximate geographical coordinates of more than six thousand active mining sites across the globe. The result is a global-scale data set ...
Mining Big Data in Education: Affordances and Challenges
373 Citations · 2020 · Christian Fischer, Zachary A. Pardos, Ryan S. Baker + 6 more
Review of Research in Education
This chapter outlines current challenges of accessing, analyzing, and using big data and argues that addressing these challenges is worthwhile given the potential benefits of mining big data in education.
Brief introduction of medical database and data mining technology in big data era
576 Citations · 2020 · Jin Yang, Yuanjie Li, Qingqing Liu + 6 more
Journal of Evidence-Based Medicine
This work has introduced several databases and data mining techniques to help a wide range of clinical researchers better understand and apply database technology.
The Secondary Use of Electronic Health Records for Data Mining: Data Characteristics and Challenges
109 Citations · 2022 · Tabinda Sarwar, Sattar Seifollahi, Jeffrey Chan + 5 more
ACM Computing Surveys
An overview of information found in EHR systems and their characteristics that could be utilized for secondary applications is provided and can serve as a primer for researchers to understand the use of EHRs for data mining and analytics purposes.
Internet of things and data mining: An application oriented survey
106 Citations · 2020 · Priyank Sunhare, Rameez Raja Chowdhary, Manju K. Chattopadhyay
Journal of King Saud University - Computer and Information Sciences
A systematic and detailed review of the data mining techniques employed in large- and small-scale IoT applications to formulate an intelligent environment, together with an overview of a cloud-assisted IoT big data mining system, is presented to better convey the importance of data mining for an IoT environment.
A community resource for paired genomic and metabolomic data mining
124 Citations · 2021 · Michelle Schorn, Stefan Verhoeven, Lars Ridder + 107 more
Nature Chemical Biology
The Paired Omics Data Platform is a community initiative to systematically document links between metabolome and (meta)genome data, aiding identification of natural product biosynthetic origins and metabolite structures.
Educational data mining and learning analytics: An updated survey
891 Citations · 2020 · Cristóbal Romero, Sebastián Ventura
Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery
The current state of the art in data mining in education is provided by reviewing the main publications, the key milestones, the knowledge discovery cycle, the main educational environments, the specific tools, the freely available datasets, the most used methods, the main objectives, and the future trends in this research area.
Deep Learning for Spatio-Temporal Data Mining: A Survey
713 Citations · 2020 · Senzhang Wang, Jiannong Cao, Philip S. Yu
IEEE Transactions on Knowledge and Data Engineering
A comprehensive review of recent progress in applying deep learning techniques for spatio-temporal data mining (STDM) in different domains including transportation, on-demand service, climate & weather analysis, human mobility, location-based social network, crime analysis, and neuroscience is provided.
Next-Generation Morphometry for pathomics-data mining in histopathology
122 Citations · 2023 · David L. Hölscher, Nassim Bouteldja, Mehdi Joodaki + 14 more
Nature Communications
This study provides a concept for Next-generation Morphometry (NGM), enabling comprehensive quantitative pathology data mining, i.e., pathomics, and shows that the extracted features are independent predictors of long-term clinical outcomes in IgA-nephropathy.
Spatiotemporal data mining: a survey on challenges and open problems
133 Citations · 2022 · Ali Hamdi, Khaled Shaban, Abdelkarim Erradi + 3 more
RMIT Research Repository (RMIT University Library)
This work investigates the challenging issues with regard to spatiotemporal relationships, interdisciplinarity, discretisation, and data characteristics related to the spatiotemporal data mining (STDM) tasks of classification, clustering, hotspot detection, association and pattern mining, outlier detection, visualisation, visual analytics, and computer vision.
Proceedings of the 2020 SIAM International Conference on Data Mining
312 Citations · 2020 · Carlotta Demeniconi, Nitesh V. Chawla; SIAM International Conference on Data Mining 2020, Cincinnati, Ohio
Society for Industrial and Applied Mathematics eBooks
Data mining is an important tool in science, engineering, industrial processes, healthcare, business, and medicine. The datasets in these fields are large, complex, and often noisy. Extracting knowledge requires the use of sophisticated, high performance and principled analysis techniques and algorithms, based on sound theoretical and statistical foundations. These techniques in turn require implementations that are carefully tuned for performance; powerful visualization technologies; interface systems that are usable by scientists, engineers, and physicians as well as researchers; and infrast...