Top Research Papers on Data Structures and Algorithms PDF
Unlock a wealth of knowledge with our curated list of top research papers on Data Structures and Algorithms PDF. These papers offer in-depth insights and advanced understanding to boost your expertise in handling complex data structures and refining algorithms for efficient problem-solving. Perfect for students, researchers, or anyone eager to deepen their knowledge in this field.
A survey on data‐efficient algorithms in big data era
299 Citations 2021 Amina Adadi
Journal Of Big Data
This work investigates the issue of algorithms’ data hungriness, presents a comprehensive review of existing data-efficient methods and systematizes them into four categories, and delineates the limitations, discusses research challenges, and suggests future opportunities to advance the research on data-efficiency in machine learning.
Boosting Data-Driven Evolutionary Algorithm With Localized Data Generation
147 Citations 2020 Jian-Yu Li, Zhi‐Hui Zhan, Chuan Wang + 2 more
IEEE Transactions on Evolutionary Computation
A novel DDEA with two efficient components: a boosting strategy for self-aware model management, and a localized data generation method that generates synthetic data to alleviate data shortage and increase data quantity by approximating fitness through data positions.
Big Data Analysis and Perturbation using Data Mining Algorithm
177 Citations 2021 Haoxiang Wang, S. Smys
Journal of Soft Computing Paradigm
Experimental analysis indicates that the proposed work is more successful in terms of attack resistance, scalability, execution speed and accuracy when compared with other algorithms that are used for privacy preservation.
Managing by Data: Algorithmic Categories and Organizing
104 Citations 2020 Cristina Alaimo, Jannis Kallinikos
Organization Studies
This work conducts an empirical investigation of Last.fm, an online music discovery platform, and finds that data mining and data management techniques increasingly permeate organizations and the contexts in which they are embedded.
System- and Data-Driven Methods and Algorithms
116 Citations 2021 Peter Benner, Stefano Grivet-Talocia, Alfio Quarteroni
journal unavailable
An increasing complexity of models used to predict real-world systems leads to the need for algorithms to replace complex models with far simpler ones, while preserving the accuracy of the predictions. This two-volume handbook covers methods as well as applications. This first volume focuses on real-time control theory, data assimilation, real-time visualization, high-dimensional state spaces and interaction of different reduction techniques.
Smart transportation planning: Data, models, and algorithms
119 Citations 2020 Zahra Karami, Rasha Kashef
Transportation Engineering
Various machine learning techniques and models that use time series prediction are introduced in this paper, including ARIMA, Kalman filtering, Holt-Winters exponential smoothing, random walk, the KNN algorithm, and deep learning.
Recognizing how model design impacts harm opens up new mitigation techniques that are less burdensome than comprehensive data collection.
The improved AdaBoost algorithms for imbalanced data classification
197 Citations 2021 Wenyang Wang, Dongchu Sun
Information Sciences
This paper proposes a method to improve the AdaBoost algorithm using new weighted vote parameters for the weak classifiers, based on the global error rate and the classification accuracy rate of the positive class, which is the class of primary interest.
Genetic Algorithms in the Fields of Artificial Intelligence and Data Sciences
254 Citations 2021 Ayesha Sohail
Annals of Data Science
Time series forecasting and Bayesian inference, in combination with genetic algorithms, can prove to be powerful artificial intelligence tools.
Algorithmic bias in data-driven innovation in the age of AI
371 Citations 2021 Shahriar Akter, Grace McCarthy, Shahriar Sajib + 4 more
International Journal of Information Management
Data-driven innovation (DDI) gains its prominence due to its potential to transform innovation in the age of AI. Digital giants Amazon, Alibaba, Google, Apple, and Facebook, enjoy sustainable competitive advantages from DDI. However, little is known about algorithmic biases that may present in the DDI process, and result in unjust, unfair, or prejudicial data product developments. Thus, this guest editorial aims to explore the sources of algorithmic biases across the DDI process using a systematic literature review, thematic analysis and a case study on the Robo-Debt scheme in Australia. The f...
A fuzzy C-means algorithm for optimizing data clustering
120 Citations 2023 Seyed Emadedin Hashemi, Fatemeh Gholian-Jouybari, Mostafa Hajiaghaei–Keshteli
Expert Systems with Applications
Big data has increasingly become predominant in many research fields affecting human knowledge, including medicine and engineering. Cluster analysis, or clustering, is widely recognized as one of the most effective processes to deal with various types of data, especially big data. There has been considerable interest in Fuzzy C-Means (FCM) as a method for clustering data using a short-distance approach in data mining. However, despite its simplicity, this method is not suitable for clustering large data sets due to their complex structure. In particular, FCM is sensitive to cluster center init...
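For readers unfamiliar with FCM, the following is a minimal sketch of the textbook Fuzzy C-Means iteration, not the optimized variant from the paper above. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, and the sample data are assumptions chosen for illustration. Because the starting centers are passed in by the caller, the sketch also makes it easy to try different initializations.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50, eps=1e-12):
    """Textbook FCM on n-dimensional points, starting from caller-supplied centers."""
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)),
        # where d_ij is the distance from point j to center i.
        memberships = []
        for p in points:
            # eps guards against a zero distance when a point sits on a center.
            dists = [max(math.dist(p, c), eps) for c in centers]
            row = [
                1.0 / sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Center update: mean of all points, weighted by u_ij^m.
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    return centers, memberships


# Hypothetical one-dimensional data with two well-separated groups.
data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
final_centers, final_memberships = fuzzy_c_means(data, centers=[(1.0,), (9.0,)])
```

Each row of `final_memberships` holds one point's degrees of membership across the clusters, and each row sums to 1.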
Crystal Structure Algorithm (CryStAl): A Metaheuristic Optimization Method
165 Citations 2021 Siamak Talatahari, Mahdi Azizi, Mohammad Tolouei + 2 more
IEEE Access
This paper proposes a novel metaheuristic called CryStAl, chiefly inspired by the principles underlying the formation of crystal structures from the addition of the basis to the lattice points, which is a natural phenomenon that can be seen in the symmetric arrangement of constituents in crystalline minerals such as quartz.
Superresolution structured illumination microscopy reconstruction algorithms: a review
181 Citations 2023 Xin Chen, Suyi Zhong, Yiwei Hou + 6 more
Light Science & Applications
The basic theory of two SIM algorithms, namely optical sectioning SIM (OS-SIM) and superresolution SIM (SR-SIM), is introduced, and their implementation modalities are summarized.
Challenges in benchmarking stream learning algorithms with real-world data
131 Citations 2020 Vinicius M. A. Souza, Denis M. dos Reis, André G. Maletzke + 1 more
Data Mining and Knowledge Discovery
This paper proposes a new public data repository for benchmarking stream algorithms with real-world data that contains the most popular datasets from literature and new datasets related to a highly relevant public health problem that involves the recognition of disease vector insects using optical sensors.
Public health utility of cause of death data: applying empirical algorithms to improve data quality
151 Citations 2021 Sarah Charlotte Johnson, Matthew Cunningham, Ilse N Dippenaar + 88 more
BMC Medical Informatics and Decision Making
The pattern of garbage-coded deaths in the world is identified and the methods used to determine their redistribution to generate more plausible cause of death assignments are presented to represent an overall improvement in empiricism compared to past reliance on a priori knowledge.
Multimodal medical image fusion algorithm in the era of big data
193 Citations 2020 Wei Tan, Prayag Tiwari, Hari Mohan Pandey + 2 more
Neural Computing and Applications
Qualitative and quantitative evaluation verifies that the proposed algorithm outperforms most of the current algorithms, providing important ideas for medical diagnosis.
An emergent algorithmic culture: The data-ization of online fandom in China
115 Citations 2020 Yiyi Yin
International Journal of Cultural Studies
This article portrays the data-ization of online fandom in China, arguing that traffic data has been dematerialized as a new affective object in fan–object relations, while digital fan culture has been constructed into a type of algorithmic culture.
CryptoGA: a cryptosystem based on genetic algorithm for cloud data security
142 Citations 2020 Muhammad Tahir, Muhammad Sardaraz, Zahid Mehmood + 1 more
Cluster Computing
Analysis of the experimental results shows that the proposed model, CryptoGA, is robust and performs better on selected parameters than state-of-the-art cryptographic algorithms, namely DES, 3DES, RSA, Blowfish, and AES.
Research on expansion and classification of imbalanced data based on SMOTE algorithm
196 Citations 2021 Shujuan Wang, Yuntao Dai, Jihong Shen + 1 more
Scientific Reports
An improved SMOTE algorithm based on Normal distribution is proposed in this paper, so that the new sample points are distributed closer to the center of the minority sample with a higher probability to avoid the marginalization of the expanded data.
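As background for the summary above, here is a minimal sketch of classical SMOTE, which the paper modifies. It is not the paper's method: classical SMOTE draws the interpolation weight uniformly from [0, 1], while the summary only says that the improved version relies on a Normal distribution to pull new samples toward the minority center. The function name `smote`, its parameters, and the sample points are assumptions for illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Classical SMOTE: interpolate between a minority point and one of its
    k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Uniform interpolation weight (the classical choice, assumed here).
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class samples in two dimensions.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority_points, n_new=5)
```

Every synthetic point lies on the segment between two existing minority points, so points generated from samples near the class boundary also land near the boundary. That is the marginalization the paper aims to avoid.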
Data-Driven Evolutionary Algorithm With Perturbation-Based Ensemble Surrogates
133 Citations 2020 Jian-Yu Li, Zhi‐Hui Zhan, Hua Wang + 1 more
IEEE Transactions on Cybernetics
The experimental results on widely used benchmarks and a real-world aerodynamic airfoil design optimization problem show that the proposed DDEA-PES algorithm outperforms some state-of-the-art DDEAs and requires only about 2% of the computational budget to produce competitive results.
Identifying Schizophrenia Using Structural MRI With a Deep Learning Algorithm
129 Citations 2020 Jihoon Oh, Baek‐Lok Oh, Kyong-Uk Lee + 2 more
Frontiers in Psychiatry
The deep learning algorithm showed good performance in detecting schizophrenia and identified relevant structural features from structural brain MRI data; it had an acceptable classification performance in a separate group of patients at an earlier stage of the disease.
Crystal structure prediction by combining graph network and optimization algorithm
105 Citations 2022 Guanjian Cheng, Xin-Gao Gong, Wan‐Jian Yin
Nature Communications
This work proposes a machine-learning framework combining graph network and optimization algorithms for crystal structure prediction, which is about three orders of magnitude faster than the DFT-based approach.
A novel feature selection method for data mining tasks using hybrid Sine Cosine Algorithm and Genetic Algorithm
104 Citations 2021 Laith Abualigah, Akram Jamal Dulaimi
Cluster Computing
The proposed SCAGA performed better when balancing between exploitation and exploration strategies of the search space and was the best overall on the tested datasets from the UCI machine learning repository.
LISA: A Learned Index Structure for Spatial Data
137 Citations 2020 Pengfei Li, Hua Lu, Qian Zheng + 2 more
journal unavailable
This work proposes a novel Learned Index structure for Spatial dAta (LISA for short), which consists of a mapping function that maps spatial keys into 1-dimensional mapped values, a learned shard prediction function that partitions the mapped space into shards, and a series of local models that organize shards into pages.
Structural Racism, Health Inequities, and the Two-Edged Sword of Data: Structural Problems Require Structural Solutions
115 Citations 2021 Nancy Krieger
Frontiers in Public Health
A new opportunity arises as US government agencies re-engage with their work, out of the shadow of white grievance politics cast by the Trump Administration, to move forward with this structural proposal to aid the work for health equity.
The limits of the imaginary: Challenges to intervening in future speculations of memory, data, and algorithms
156 Citations 2020 Annette Markham
New Media & Society
A critical theory reading of the theme of inevitability is offered, using the concept of discursive closure, whereby the authors can see how particular values and (infra)structures are naturalized, neutralized, and legitimated, closing off discussion of alternatives that might counter current hegemonic power.
Effect of Data Scaling Methods on Machine Learning Algorithms and Model Performance
637 Citations 2021 Md Manjurul Ahsan, M. A. Parvez Mahmud, Pritom Saha + 2 more
Technologies
CART, along with RS or QT, outperforms all other ML algorithms with 100% accuracy, 100% precision, 99% recall, and 100% F1 score, and the study outcomes demonstrate that the model’s performance varies depending on the data scaling method.
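To make concrete what a "data scaling method" changes, here is a minimal sketch of three common scalers applied to one feature with an outlier. The summary above does not expand RS or QT, so the choice of min-max scaling, standardization, and median/IQR robust scaling is an assumption and may not match the study's exact set. The function names and the sample feature are hypothetical.

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical feature column with a single large outlier.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = {
    "min_max": min_max_scale(feature),
    "standard": standardize(feature),
    "robust": robust_scale(feature),
}
```

With min-max scaling the outlier squeezes the four ordinary values into a narrow band near 0, while median/IQR scaling keeps them spread out. Differences of this kind are one way a scaler can affect a downstream model.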
A Fast Abnormal Data Cleaning Algorithm for Performance Evaluation of Wind Turbine
221 Citations 2020 Zhongju Wang, Long Wang, Chao Huang
IEEE Transactions on Instrumentation and Measurement
The computational results show that the proposed approach cleans abnormal wind power data more effectively while greatly reducing execution time, and that the method is practical for real wind turbine power generation performance evaluation and monitoring tasks.
Review on the Application of Machine Learning Algorithms in the Sequence Data Mining of DNA
145 Citations 2020 Aimin Yang, Wei Zhang, Jiahao Wang + 3 more
Frontiers in Bioengineering and Biotechnology
This review introduces the development process of sequencing technology, expounds on the concept of DNA sequence data structure and sequence similarity, analyzes the basic process of data mining and several major machine learning algorithms, and puts forward the challenges faced by machine learning algorithms in the mining of biological sequence data.
A Load Balancing Algorithm for the Data Centres to Optimize Cloud Computing Applications
173 Citations 2021 Dalia Abdulkareem Shafiq, N. Z. Jhanjhi, Azween Abdullah + 1 more
IEEE Access
The proposed LB algorithm aims to optimize resources and improve load balancing in view of the Quality of Service (QoS) task parameters, the priority of VMs, and resource allocation, and achieves good performance in terms of lower execution time and makespan.
A Noise Removal Algorithm Based on OPTICS for Photon-Counting LiDAR Data
106 Citations 2020 Xiaoxiao Zhu, Sheng Nie, Cheng Wang + 4 more
IEEE Geoscience and Remote Sensing Letters
A novel algorithm based on the ordering points to identify the clustering structure (OPTICS) clustering method was proposed to remove noise photons in the ICESat-2 data; it works well in distinguishing signal from noise photons, as indicated by high F values.
Exploiting problem structure in a genetic algorithm approach to a nurse rostering problem
203 Citations 2020 Uwe Aickelin, Kathryn A. Dowsland
Minerva Access (University of Melbourne)
There is considerable interest in the use of genetic algorithms to solve problems arising in the areas of scheduling and timetabling. However, the classical genetic algorithm (GA) paradigm is not well equipped to handle the conflict between objectives and constraints that typically occur in such problems. In order to overcome this, successful implementations frequently make use of problem specific knowledge. This paper is concerned with the development of a GA for a nurse rostering problem at a major U.K. hospital. The structure of the constraints is used as the basis for a co-evolutionary str...
An Effective and Adaptable K-means Algorithm for Big Data Cluster Analysis
168 Citations 2023 Haize Hu, Jianxun Liu, Xiangping Zhang + 1 more
Pattern Recognition
The traditional K-means clustering algorithm easily falls into local optima, clusters large-capacity data poorly, and distributes clustering centroids unevenly. To solve these problems, a novel K-means clustering algorithm based on Lévy flight trajectory (LK-means) is proposed in the paper. In the iterative process of the LK-means algorithm, Lévy flight is used to search new positions to avoid premature convergence in clustering. It is also applied to increase the diversity of the cluster, strengthen the global search ability of the K-means algorithm, and avoid falling into the local optimal ...
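The summary above is truncated and does not say exactly how the Lévy flight enters the K-means loop, so the following is only a rough sketch of the general idea, not the paper's LK-means. It alternates a standard Lloyd update with a heavy-tailed perturbation of the centroids, drawn with Mantegna's algorithm. The acceptance rule (keep a perturbed centroid set only if it lowers the sum of squared errors), the `scale` and `beta` values, and all function names are assumptions.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One heavy-tailed step drawn with Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)


def lloyd_update(points, centers):
    """Assign points to the nearest center, then move centers to cluster means."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    # An empty cluster keeps its previous center.
    return [
        tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
        for cl, c in zip(clusters, centers)
    ]


def levy_kmeans(points, centers, iterations=20, scale=0.1, seed=0):
    rng = random.Random(seed)
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        centers = lloyd_update(points, centers)
        # Assumed acceptance rule: keep the Lévy-perturbed centers only if
        # they reduce the sum of squared errors.
        trial = [tuple(x + scale * levy_step(rng) for x in c) for c in centers]
        if sse(points, trial) < sse(points, centers):
            centers = trial
    return centers


# Hypothetical data: two groups, with both starting centers in the same group.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
start = [(0.0, 0.0), (1.0, 0.0)]
result = levy_kmeans(sample, start)
```

Because a perturbation is accepted only when it lowers the error, this variant never ends with a worse sum of squared errors than its starting centers. The occasional long Lévy jumps give the centroids a chance to leave a poor local configuration.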
A new lightweight cryptographic algorithm for enhancing data security in cloud computing
155 Citations 2021 Fursan Thabit, Sharaf A. Alhomdy, Abdulrazzaq H. A. Al‐Ahdal + 1 more
Global Transitions Proceedings
A new lightweight cryptographic algorithm for enhancing data security in cloud computing applications is proposed; it presents a strong security level and an apparent enhancement in measures of cipher execution time and security forces compared to the cryptographic systems widely used in cloud computing.
Bridge condition rating data modeling using deep learning algorithm
101 Citations 2020 Heng Liu, Yunfeng Zhang
Structure and Infrastructure Engineering
Research findings suggest that the deep learning model offers a promising tool as a data-driven condition forecasting approach for bridge components with a demonstrated prediction accuracy over 85%.
Lack of Transparency and Potential Bias in Artificial Intelligence Data Sets and Algorithms
311 Citations 2021 Roxana Daneshjou, Mary P. Smith, Mary Sun + 2 more
JAMA Dermatology
This scoping review identified 3 issues in data sets used to develop and test clinical AI algorithms for skin disease that should be addressed before clinical translation: sparsity of data set characterization and lack of transparency, nonstandard and unverified disease labels, and inability to fully assess patient diversity used for algorithm development and testing.
Machine Learning Algorithms in Civil Structural Health Monitoring: A Systematic Review
464 Citations 2020 Majdi Flah, Itzel Nunez, Wassim Ben Chaabene + 1 more
Archives of Computational Methods in Engineering
The efficacy of deploying ML algorithms in SHM is discussed, a detailed critical analysis of ML applications in SHM is provided, practical recommendations are made, and current knowledge gaps and future research needs are outlined.
Generative machine learning algorithm for lattice structures with superior mechanical properties
129 Citations 2022 Sangryun Lee, Zhizhou Zhang, Grace X. Gu
Materials Horizons
We present a hybrid neural network and genetic optimization adaptive method incorporating Bézier curves to consider the large design space of lattice structures with superior mechanical properties.
Improved arithmetic optimization algorithm and its application to discrete structural optimization
108 Citations 2021 A. Kaveh, Kiarash Biabani Hamedani
Structures
The arithmetic optimization algorithm (AOA) is a newly developed metaheuristic search technique that simulates the distribution characteristics of the basic arithmetic operations of addition, subtraction, multiplication, and division, and has been employed to solve some real-world optimization problems. However, it has been found that the AOA suffers from poor exploration and prematurely converges to non-optimal solutions, especially when applied to multi-dimensional optimization problems. In this paper, to overcome the shortcomings of the standard AOA, an improved variant of the AOA, called i...
Using Aggregated Relational Data to Feasibly Identify Network Structure without Network Data
101 Citations 2020 Emily Breza, Arun G. Chandrasekhar, Tyler H. McCormick + 1 more
American Economic Review
This work proposes an inexpensive and feasible strategy for network elicitation using Aggregated Relational Data (ARD): responses to questions of the form "how many of your links have trait k?"