Top Research Papers on XGBoost
If you are eager to deepen your knowledge of XGBoost, this list of top research papers should be on your reading list. Gain valuable insights into this powerful machine learning algorithm that has revolutionized data science. Whether you're a beginner or an expert, these papers will provide essential understanding and advanced techniques, helping you to effectively leverage XGBoost in your projects.
Extreme gradient boosting (Xgboost) model to predict the groundwater levels in Selangor Malaysia
502 Citations | 2021 | Ahmedbahaaaldin Ibrahem Ahmed Osman, Ali Najah Ahmed, Chow Ming Fai + 2 more
Ain Shams Engineering Journal
The proposed XGBoost model outperformed both the Artificial Neural Network and Support Vector Regression models for all input combinations and serves as a great benchmark for future groundwater level prediction using the XGBoost algorithm.
Fault detection by an ensemble framework of Extreme Gradient Boosting (XGBoost) in the operation of offshore wind turbines
148 Citations | 2021 | Pavlos Trizoglou, Xiaolei Liu, Zi Lin
Renewable Energy
This study presented a novel data-driven approach to condition monitoring systems by utilizing the existing Supervisory Control And Data Acquisition (SCADA) system and integrating a wide range of machine learning and data mining techniques to design a Normal Behaviour Model of the generator for fault detection purposes.
Advanced hyperparameter optimization for improved spatial prediction of shallow landslides using extreme gradient boosting (XGBoost)
129 Citations | 2022 | Taşkın Kavzoğlu, Alihan Teke
Bulletin of Engineering Geology and the Environment
Analysis of computational cost efficiency and AUC analysis showed that the Hyperband approach was much faster than the GA in hyperparameter tuning, and thus appeared to be the best optimization algorithm for the problem under consideration.
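Hyperband, the faster tuner in this comparison, builds on successive halving: many configurations are tried on a small budget, and only the best fraction advances to a larger one. The following is a minimal sketch of that inner loop only, not of full Hyperband and not of the paper's setup. The `evaluate(config, budget)` callback, the toy loss, and the candidate learning rates are all hypothetical stand-ins.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round and
    multiply the budget by eta, until one configuration remains."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Rank every surviving configuration at the current budget (lower loss is better).
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]


# Hypothetical toy objective: the loss is smallest at learning_rate = 0.1,
# and the 1/budget term mimics a model that improves with more training budget.
def toy_loss(config, budget):
    return (config["learning_rate"] - 0.1) ** 2 + 1.0 / budget


candidates = [{"learning_rate": lr}
              for lr in (0.01, 0.05, 0.1, 0.3, 0.5, 1.0, 0.2, 0.02, 0.7)]
best = successive_halving(candidates, toy_loss)
print(best)
```

In a real tuning run, `evaluate` would train a model for `budget` units (for example, boosting rounds) and return a validation loss; most of the compute is then spent on the few configurations that survive the early rounds.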
Predictive Performances of Ensemble Machine Learning Algorithms in Landslide Susceptibility Mapping Using Random Forest, Extreme Gradient Boosting (XGBoost) and Natural Gradient Boosting (NGBoost)
298 Citations | 2022 | Taşkın Kavzoğlu, Alihan Teke
Arabian Journal for Science and Engineering
This work proposes natural gradient boosting (NGBoost), a novel member of the ensemble learning family, for modeling landslide susceptibility in Macka County of Trabzon province, Turkey, and indicates that the NGBoost method, used for landslide susceptibility mapping for the first time, had the greatest predictive ability.
A tree based eXtreme Gradient Boosting (XGBoost) machine learning model to forecast the annual rice production in Bangladesh
102 Citations | 2023 | Mst. Noorunnahar, Arman Hossain Chowdhury, Farhana Arefeen Mila
PLoS ONE
It is found that the XGBoost model performs better than the ARIMA model in predicting the annual rice production in Bangladesh; based on this better performance, the study forecasts annual rice production in Bangladesh for the next 10 years using the XGBoost model.
eXtreme Gradient Boosting Algorithm with Machine Learning: a Review
202 Citations | 2023 | Zeravan Arif Ali, Ziyad H. Abduljabbar, Hanan A. Tahir + 2 more
Academic Journal of Nawroz University
This paper presents one of the most prominent supervised and semi-supervised learning (SSL) machine learning algorithms in a Python environment, XGBoost, which is a parallel tree boost that addresses a variety of data science problems quickly and accurately.
Electricity Theft Detection Base on Extreme Gradient Boosting in AMI
167 Citations | 2021 | Zhongzong Yan, He Wen
IEEE Transactions on Instrumentation and Measurement
Metering data from the advanced metering infrastructure can be used to find abnormal electricity behavior for the detection of electricity theft, which causes huge financial losses to electric companies every year. This article proposes an electricity theft detector using metering data based on extreme gradient boosting (XGBoost). The metering data are preprocessed, including recover missing and erroneous values and normalization. The classification model based on XGBoost is trained using both benign and malicious samples. Simulations are done by using the Irish Smart Energy Trails data set wi...
An effective adaptive customization framework for small manufacturing plants using extreme gradient boosting-XGBoost and random forest ensemble learning algorithms in an Industry 4.0 environment
141 Citations | 2021 | Kahiomba Sonia Kiangala, Zenghui Wang
Machine Learning with Applications
An effective adaptive customization platform that encodes the customization data history of a small manufacturing plant, from a static database, into a dynamic machine learning model to produce personalized products for their customers accurately is developed.
Extreme Gradient Boosting for yield estimation compared with Deep Learning approaches
114 Citations | 2022 | Florian Huber, Artem Yushchenko, Benedikt Stratmann + 1 more
Computers and Electronics in Agriculture
A comparative evaluation of soybean yield prediction within the United States shows promising prediction accuracies compared to state-of-the-art yield prediction systems based on Deep Learning.
Modeling hydrogen solubility in hydrocarbons using extreme gradient boosting and equations of state
101 Citations | 2021 | Mohammad-Reza Mohammadi, Fahime Hadavimoghaddam, Maryam Pourmahdi + 5 more
Scientific Reports
The XGBoost model introduced in this study is a promising model that can be applied as an efficient estimator for hydrogen solubility in various hydrocarbons and is capable of being utilized in the chemical and petroleum industries.
A neural network boosting regression model based on XGBoost
163 Citations | 2022 | Jianwei Dong, Yumin Chen, Bingyu Yao + 2 more
Applied Soft Computing
The boosting model is a kind of ensemble learning technology, including XGBoost and GBDT, which take decision trees as weak classifiers and achieve better results in classification and regression problems. The neural network has an excellent performance on image and voice recognition, but its weak interpretability limits on developing a fusion model. By referring to principles and methods of traditional boosting models, we proposed a Neural Network Boosting (NNBoost) regression, which takes shallow neural networks with simple structures as weak classifiers. The NNBoost is a new ensemble learni...
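The boosting idea this abstract refers to, adding weak learners one at a time so that each corrects the errors of the ensemble so far, can be sketched in a few lines. This is a minimal illustration of plain least-squares gradient boosting with one-feature regression stumps. It is not XGBoost's regularised second-order objective and not the paper's NNBoost; the data and function names are made up for the example, and the feature is assumed to have at least two distinct values.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on a 1-D feature that minimises squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the largest value keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - p)^2 the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Made-up training data: y = x^2 on six points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
model = gradient_boost(xs, ys)

mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(baseline_mse, mse)
```

Swapping `fit_stump` for another weak learner (a deeper tree, or a shallow neural network as the abstract describes) leaves the outer loop unchanged.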
Diagnostic classification of cancers using extreme gradient boosting algorithm and multi-omics data
197 Citations | 2020 | Baoshan Ma, Fanyu Meng, Yan Ge + 3 more
Computers in Biology and Medicine
Comparative experiments demonstrated that the XGBoost method performs remarkably well in predicting the stage of cancer patients from multi-omics data; the identification of novel candidate genes associated with cancer stages would help to further elucidate disease pathogenesis and develop novel therapeutics.
Extreme Gradient Boosting-Based Machine Learning Approach for Green Building Cost Prediction
131 Citations | 2022 | Odey Alshboul, Ali Shehadeh, Ghassan Almasabha + 1 more
Sustainability
This study presents machine learning-based algorithms, including extreme gradient boosting (XGBOOST), deep neural network (DNN), and random forest (RF), to predict green building costs, designed to consider the influence of soft and hard cost-related attributes.
An investigation of feature selection methods for soil liquefaction prediction based on tree-based ensemble algorithms using AdaBoost, gradient boosting, and XGBoost
157 Citations | 2022 | Selçuk Demir, Emrehan Kutluğ Şahin
Neural Computing and Applications
The study suggests that all developed tree-based ensemble models could reliably estimate soil liquefaction, and that XGBoost with the Boruta model achieved more stable and better prediction performance than the other models in all considered cases.
Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model
249 Citations | 2020 | Lingling Ni, Dong Wang, Jianfeng Wu + 4 more
Journal of Hydrology
It can be inferred that XGBoost is applicable for streamflow forecasting, and in general, performs better than SVM; the cluster analysis-based modular model is helpful in improving accuracy and capturing the complicated patterns of hydrological process.
Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest
357 Citations | 2020 | Emrehan Kutluğ Şahin
SN Applied Sciences
This study produces a landslide susceptibility map of the Ayancik district of Sinop province, situated in the Black Sea region of Turkey, using three regression tree-based ensemble methods: gradient boosting machines (GBM), extreme gradient boosting (XGBoost), and random forest (RF).
Efficient reliability analysis of earth dam slope stability using extreme gradient boosting method
267 Citations | 2020 | Lin Wang, Chongzhi Wu, Libin Tang + 4 more
Acta Geotechnica
Reliability analysis approach provides a rational means to quantitatively evaluate the safety of geotechnical structures from a probabilistic perspective. However, it suffers from a known criticism of extensive computational requirements and poor efficiency, which hinders its application in the reliability analysis of earth dam slope stability. Until now, the effects of spatially variable soil properties on the earth dam slope reliability remain unclear. This calls for a novel method to perform reliability analysis of earth dam slope stability accounting for the spatial variability of soil pro...
Explainable extreme gradient boosting tree-based prediction of load-carrying capacity of FRP-RC columns
129 Citations | 2021 | Abdoulaye Sanni Bakouregui, Hamdy M. Mohamed, Ammar Yahia + 1 more
Engineering Structures
This study presents a new approach for predicting the load-carrying capacity of reinforced concrete (RC) columns reinforced with fiber-reinforced polymer (FRP) bars with an eXtreme Gradient Boosting (XGBoost) algorithm. The proposed XGBoost model was developed based on a comprehensive database containing experimental data for 283 FRP-RC columns collected from the literature. The SHapley Additive exPlanations (SHAP) framework was used to interpret the output of the model. Furthermore, the efficiency and accuracy of the XGBoost model were evaluated and compared with design codes and equations in...
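The SHAP framework mentioned in this abstract attributes a prediction to individual input features using Shapley values. As a rough illustration of the underlying idea, the sketch below computes exact Shapley values by averaging each feature's marginal contribution over every feature ordering. It assumes an "absent" feature is replaced by a baseline value, and it uses a hypothetical three-feature toy model rather than the paper's XGBoost model; practical SHAP implementations use much faster tree-specific algorithms.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start with every feature "absent"
        prev = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            new = model(current)
            phi[i] += new - prev      # marginal contribution of feature i in this ordering
            prev = new
    return [p / len(orderings) for p in phi]


# Hypothetical toy model with an interaction between features 0 and 1.
def toy_model(f):
    return 2.0 * f[0] + f[0] * f[1] + 3.0 * f[2]


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved. Enumerating orderings costs n! model evaluations, so this approach is only workable for a handful of features.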
Predicting the compressive strength of concrete from its compositions and age using the extreme gradient boosting method
176 Citations | 2020 | Tuan Nguyen‐Sy, Jad Wakim, Quy‐Dong To + 3 more
Construction and Building Materials
It is demonstrated that the UCS of concrete can be accurately predicted from its compositions and age using the extreme gradient boosting regression (XGB) method, which is more robust, faster to train and more accurate than the ANN and SVM methods as well as other existing ML methods presented in the literature.
Using Shapley additive explanations to interpret extreme gradient boosting predictions of grassland degradation in Xilingol, China
104 Citations | 2021 | Ralf Wieland, Tobia Lakes, Claas Nendel + 1 more
Geoscientific Model Development
The results indicated that, with three of the sampling strategies, XGBoost achieved similar and robust simulation results, and SHAP values were useful for analysing the complex relationship between the different drivers of grassland degradation.
Predicting algal biochar yield using eXtreme Gradient Boosting (XGB) algorithm of machine learning methods
204 Citations | 2020 | Abhijeet Pathy, Saswat Meher, P. Balasubramanian
Algal Research
Pyrolysis is a thermochemical pathway widely used for the conversion of biomass into useful products such as biochar, bio-oil, and syngases. A recent surge in the adoption of the pyrolysis process at realtime scenarios for the appropriate management and conversion of residues demands the modeling of the pyrolysis process. Prediction of algal biochar yield along with its composition was attempted in this study with the eXtreme Gradient Boosting (XGB) machine learning method. An extensive grid search method has been implemented in the XGB model to explore all the possible considered input parame...
Development of extreme gradient boosting model for prediction of punching shear resistance of r/c interior slabs
105 Citations | 2021 | Hoang D. Nguyen, Gia Toai Truong, Myoungsu Shin
Engineering Structures
This paper aims to present the application of extreme gradient boosting (XGBoost) to the prediction of the punching shear resistance of reinforced concrete (R/C) interior slabs without shear reinforcement. For the training and testing of the XGBoost model, which was developed using the XGBoost 1.1.1 package, 497 experimental data of interior slab-column connections were collected from the literature. The input variables were the column section dimension, slab effective depth, concrete compressive strength, steel yield strength, and reinforcement ratio at the top and bottom of the slab. The tar...
Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization
787 Citations | 2020 | Wengang Zhang, Chongzhi Wu, Haiyi Zhong + 2 more
Geoscience Frontiers
Novel data-driven extreme gradient boosting (XGBoost) and random forest ensemble learning methods are applied for capturing the relationships between the USS and various basic soil parameters to predict undrained shear strength of soft clays.
Non-linear associations between built environment and active travel for working and shopping: An extreme gradient boosting approach
209 Citations | 2021 | Jixiang Liu, Bo Wang, Longzhu Xiao
Journal of Transport Geography
Active travel has environmental, social, and public health-related benefits. Researchers from diverse domains have extensively studied built-environment associations with active travel. However, limited attention has been paid to distinguishing the associations between built environment characteristics at both the origins and destinations and active travel for working and shopping. Scholars have started to examine non-linear associations of built environment with travel behaviour, but active travel has seldom been a focus. Therefore, this study, selecting Xiamen, China, as the case, utilises a...
Extreme gradient boosting and deep neural network based ensemble learning approach to forecast hourly solar irradiance
249 Citations | 2020 | Pratima Kumari, Durga Toshniwal
Journal of Cleaner Production
An ensemble model consisting of two advanced base models, namely extreme gradient boosting forest and deep neural networks (XGBF-DNN), is proposed for hourly global horizontal irradiance forecasting and exhibits the best combination of stability and prediction accuracy irrespective of seasonal variations in weather conditions.
A comparative analysis of gradient boosting algorithms
2175 Citations | 2020 | Candice Bentéjac, Anna Csörgő, Gonzalo Martínez-Muñoz
Artificial Intelligence Review
A comprehensive comparison between XGBoost, LightGBM, CatBoost, random forests and gradient boosting has been performed and indicates that CatBoost obtains the best results in generalization accuracy and AUC in the studied datasets although the differences are small.
Predicting occurrence of liquefaction-induced lateral spreading using gradient boosting algorithms integrated with particle swarm optimization: PSO-XGBoost, PSO-LightGBM, and PSO-CatBoost
113 Citations | 2023 | Selçuk Demir, Emrehan Kutluğ Şahin
Acta Geotechnica
The use of these boosting algorithms especially optimized with PSO is recommended for predicting the occurrence of liquefaction-induced lateral spreading, and PSO-CatBoost outperformed other state-of-the-art models in terms of performance metrics.
Assessment of basal heave stability for braced excavations in anisotropic clay using extreme gradient boosting and random forest regression
116 Citations | 2020 | Wengang Zhang, Runhong Zhang, Chongzhi Wu + 2 more
Underground Space
A finite-element analysis considering the anisotropy for the undrained shear strength was performed to examine the effects of the total stress-based anisotropic model NGI-ADP (developed by Norwegian Geotechnical Institute based on the Active-Direct simple shear-Passive concept) parameters on the base stability of deep braced excavations in clays. These parameters included the ratio of the plane strain passive shear strength to the plane strain active shear strength suP/suA, the ratio of the unloading/reloading shear modulus to the plane strain active shear strength Gur/suA, the plane strain ac...
Prediction of seismic drift responses of planar steel moment frames using artificial neural network and extreme gradient boosting
101 Citations | 2021 | Hoang D. Nguyen, Nhan D. Dao, Myoungsu Shin
Engineering Structures
This study aims to develop machine learning (ML) models that can predict the seismic responses of planar steel moment-resisting frames subjected to ground motions. For this purpose, two of the most powerful ML techniques, artificial neural network (ANN) and extreme gradient boosting (XGBoost), were applied. To generate a comprehensive dataset for the training and testing of the ML models, 22,464 nonlinear dynamic analyses were conducted on 36 steel moment frames with different structural characteristics (i.e., number of stories, number of bays, column-to-beam moment capacity ratio) subjected t...
Developing a hybrid model of Jaya algorithm-based extreme gradient boosting machine to estimate blast-induced ground vibrations
176 Citations | 2021 | Jian Zhou, Yingui Qiu, Manoj Khandelwal + 2 more
International Journal of Rock Mechanics and Mining Sciences
Findings reveal that the proposed Jaya-XGBoost emerged as the most reliable model in contrast to other machine learning models and traditional empirical models.
Flash-flood susceptibility mapping based on XGBoost, random forest and boosted regression trees
256 Citations | 2021 | Rahebeh Abedi, Romulus Costache, Hossein Shafizadeh‐Moghadam + 1 more
Geocarto International
The results showed that the central part of the Bâsca Chiojdului river basin, which covers approximately 30% of the study area, is more susceptible to flash flooding.
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
122 Citations | 2020 | Tony Duan, Anand Avati, Daisy Yi Ding + 4 more
International Conference on Machine Learning
NGBoost generalizes gradient boosting to probabilistic regression by treating the parameters of the conditional distribution as targets for a multiparameter boosting algorithm, and shows how the Natural Gradient is required to correct the training dynamics of the authors' multiparameter boosting approach.
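To make the natural gradient concrete, the sketch below works through one case by hand: a Normal distribution parametrised by (mu, log sigma) and scored by negative log-likelihood. The choice of distribution and parametrisation is an assumption made for this example. For it, the Fisher information is diag(1/sigma^2, 2), so the natural gradient is the ordinary gradient with each component rescaled by the inverse of that matrix.

```python
import math


def nll_gradient(y, mu, log_sigma):
    """Ordinary gradient of the Normal negative log-likelihood with respect to (mu, log_sigma)."""
    var = math.exp(2.0 * log_sigma)
    grad_mu = -(y - mu) / var
    grad_log_sigma = 1.0 - (y - mu) ** 2 / var
    return grad_mu, grad_log_sigma


def natural_gradient(y, mu, log_sigma):
    """Ordinary gradient multiplied by the inverse Fisher information diag(sigma^2, 1/2)."""
    var = math.exp(2.0 * log_sigma)
    grad_mu, grad_log_sigma = nll_gradient(y, mu, log_sigma)
    return var * grad_mu, 0.5 * grad_log_sigma


# The same residual (y - mu = 2) under a very wide and a very narrow predictive distribution.
wide = natural_gradient(3.0, 1.0, math.log(10.0))
narrow = natural_gradient(3.0, 1.0, math.log(0.1))
print(nll_gradient(3.0, 1.0, math.log(10.0)), wide)
print(nll_gradient(3.0, 1.0, math.log(0.1)), narrow)
```

In this parametrisation the mu component of the natural gradient reduces to -(y - mu) regardless of sigma, whereas the ordinary gradient shrinks or explodes with the current variance estimate. That rescaling is one way to see why the natural gradient gives steadier updates when several distribution parameters are boosted together.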
Practical Federated Gradient Boosting Decision Trees
192 Citations | 2020 | Qinbin Li, Zeyi Wen, Bingsheng He
Proceedings of the AAAI Conference on Artificial Intelligence
This paper studies a practical federated environment with relaxed privacy constraints, where a dishonest party might obtain some information about the other parties' data, but it is still impossible for the dishonest party to derive the actual raw data of other parties.
Estimation of daily maize transpiration using support vector machines, extreme gradient boosting, artificial and deep neural networks models
197 Citations | 2020 | Junliang Fan, Jing Zheng, Lifeng Wu + 1 more
Agricultural Water Management
The incorporation of SWC and/or LAI in the machine learning models is highly recommended for accurate daily maize T estimation, and the DNN model is more effective for daily maize T estimation due to its advantage in modeling high-order complex relationships between T and its driving variables through multiple levels of feature abstraction.
Short-term prediction of building energy consumption employing an improved extreme gradient boosting model: A case study of an intake tower
131 Citations | 2020 | Hongfang Lü, Fei-Fei Cheng, Xin Ma + 1 more
Energy
A novel hybrid model is proposed for predicting short-term building energy consumption using complete ensemble empirical mode decomposition with adaptive noise, and the building energy consumption is predicted by the traditional extreme gradient boosting.
Application of extreme gradient boosting and parallel random forest algorithms for assessing groundwater spring potential using DEM-derived factors
134 Citations | 2020 | Seyed Amir Naghibi, Hossein Hashemi, Ronny Berndtsson + 1 more
Journal of Hydrology
Groundwater (GW) resources provide a large share of the world's water demand for various sections such as agriculture, industry, and drinking water. Particularly in the arid and semi-arid regions, with surface water scarcity and high evaporation, GW is a valuable commodity. Yet, GW data are often incomplete or nonexistent. Therefore, it is a challenge to achieve a GW potential assessment. In this study, we developed methods to produce reliable GW potential maps (GWPM) with only digital elevation model (DEM)-derived data as inputs. To achieve this objective, a case study area in Iran was select...
Construction of a virtual PM2.5 observation network in China based on high-density surface meteorological observations using the Extreme Gradient Boosting model
156 Citations | 2020 | Ke Gui, Huizheng Che, Zhaoliang Zeng + 10 more
Environment International
A virtual ground-based PM2.5 observation network based on high-density surface meteorological observations using the Extreme Gradient Boosting model shows great potential in reconstructing historical PM2.5 data, and surface visibility plays the dominant role in terms of the relative importance of variables in the XGBoost model.
Desert Microbes for Boosting Sustainable Agriculture in Extreme Environments
180 Citations | 2020 | Wiam Alsharif, Maged M. Saad, Heribert Hirt
Frontiers in Microbiology
The efforts to explore the bacterial diversity associated with desert plants in arid, semi-arid, and hyper-arid regions are described, highlighting the latest discoveries and applications of plant growth promoting bacteria from the most studied deserts around the world.
Bagging–XGBoost algorithm based extreme weather identification and short-term load forecasting model
119 Citations | 2022 | Xuzhi Deng, Aoshuang Ye, Jiashi Zhong + 13 more
Energy Reports
Accurate short-term load forecasting of distribution transformer in extreme weather will effectively assist power dispatching and enable safe and stable operation of power grid. Therefore, this paper proposes the Bagging–XGBoost algorithm based extreme weather identification and short-term load forecasting model, which can warn the time period and detailed value of peak load in advance. Firstly, based on Extreme Gradient Boosting (XGBoost) algorithm, the idea of Bagging is introduced to reduce the output variance and enhance the generalization ability of the algorithm. Then, the mutual informa...
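The variance-reduction idea behind bagging, as referenced in this abstract, is to fit the same base learner on many bootstrap resamples of the training data and average the predictions. The sketch below is a minimal illustration under stated assumptions: the base learner is a deliberately high-variance 1-nearest-neighbour regressor standing in for XGBoost, and the data and function names are invented for the example.

```python
import random


def fit_1nn(xs, ys):
    """Hypothetical high-variance base learner: predict the target of the nearest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def bagging_fit(xs, ys, fit_base, n_models=25, seed=0):
    """Fit one base model per bootstrap resample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_base([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


# Made-up data with a jump between x = 2 and x = 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 1.2, 3.9, 4.1, 4.0]
bagged = bagging_fit(xs, ys, fit_1nn)
print(fit_1nn(xs, ys)(2.5), bagged(2.5))
```

Because each resample omits some training points, the individual models disagree near the jump, and averaging them smooths the prediction there. Any other fitting function with the same `(xs, ys) -> predictor` shape could be passed as `fit_base`.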
GBDT-MO: Gradient-Boosted Decision Trees for Multiple Outputs
205 Citations | 2020 | Zhendong Zhang, Cheolkon Jung
IEEE Transactions on Neural Networks and Learning Systems
This article proposes a general method to learn gradient-boosted decision trees for multiple outputs, called GBDT-MO, which achieves outstanding performance in terms of accuracy, training speed, and inference speed.