Top Research Papers on Data Engineering
Dive into a curated selection of the most influential research papers on Data Engineering. This collection covers groundbreaking approaches, methodologies, and applications that are shaping the future of this critical field. Expand your knowledge and keep up with the latest trends and innovations in Data Engineering.
Data-Driven Science and Engineering
564 Citations 2022 Steven L. Brunton, J. Nathan Kutz
Cambridge University Press eBooks
Data-driven discovery is revolutionizing how we model, predict, and control complex systems. Now with Python and MATLAB®, this textbook trains mathematical scientists and engineers for the next generation of scientific discovery by offering a broad overview of the growing intersection of data-driven methods, machine learning, applied optimization, and classical fields of engineering mathematics and mathematical physics. With a focus on integrating dynamical systems modeling and control with modern methods in applied machine learning, this text includes methods that were chosen for their releva...
Data engineering for fraud detection
135 Citations 2021 Bart Baesens, Sebastiaan Höppner, Tim Verdonck
Decision Support Systems
This work proposes several data engineering techniques to improve the performance of an analytical model while retaining the interpretability property, and illustrates the improvement in performance of these data engineering steps for popular analytical models on a real payment transactions data set.
The R Language: An Engine for Bioinformatics and Data Science
200 Citations 2022 Federico M. Giorgi, Carmine Ceraolo, Daniele Mercatelli
Life
An historical chronicle of how R became what it is today is provided, describing all its current features and capabilities, and the role of R in science in general as a driver for reproducibility is discussed.
Automated data processing and feature engineering for deep learning and big data applications: A survey
102 Citations 2024 Alhassan Mumuni, Fuseini Mumuni
Journal of Information and Intelligence
A thorough review of approaches for automating data processing tasks in deep learning pipelines is presented, covering automated data preprocessing, data augmentation (including synthetic data generation using generative AI methods), feature engineering, and the use of AutoML methods and tools to simultaneously optimize all stages of the machine learning pipeline.
Low-N protein engineering with data-efficient deep learning
383 Citations 2021 Surojit Biswas, Grigory Khimulya, Ethan C. Alley + 2 more
Nature Methods
A machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution is introduced.
Google Earth Engine for geo-big data applications: A meta-analysis and systematic review
1181 Citations 2020 Haifa Tamiminia, Bahram Salehi, Masoud Mahdianpari + 3 more
ISPRS Journal of Photogrammetry and Remote Sensing
A meta-analysis of recent peer-reviewed GEE articles, focusing on features including data, sensor type, study area, spatial resolution, application, strategy, and analytical methods, confirmed that GEE has made and continues to make substantive progress on global challenges involving the processing of geo-big data.
Sentinel-1 SAR Backscatter Analysis Ready Data Preparation in Google Earth Engine
335 Citations 2021 Adugna Mullissa, Andreas Vollrath, Christelle Odongo-Braun + 5 more
Remote Sensing
A framework for preparing Sentinel-1 SAR backscatter Analysis-Ready-Data in Google Earth Engine is presented; it combines existing and new Google Earth Engine implementations for additional border noise correction, speckle filtering and radiometric terrain normalization.
Extracting accurate materials data from research papers with conversational language models and prompt engineering
249 Citations 2024 Maciej P. Polak, Dane Morgan
Nature Communications
This work proposes ChatExtract, a method that can fully automate very accurate data extraction with minimal initial effort and background using an advanced conversational LLM, and shows that similar approaches are likely to become powerful tools for data extraction in the near future.
Sustainable industrial and operation engineering trends and challenges Toward Industry 4.0: a data driven analysis
316 Citations 2021 Ming‐Lang Tseng, Thi Phuong Thuy Tran, Hiền Minh Hà + 2 more
Journal of Industrial and Production Engineering
This study supplies contributions to the existing literature with a state-of-the-art bibliometric review of sustainable industrial and operation engineering as the field moves toward Industry 4.0, and guidance for future studies and practical achievements. Although industrial and operation engineering is being promoted forward to sustainability, the systematization of the knowledge that forms firms’ manufacturing and operations and encompasses their wide concepts and abundant complementary elements is still absent. This study aims to analyze contemporary sustainable industrial and operations e...
Google Earth Engine Cloud Computing Platform for Remote Sensing Big Data Applications: A Comprehensive Review
1035 Citations 2020 Meisam Amani, Arsalan Ghorbanian, Seyed Ali Ahmadi + 9 more
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
This study comprehensively explores different aspects of the GEE platform, including its datasets, functions, advantages and limitations, and various applications, and observes that Landsat and Sentinel datasets are extensively used by GEE users.
A XGBoost-Based Lane Change Prediction on Time Series Data Using Feature Engineering for Autopilot Vehicles
103 Citations 2022 Yi Zhang, Xiupeng Shi, Sheng Zhang + 1 more
IEEE Transactions on Intelligent Transportation Systems
A lane change prediction framework based on feature learning is proposed, with the aim of gaining a deep and comprehensive understanding of lane change behaviors and reaching high performance based on the selected features.
Rapid and robust monitoring of flood events using Sentinel-1 and Landsat data on the Google Earth Engine
459 Citations 2020 Ben DeVries, Chengquan Huang, John Armston + 3 more
Remote Sensing of Environment
An algorithm is presented that exploits all available Sentinel-1 SAR images in combination with historical Landsat and other auxiliary data sources hosted on the GEE to rapidly map surface inundation during flood events, relying on multi-temporal SAR statistics to identify unexpected floods in near real-time.
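The idea of using multi-temporal statistics to spot unexpected change can be illustrated with a small sketch. It assumes a per-pixel z-score against a historical baseline and a fixed threshold; the function names, the threshold of -3.0, and the backscatter values are hypothetical and are not taken from the paper. Open water usually lowers SAR backscatter, so a strongly negative z-score is treated here as a possible flood signal.

```python
from statistics import mean, stdev

def flood_z_scores(history, current):
    """Z-score of each pixel's current backscatter against its own history.

    history: list of past images, each a flat list of backscatter values (dB)
    current: one flat list of backscatter values (dB) for the new image
    """
    scores = []
    for i, value in enumerate(current):
        series = [image[i] for image in history]
        mu, sigma = mean(series), stdev(series)
        # Guard against a constant history, where the z-score is undefined.
        scores.append((value - mu) / sigma if sigma > 0 else 0.0)
    return scores

def flag_flooded(history, current, threshold=-3.0):
    """Flag pixels whose backscatter dropped far below the historical norm.

    The threshold is an illustrative assumption, not a published value.
    """
    return [z < threshold for z in flood_z_scores(history, current)]

# Three pixels, four historical acquisitions (hypothetical values in dB).
history = [
    [-8.0, -12.0, -9.5],
    [-8.4, -11.6, -9.0],
    [-7.8, -12.3, -9.8],
    [-8.2, -11.9, -9.3],
]
current = [-8.1, -19.5, -9.4]
flags = flag_flooded(history, current)
```

In this toy input only the second pixel departs sharply from its own history, so it is the only one flagged. A real workflow would also need the auxiliary data the abstract mentions to separate floods from other causes of low backscatter.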
Combining machine learning and process engineering physics towards enhanced accuracy and explainability of data-driven models
177 Citations 2020 Timur Bikmukhametov, Johannes Jäschke
Computers & Chemical Engineering
By adding physics-based models to machine learning, it is possible not only to improve the performance of the purely black-box machine learning models, but also to make them more transparent and interpretable.
Aircraft engine remaining useful life estimation via a double attention-based data-driven architecture
235 Citations 2022 Lu Liu, Xiao Song, Zhetao Zhou
Reliability Engineering & System Safety
Remaining useful life (RUL) estimation has been intensively studied, given its important role in prognostics and health management (PHM) of industry. Recently, data-driven structures such as convolutional neural networks (CNNs), have achieved outstanding RUL prediction performance. However, conventional CNNs do not include an adequate mechanism for adaptively weighing input features. In this paper, we propose a double attention-based data-driven framework for aircraft engine RUL prognostics. Specifically, a channel attention-based CNN was utilized to apply greater weights to more significant f...
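As a rough illustration of what channel attention means, the sketch below reweights the channels of a small feature map. It is a simplified stand-in, not the architecture from the paper: there are no learned parameters, and a softmax over per-channel averages replaces the trained gating layers a real network would use. The function name and the input values are hypothetical.

```python
import math

def channel_attention(features):
    """Reweight the channels of a (channels x time) feature map.

    Squeeze: average each channel over time.
    Excite: softmax over the channel averages (a stand-in for the
    learned gating layers of a trained network).
    Scale: multiply every value in a channel by that channel's weight.
    """
    squeezed = [sum(ch) / len(ch) for ch in features]
    peak = max(squeezed)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in squeezed]
    total = sum(exps)
    weights = [e / total for e in exps]
    scaled = [[w * x for x in ch] for w, ch in zip(weights, features)]
    return weights, scaled

# Two hypothetical sensor channels over three time steps.
features = [[0.9, 1.1, 1.0], [0.1, 0.0, 0.2]]
weights, scaled = channel_attention(features)
```

The channel with the larger average activation receives the larger weight, which is the "apply greater weights to more significant features" behaviour in miniature.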
Engineering Is Elementary: An Engineering And Technology Curriculum For Children
162 Citations 2020 Kate Hester, Christine M. Cunningham
journal unavailable
As our society becomes increasingly dependent on engineering and technology, it is more important than ever that everyone have a basic understanding of what engineers do, and the uses and implications of the technologies they create. Yet few citizens are technologically literate, in large part because technology and engineering are not taught in our schools [1]. Just as it is important to begin science instruction in the elementary grades by building on children's curiosity about the natural world, it's important to begin engineering instruction in elementary school by building on children's natu...
Atomically engineered, high-speed non-volatile flash memory device exhibiting multibit data storage operations
116 Citations 2023 Ghulam Dastgeer, Sobia Nisar, Aamir Rasheed + 6 more
Nano Energy
Non-volatile memory devices, which offer large capacity and mechanical dependability as a mainstream technology, have played a key role in fostering innovation in modern electronics. Despite the advantages of non-volatile memory devices, their low ON/OFF ratio and slow operational speed have limited their performance compared to their volatile counterparts. In this study, we present a non-volatile floating-gate memory device based on van der Waals heterostructures, which exhibits ultrahigh-speed memory operations in the range of a hundred nanoseconds with an extinction ratio of up to 10^6. T...
Dynamic predictive maintenance for multiple components using data-driven probabilistic RUL prognostics: The case of turbofan engines
108 Citations 2023 Mihaela Mitici, Ingeborg de Pater, Anne Barros + 1 more
Reliability Engineering & System Safety
The increasing availability of condition-monitoring data for components/systems has incentivized the development of data-driven Remaining Useful Life (RUL) prognostics in the past years. However, most studies focus on point RUL prognostics, with limited insights into the uncertainty associated with these estimates. This limits the applicability of such RUL prognostics to maintenance planning, which is per definition a stochastic problem. In this paper, we therefore develop probabilistic RUL prognostics using Convolutional Neural Networks. These prognostics are further integrated into maintenan...
Illuminating Engineering
Laura J. Bottomley, Elizabeth A. Parry
North Carolina State University/Science Surround, Session 2480
Engineering is a difficult profession to explain to the average person, much less student, and is probably one of the most frequently misunderstood. The session described in this paper was developed to put engineering in common terms for the lay person, as well as provide an interesting and fun way to explore different concentration areas of the p...
Aerodynamics for Engineers
350 Citations 2025 John J. Bertin, Mike L. Smith
Cambridge University Press eBooks
Revised and expanded to reflect cutting-edge innovation in aerodynamics, and packed with new features to support learning, the seventh edition of this classic textbook introduces the fundamentals of aerodynamics using clear explanations and real-world examples. Structured around clear learning objectives, this is the ideal textbook for undergraduate students in aerospace engineering, and for graduate students and professional engineers seeking a readable and accessible reference. Over 10 new Aerodynamics Computation boxes that bring students up to speed on modern computational approaches for p...
Aerodynamics for Engineers
174 Citations 2021 John J. Bertin, Russell M. Cummings
Cambridge University Press eBooks
Now reissued by Cambridge University Press, this sixth edition covers the fundamentals of aerodynamics using clear explanations and real-world examples. Aerodynamics concept boxes throughout showcase real-world applications, chapter objectives provide readers with a better understanding of the goal of each chapter and highlight the key 'take-home' concepts, and example problems aid understanding of how to apply core concepts. Coverage also includes the importance of aerodynamics to aircraft performance, applications of potential flow theory to aerodynamics, high-lift military airfoils, subsoni...
Combustion Engineering
141 Citations 2022 Kenneth M. Bryden, Kenneth W. Ragland, Song‐Charng Kong
journal unavailable
Combustion Engineering, Third Edition introduces the analysis, design, and building of combustion energy systems. It discusses current global energy, climate, and air pollution challenges and considers the increasing importance of renewable energy sources, such as biomass fuels. Mathematical methods are presented, along with qualitative descriptions of their use, which are supported by numerous tables with practical data and formulae, worked examples, chapter-end problems, and updated references. The new edition features new and updated sections on solid biofuels, spark-ignition, compression-i...
It is argued that many limitations of traditional organoid culture can be addressed by engineering approaches at all levels of organoid systems, and engineering approaches, including cellular engineering, designer matrices and microfluidics, are investigated to improve the reproducibility and physiological relevance of organoids.
The textbook's 4th edition features 600 new and revised end-of-chapter problems and applications, new topics such as energy harvesting and renewable energy, and a host of online videos and experiments. It is aimed at upper-level undergraduates taking classes in electromagnetics.
Designed to cover the fundamental concepts of thermodynamics used in engineering, the book introduces topics such as the laws of thermodynamics, exergy analysis, thermodynamic cycles, measurement theory, and applications. Using step by step examples and numerous illustrations, the book is designed with a self-teaching methodology, including a variety of exercises with corresponding answers to enhance mastery of the content. The book provides an engineer with a basic understanding or review of thermodynamic principles.
Contents: Preface; 1 Introduction; 2 Vector Algebra; 3 Vector Calculus; 4 Electrostatics; 5 Magnetostatics; 6 Maxwell's Equations for Time-Varying Fields; 7 Plane-Wave Propagation; 8 Transmission Lines; 9 Wave Reflection and Transmission; 10 Radiation and Antennas
Engineered probiotics
117 Citations 2022 Junheng Ma, Yuhong Lyu, Xin Liu + 5 more
Microbial Cell Factories
The theoretical basis of gene editing technology is introduced, and recent research on engineered probiotics is reviewed, focusing on inflammatory bowel disease, bacterial infection, tumors and metabolic diseases.
Machine learning data-driven approaches for land use/cover mapping and trend analysis using Google Earth Engine
171 Citations 2021 Bakhtiar Feizizadeh, Davoud Omarzadeh, Mohammad Kazemi Garajeh + 2 more
Journal of Environmental Planning and Management
This study utilizes machine learning algorithms on the GEE cloud computing platform for land use/land cover (LULC) mapping and change detection analysis using a Landsat satellite image time series and confirms the potential of machine learning techniques for time series LULC mapping on the GEE platform while lowering the barriers to analyzing large amounts of satellite data.
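Analisis Mutu Data Time Series Covid-19: Studi kasus di Covid-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [Quality Analysis of Covid-19 Time Series Data: A Case Study of the Covid-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University]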
116 Citations 2020 Dhika Surya Pangestu, Yuyun Hidayat
journal unavailable
Covid-19 is an infectious disease caused by SARS-CoV-2, a type of coronavirus. From its emergence in late 2019 until 2 August 2020, more than 17.7 million people worldwide had been infected. During that period, various studies emerged to examine the Covid-19 pandemic, among them research on the growth in the number of Covid-19 cases. One of the many datasets used to study the growth in Covid-19 case numbers is the data from the COVID-19 Data Repository by the Center for Systems Science...
Engineering Multi‐Cellular Spheroids for Tissue Engineering and Regenerative Medicine
220 Citations 2020 Se‐Jeong Kim, Eun Mi Kim, Masaya Yamamoto + 2 more
Advanced Healthcare Materials
Multi‐cellular spheroids are formed as a 3D structure with dense cell–cell/cell–extracellular matrix interactions, and thus, have been widely utilized as implantable therapeutics and various ex vivo tissue models in tissue engineering. In principle, spheroid culture methods maximize cell–cell cohesion and induce spontaneous cellular assembly while minimizing cellular interactions with substrates by using physical forces such as gravitational or centrifugal forces, protein‐repellent biomaterials, and micro‐structured surfaces. In addition, biofunctional materials including magnetic nan...
Mapping the Land Cover of Africa at 10 m Resolution from Multi-Source Remote Sensing Data with Google Earth Engine
101 Citations 2020 Qingyu Li, Chunping Qiu, Lei Ma + 2 more
Remote Sensing
After experimental evaluation of different land cover classes across different cities, it is concluded that continental land cover mapping results can be considerably improved when training samples of natural land cover classes are collected and combined from areas covering each Köppen climate zone.
Monitoring Forest Change in the Amazon Using Multi-Temporal Remote Sensing Data and Machine Learning Classification on Google Earth Engine
107 Citations 2020 Maria Antonia Brovelli, Yaru Sun, Vasil Yordanov
ISPRS International Journal of Geo-Information
The results demonstrate that such a fusion of satellite observations, machine learning, and cloud processing benefits the analysis of forest dynamics and can provide useful information for the development of forest policies.
Mapping Three Decades of Changes in the Brazilian Savanna Native Vegetation Using Landsat Data Processed in the Google Earth Engine Platform
224 Citations 2020 Ane Alencar, Julia Z. Shimbo, Felipe Lenti + 13 more
Remote Sensing
These results were fundamental in indicating areas with higher rates of change over a long time series in the Brazilian Cerrado and in highlighting the challenges of mapping distinct NV types in a highly seasonal and heterogeneous savanna biome.
When machine learning engineers work with data sets, they may find the results aren't as good as they need. Instead of improving the model or collecting more data, they can use the feature engineering process to help improve results by modifying the data's features to better capture the nature of the problem. This practical guide to feature engineering is an essential addition to any data scientist's or machine learning engineer's toolbox, providing new ideas on how to improve the performance of a machine learning solution. Beginning with the basic concepts and techniques, the text builds up t...
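To make "modifying the data's features to better capture the nature of the problem" concrete, here is a minimal sketch that derives a few features from one raw record. The record layout (keys `amount` and `timestamp`), the function name, and the choice of transforms are illustrative assumptions and do not come from the guide itself.

```python
import math
from datetime import datetime

def engineer_features(record):
    """Derive model-friendly features from one raw record.

    record: dict with hypothetical keys "amount" (float) and
    "timestamp" (ISO 8601 string).
    """
    ts = datetime.fromisoformat(record["timestamp"])
    angle = 2 * math.pi * ts.hour / 24
    return {
        # log1p compresses heavy-tailed amounts into a narrower range
        "log_amount": math.log1p(record["amount"]),
        # cyclic encoding keeps 23:00 numerically close to 00:00
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
        # weekday() returns 5 or 6 for Saturday and Sunday
        "is_weekend": ts.weekday() >= 5,
    }

features = engineer_features({"amount": 99.0, "timestamp": "2024-03-02T06:00:00"})
```

None of these transforms changes the model or adds data. They only re-express the same record in a form that many models find easier to use, which is the trade the passage describes.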
The book presents concepts and equations of equilibrium thermodynamics or thermostatics. Key features that distinguish this book from others on chemical engineering thermodynamics are: a mathematical treatment of the developments leading to the discovery of the internal energy and entropy; a clear distinction between the classical thermodynamics of Carnot, Clausius and Kelvin and the thermostatics of Gibbs; an intensive/specific variable formalism from which the extensive variable formalism is obtained as a special case; a systematic method of obtaining the central equations of thermostatics w...
This chapter discusses the development of financial models for the planning and execution of wind projects, as well as environmental impact and other aspects of the business.
Principles of Tissue Engineering
218 Citations 2020 Janos M. Kanczler, Julia Anne Wells, David Gibbs + 3 more
Elsevier eBooks
This research highlights the need to understand more fully the role of “cell reprograming” in the development of Parkinson’s disease.
Engineered Living Hydrogels
190 Citations 2022 Xinyue Liu, María Eugenia Inda, Yong Lai + 2 more
Advanced Materials
Living biological systems, ranging from single cells to whole organisms, can sense, process information, and actuate in response to changing environmental conditions. Inspired by living biological systems, engineered living cells and nonliving matrices are brought together, which gives rise to the technology of engineered living materials. By designing the functionalities of living cells and the structures of nonliving matrices, engineered living materials can be created to detect variability in the surrounding environment and to adjust their functions accordingly, thereby enabling ap...
This book considers requirements engineering as a combination of three concurrent and interacting processes: eliciting knowledge related to a problem domain, ensuring the validity of such knowledge and specifying the problem in a formal way.
Phase engineering of nanomaterials
708 Citations 2020 Ye Chen, Zhuangchai Lai, Xiao Zhang + 4 more
Nature Reviews Chemistry
Phase has emerged as an important structural parameter - in addition to composition, morphology, architecture, facet, size and dimensionality - that determines the properties and functionalities of nanomaterials. In particular, unconventional phases in nanomaterials that are unattainable in the bulk state can potentially endow nanomaterials with intriguing properties and innovative applications. Great progress has been made in the phase engineering of nanomaterials (PEN), including synthesis of nanomaterials with unconventional phases and phase transformation of nanomaterials. This Review prov...
Engineering cytokine therapeutics
173 Citations 2023 Jeroen Deckers, Tom Anbergen, Ayla M. Hokke + 13 more
Nature Reviews Bioengineering
How the development of bioanalytical methods, such as sequencing and high-resolution imaging combined with genetic techniques, has facilitated a better understanding of cytokine biology is discussed, and bioengineering approaches for the design of clinically applicable and safe cytokine-based therapeutics are highlighted.