Dive into a curated selection of the most influential research papers on Data Engineering. This collection covers groundbreaking approaches, methodologies, and applications that are shaping the future of this critical field. Expand your knowledge and keep up with the latest trends and innovations in Data Engineering.
Lorna M. Smith
journal unavailable
This guide to finding scientific and engineering raw data for analysis, comparison and research helps scientists and engineers find the best sources of data for their research.
Mike Tamir, Steven Miller, A. Gagliardi
Labor: Personnel Economics eJournal
Businesses are quickly realizing that data scientists can only go so far without the team in place to support their day-to-day work, but more importantly to operationalize their work.
Patrick Petersen, Hanno Stage, Jacob Langner + 4 more
2022 IEEE International Symposium on Systems Engineering (ISSE)
This paper aims to take a step towards the introduction of a data engineering process in data-driven automotive systems engineering by putting a spotlight on developing well-designed data sets as the central element for training and validating AI-based software.
余建兴
journal unavailable
The embodiment of the invention discloses a method for establishing a data screening engine, and the data screening engine itself, to solve the technical problem that, under the current manual method, blacklist and whitelist rules are hard to summarize and distinguish from a massive number of user behaviors.
Randall D. Tardy, Steve C. Brown, Mo Harmon + 1 more
Transportation Research Record
A consortium of DOTs, consultants, and vendors was formed in June 1998 to define, document, prototype, and disseminate standard engineering data formats, EAS-E, which will enable vendors to provide importing and exporting engines within their proprietary software products that will survive the test of time.
Ranjana Rajiv Gaikwad
Journal of Research in Science and Engineering
This work considers an important aspect of working with data: data security and confidentiality in Data Engineering.
Vishwanadham Mandala
International Journal of Scientific Research and Management (IJSRM)
The technical plan of the K-SAP is outlined and the experiences of the first two years are discussed, including data integration, transforming the data, and loading the data into a database so the data can be managed by data professionals, business analysts, and data scientists.
V. I. Nnebedum, Nigeria Kamalu
International Journal of Computer Applications
This paper is a tutorial presentation on data management and data engineering, focusing more on the critical issues that are relevant to both studies but justifying the recent paradigm move towards data engineering.
Jan-Micha Bodensohn, Ulf Brackmann, Liane Vogel + 3 more
journal unavailable
A recent line of work applies Large Language Models (LLMs) to data engineering tasks on tabular data, suggesting they can solve a broad spectrum of tasks with high accuracy. However, existing research primarily uses datasets based on tables from web sources such as Wikipedia, calling the applicability of LLMs for real-world enterprise data into question. In this paper, we perform a first analysis of LLMs for solving data engineering tasks on a real-world enterprise dataset. As an exemplary task, we apply recent LLMs to the task of column type annotation to study how the data characteristics af...
This work outlines three diverse applications to the economics of information; to life-cycle employment, earnings, and spending; and to public policy analysis and provides a general overview of the engineering process.
M. Drumond, Alexandros Daglis, Nooshin Mirzadeh + 5 more
2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA)
This paper's thesis is that efficient near-memory processing (NMP) calls for an algorithm-hardware co-design that favors algorithms with sequential accesses to enable simple hardware that accesses memory in streams; it introduces an instance of such a co-designed NMP architecture for data analytics, the Mondrian Data Engine.
Peter Chen, Wesley Chu, Jane Liu + 1 more
journal unavailable
A novel multi-web-platform voting framework is proposed that incorporates four sets of features (content, linguistic, similarity, and sentiment); it gives promising results and effectively predicts misleading information.
G. PrajithP, Minu Augustine, Student
journal unavailable
The Python-based pipeline not only provides an efficient and scalable solution for handling OLX data but also facilitates easy integration with popular analytics tools, empowering users to derive actionable insights for strategic decision-making.
This paper illustrates existing problems in the implementation of regulations and, in light of various nonstandard behaviors, proposes suggestions on improving the quality of personnel, strengthening the standard management of engineering data, and increasing the strength of monitoring, offering important references for ensuring and improving engineering quality.
This article provides an overview of the key technologies used to implement various types of search engines, including document search engines, database search engines, document metasearch engines, and database metasearch engines.
This article shows that open-source data sets are the rocket fuel for research and innovation at even some of the largest AI organizations, and analysis of nearly 2000 research publications from Facebook, Google and Microsoft over the past five years shows the widespread use and adoption of open data sets.
Michael Duck, Peter Bishop, Richard Reed + 1 more
journal unavailable
This book discusses in detail the lower layers of the OSI model and includes a thorough description of coding - source, channel and line.
The author points out that engineering data should be authentic, accurate, timely, and integrated to provide reliable references for the class evaluation of construction quality.
The architecture and instruction set are based on earlier work on a Pascal compiler for the CS-138 class at Caltech during the winter of 1979; the design features a microcoded control section.
Diego Calvanese, Avigdor Gal, D. Lanti + 3 more
journal unavailable
A comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies is identified, building on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling.
M. Moore
Healthcare informatics : the business magazine for information and communication systems
Results: improved management of data exchange among disparate systems, and up-front studies of vendor offerings, potential problems, and long-term needs.
The importance of data abstraction in the specification, design and implementation of large systems raises the question as to whether such methods may be applied in the context of programming languages designed before the widespread use of abstraction techniques.
Sergio España, Chris van der Maaten, Jens Gulden + 1 more
journal unavailable
Information and communication technology (ICT) brings about numerous advantages across various domains of our lives. However, alongside these benefits, there is a growing awareness of its potential negative ethical, social, and environmental impacts. Consequently, stakeholders ranging from conceptual modellers to policy makers often find themselves grappling with ethical considerations stemming from ICT engineering and usage. This paper presents a review of 10 ethical reasoning methods suitable for the ICT domain. We have employed a method engineering technique to author metamodels for the met...
This paper describes the combination of visualization techniques with animation to visualize geometrical and temporal engineering data created as part of a two vehicle accident investigation. The exemplar case discussed involved recreating the collision between an airborne motorcycle having jumped a hill and a three wheeled all terrain vehicle approaching the hill from the side opposite that of the motorcycle. Potential visibility of the all terrain vehicle's safety flag was critical to a litigation. The paper discusses the visualization of the recreated collision as viewed from a number of va...
Vishwanadham Mandala
International Journal of Engineering and Computer Science
The government of Tegal city currently has no medium for publicizing land use, and public outreach on land use is rare, which may cause problems; the Land Use Mapping System for the Tegal City area is proposed as a solution.
A general DRE template both as an activity model and as a data model to be populated with reverse engineered data is described, showing how DRE has been used to address organizational data integration problems.
Foreword - the Institution of Mechanical Engineers. Preface. Introduction - the Role of Technical Students. Web Sites: Quick References. Section 1:- Important Regulations and Directives. 1.1 The pressure equipment directive: SI 2001: 1999. 1.2 The pressure systems regulations: SI 128. 1.3 Health and Safety at work. 1.4 The European machinery directive. 1.5 Inspection terms and bodies: EN 45004: 1995. 1.6 UKAS accreditation marks. Section 2:- Units. 2.1 The Greek alphabet. 2.2 Units systems. 2.3 Units and conversions. 2.4 Consistency of units. 2.5 Dimensional analysis. 2.6 Essential engineering...
This article shows how, with a little data manipulation, things aren't always as they seem.
This document describes ASHRAE Design Conditions, which are defined as follows: Air to Air Heat Recovery, Humidification Systems, and Economizer System Savings.
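As part of a fundamental study on purifying wet-process phosphoric acid (produced by the nitric acid decomposition method) by solvent extraction, liquid-liquid equilibria for the methyl isobutyl ketone-benzene-water system were measured at 25°C. The equilibrium data for this system are shown in a triangular diagram, and the tie-line data were examined using Othmer-Tobias plots.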
An overview of the current research and development directions in knowledge and data engineering is provided, with respect to programmability and representation, design tradeoffs, algorithms and control, and emerging technologies.
M. Atkinson, Rob Baxter, P. Brezany + 5 more
journal unavailable
The architecture outlined is of a general nature and should be considered by practitioners across a wide range of (data-centric) specialities, for example those working with programming paradigms such as Hadoop, MapReduce, and related technologies.
A. Klee, L. Minton
Journal of the Technical Councils of ASCE
Solutions to two major data defects (autocorrelation and multicollinearity) commonly encountered in the use of multivariable linear regression analysis when applied to engineering data are described. A computer program, incorporating a new method for the determination of the ridge parameter, is introduced for their implementation. The methods described are then applied to a complex data set involving a demonstration of the use of densified refuse-derived fuel burned in an industrial spreader stoker boiler. The analysis shows that, had only the more conventional regression analysis been applied,...
Recovery and coherency-control protocols for fast inter-system page transfer and fine-granularity locking in a shared disks transaction environment.
An introduction to computer based data management is presented with an orientation toward the needs of engineering application and a link to familiar engineering applications of computing is established through a discussion of data structure and data access procedures.
R. Lukyanenko, V. Storey, Óscar Pastor
journal unavailable
This research proposes that the often-overlooked notion of “system” should be a separate, and core, conceptual modeling construct and argues for incorporating it and related concepts, such as emergence, into existing approaches to conceptual modeling.
This manual contains detailed information on the T0 vector microprocessor, including information required to build T0 into a system, instruction execution timings, and information on low level T0 software interfaces required for operating system support.
This thesis covers the development, implementation, and testing of a “Data Transfer Engine” (DTE) in the next-generation system platform that Swedish Space Corporation ...
D. Cockshoot
Computing & Control Engineering Journal
John Brown E&C is the world's largest chemical plant engineering and construction company employing 11000 staff in a multitude of offices, implementing a strategy for the use of information systems by engineers and designers in each of its many worldwide company operations.
S. Yau
1987 IEEE Third International Conference on Data Engineering
The panelists will discuss various aspects of this subject and exchange their ideas and experiences in this area in order to develop a knowledge-based software environment so that artificial intelligence techniques can be applied to large-scale software development and maintenance.
J. Lehman
journal unavailable
The DQE methodology provides a proven means for designing and implementing a workable migration strategy that features incremental improvement in system operations, implementation of a working data repository, and, most important, restored quality in the data values maintained in database files.
P. Thomas
Proceedings of the American Institute of Electrical Engineers
The Engineering Data Sub-committee was created by the present administration and has therefore no established precedents to indicate the proper scope of its activities. The appointment of the committee grew out of the work of the original High-Tension Transmission Committee of ten years ago, under the leadership of its chairman, Mr. Mershon, in collecting engineering data about high-tension transmission plants and the similar work of the High-Tension Transmission Committee last year undertaken at the suggestion of Mr. Mershon, then President. The general approval of this collection of high-ten...
H. Kopp, R. Trettau, B. Zolotar
journal unavailable
The Data Engineering System is a computer-based system that organizes technical data and provides automated mechanisms for storage, retrieval, and engineering analysis that is intended for the engineer who must operationally understand an existing or planned design or who desires to carry out additional technical analysis based on a particular design.
D. Widegren, S. Petit, B. Rousseau + 3 more
journal unavailable
The CERN Engineering and Equipment Data Management System (EDMS) is one of the largest and most complex data management systems of its kind, providing multi-level data verifications, an advanced queuing mechanism and batch processing of large amounts of import requests.
This paper parallelizes two traditional data mining algorithms, Apriori and PageRank, using Spark's in-memory computing module and several of its action and transformation operators.
G. Liebchen, M. Shepperd
journal unavailable
The extent and types of techniques used to manage quality within software engineering data sets are assessed, and the community needs to consider the quality and appropriateness of the data set being utilised.
M. Holsheimer, F. Kwakkel
journal unavailable
The interaction between Data Surveyor and its DBMS back-ends is described: an extended relational algebra, the Data Cube Algebra, encodes the mining requests, and a drill engine produces optimized code for several database back-ends.
Peter D. Turney
ArXiv
Manufacturing process data present special problems for feature engineering, since the data have multiple levels of granularity (detail, resolution).
Felix Heine, Carsten Kleiner, Arne Koschel + 1 more
journal unavailable
The goal of the system is to provide DWH projects with an easily and quickly deployable solution to assess data quality while still providing the highest flexibility in the definition of the assessment rules.
Zhong Jun-jian
Journal of Southern Institute of Metallurgy
The paper discusses methods of data collection and data processing in reverse engineering, with emphasis on analyzing the different characteristics of, and solutions for, data collection, data processing, and surface reconstruction.