User-Defined Functions in Modern Data Engines

8 Citations • 2023
Ioannis Foufoulas, A. Simitsis
2023 IEEE 39th International Conference on Data Engineering (ICDE)

This tutorial presents recent advancements in the problem of efficient UDF execution in modern data engines, involving a broad scope of solutions ranging from algebraic, cost-based optimization to low-level, physical query optimization, compilation, and execution.

Abstract

Modern data management applications involve complex processing tasks over large volumes of data. Although this falls naturally within the scope of relational databases, many such tasks cannot be expressed in SQL and require additional expressive power achieved via user-defined functions (UDFs). However, efficient processing of UDFs in data engines hinges on dealing with the impedance mismatch between UDF execution and SQL processing. In recent years, the problem of efficient UDF execution in modern data engines has gained significant traction. In this tutorial, we present recent advancements in this area, involving a broad scope of solutions ranging from algebraic, cost-based optimization to low-level, physical query optimization, compilation, and execution. We also describe limitations and open issues, and discuss promising future research directions.