
Confr - A Configuration System for Machine Learning Projects

1 Citation · 2022
M. Arro
journal unavailable

This paper outlines the design and usage of confr, a concise and flexible configuration system for Python-based machine learning projects. It combines some of the capabilities of commonly used systems into a library that aims to reduce repetitive code and maintenance overhead.

Abstract

Finding a performant machine learning model usually requires exploring different combinations of model hyperparameters, preprocessing steps, data generation and training logic. To facilitate a clear analysis of the factors that determine accuracy, it is useful to make the data processing and training pipeline highly configurable, such that a combination of a code version and a configuration file uniquely determines the behaviour of the system. A poor configuration system can lead to repetitive code that is hard to maintain and understand, and that is brittle due to insufficient configuration validation logic. This paper outlines the design and usage of confr, a concise and flexible configuration system geared towards Python-based machine learning projects. It combines some of the capabilities of commonly used systems (such as gin-config, OmegaConf, and Hydra) into a library which aims to reduce repetitive code and maintenance overhead. It can be used in both notebook-based and script-based workloads, and it can help ensure that there is no accidental difference between inference-time and training-time behaviour.
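
The general idea in the abstract, that a configuration uniquely determines behaviour and that missing values are caught early, can be illustrated with a minimal sketch in plain Python. This is not confr's actual API: the names `load_config`, `configurable`, and `preprocess`, and the `function.argument` key scheme, are hypothetical and chosen only for illustration.

```python
import inspect
from functools import wraps

# Hypothetical in-memory store; a real system would populate it from a file.
_CONFIG = {}


def load_config(values):
    """Replace the active configuration with a flat dict of values."""
    _CONFIG.clear()
    _CONFIG.update(values)


def configurable(fn):
    """Fill arguments the caller omitted from the active configuration."""
    sig = inspect.signature(fn)

    @wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind_partial(*args, **kwargs)
        for name, param in sig.parameters.items():
            if name in bound.arguments:
                continue  # explicitly passed arguments always win
            key = f"{fn.__name__}.{name}"
            if key in _CONFIG:
                bound.arguments[name] = _CONFIG[key]
            elif param.default is inspect.Parameter.empty:
                # Basic validation: fail loudly instead of guessing a value.
                raise KeyError(f"missing configuration value: {key}")
        return fn(*bound.args, **bound.kwargs)

    return wrapper


@configurable
def preprocess(x, scale, offset=0.0):
    """Toy preprocessing step shared by training and inference."""
    return x * scale + offset


load_config({"preprocess.scale": 2.0})
train_value = preprocess(3.0)  # as called from a training script
infer_value = preprocess(3.0)  # as called from an inference script
```

Because both call sites read `scale` from the same configuration rather than hard-coding it, the training and inference paths cannot silently diverge in this sketch. Passing an argument explicitly, as in `preprocess(3.0, scale=1.0)`, still overrides the configured value.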