login
Home / Papers / Oblivious Sketching for Logistic Regression

Oblivious Sketching for Logistic Regression

16 Citations•2021•
Alexander Munteanu, Simon Omlor, David P. Woodruff
ArXiv

The first data oblivious sketch for logistic regression is presented, which can be computed in input sparsity time over a turnstile data stream and reduces the size of a $d-dimensional data set from $n$ to only $\operatorname{poly}(\mu d\log n)$ weighted points, where $\mu$ is a useful parameter which captures the complexity of compressing the data.

Abstract

What guarantees are possible for solving logistic regression in one pass over a data stream? To answer this question, we present the first data oblivious sketch for logistic regression. Our sketch can be computed in input sparsity time over a turnstile data stream and reduces the size of a $d$-dimensional data set from $n$ to only $\operatorname{poly}(\mu d\log n)$ weighted points, where $\mu$ is a useful parameter which captures the complexity of compressing the data. Solving (weighted) logistic regression on the sketch gives an $O(\log n)$-approximation to the original problem on the full data set. We also show how to obtain an $O(1)$-approximation with slight modifications. Our sketches are fast, simple, easy to implement, and our experiments demonstrate their practicality.