Building a replicated logging system with Apache Kafka

157 Citations, 2015
Guozhang Wang, Joel Koshy, Sriram Subramanian

This abstract describes the design and engineering experience of replicating Kafka logs for various distributed data-driven systems at LinkedIn, including source-of-truth data storage and stream processing.

Abstract

Apache Kafka is a scalable publish-subscribe messaging system whose core architecture is a distributed commit log. It was originally built at LinkedIn as its centralized event pipelining platform for online data integration tasks. Over the past years of developing and operating Kafka, we have extended its log-structured architecture into a replicated logging backbone for a much wider range of applications in distributed environments. In this abstract, we describe our design and engineering experience in replicating Kafka logs for various distributed data-driven systems at LinkedIn, including source-of-truth data storage and stream processing.
