Getting started with Apache Kafka Series – What is Kafka – #1

WHAT IS KAFKA

Apache Kafka is built on the publish-subscribe messaging model. It was developed at LinkedIn and later donated to the Apache Software Foundation as an open-source project. Unlike other message brokers such as RabbitMQ and ActiveMQ, it also has built-in streaming capabilities.

 

 

Real Life Example of RabbitMQ Model

  • A boy delivers newspapers every day. Think of him as the BROKER, the newspaper agency as the PRODUCER, and the readers as the CONSUMERS.

In this case the broker has to be sharp: the delivery boy needs to know each consumer's requirements, such as which customer prefers Hindi news and which one prefers a different newspaper brand.

  • This means the broker needs to be smart and must know which messages have to be delivered to which consumer.
  • Because the broker does all the routing work, this model is less efficient and slows down at very high message volumes.
  • It can process roughly 15-20k messages per second.
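The "smart broker" idea above can be sketched as a small in-memory model. This is only an illustration of the delivery-boy analogy, not RabbitMQ's actual API: the class `SmartBroker`, its methods, and the newspaper names are hypothetical.

```python
from collections import defaultdict


class SmartBroker:
    """Toy push-style broker that knows every consumer's preference (hypothetical)."""

    def __init__(self):
        # routing key (e.g. newspaper language) -> list of consumer inboxes
        self.routes = defaultdict(list)

    def register(self, routing_key, inbox):
        # The broker has to remember what each consumer wants.
        self.routes[routing_key].append(inbox)

    def publish(self, routing_key, message):
        # The broker does the routing work and pushes to matching consumers.
        delivered = 0
        for inbox in self.routes[routing_key]:
            inbox.append(message)
            delivered += 1
        return delivered


broker = SmartBroker()
hindi_reader, english_reader = [], []
broker.register("hindi", hindi_reader)
broker.register("english", english_reader)

broker.publish("hindi", "Hindi daily, Monday edition")
broker.publish("english", "English daily, Monday edition")

print(hindi_reader)
print(english_reader)
```

Every delivery passes through the broker's routing table, which is why the broker carries most of the load in this model.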

Real Life Example of Kafka Model

  • In a campus placement, the results are declared and a copy of the result sheet is pinned to the notice board. Think of that sheet as the BROKER, which holds all the required information.

In this case, the students from different colleges and departments are the CONSUMERS. The students have to be smart enough to find their own results by filtering on details like college name and father's name.

  • Once the producer publishes messages to the broker, the smart consumers subscribe and read the ones they need.
  • Because the broker only stores the messages and the consumers do the filtering, this model is highly efficient and fast at processing large volumes of messages. This makes it a good fit for the big data ecosystem.
  • It can process roughly 80-100k messages per second.
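The notice-board analogy can be sketched in the same toy style. Here the broker only stores records in order, and each consumer keeps its own read position and does its own filtering. The classes `NoticeBoard` and `Student`, and the sample records, are hypothetical and are not part of any Kafka client library.

```python
class NoticeBoard:
    """Toy append-only log: the broker only stores records (hypothetical)."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # position (offset) of the new record

    def read_from(self, offset):
        return self.records[offset:]


class Student:
    """Toy consumer: tracks its own position and filters for itself."""

    def __init__(self, college):
        self.college = college
        self.offset = 0

    def poll(self, board):
        new_records = board.read_from(self.offset)
        self.offset += len(new_records)  # remember how far we have read
        return [r for r in new_records if r["college"] == self.college]


board = NoticeBoard()
board.append({"college": "College A", "name": "Asha", "result": "selected"})
board.append({"college": "College B", "name": "Ravi", "result": "selected"})

student = Student("College A")
first = student.poll(board)   # reads both records, keeps only College A
second = student.poll(board)  # nothing new has been posted since
print(first)
print(second)
```

The board never needs to know who is reading it, so adding more students does not add routing work on the broker side.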

 

Concepts of Kafka

TOPICS – A topic is the named place where the Kafka cluster stores a stream of records.

Kafka has four main APIs –

  1. Producer API – It allows an application to publish messages (a stream of records) to topics in Kafka.
  2. Consumer API – It allows an application to subscribe to topics and process the stream of records produced to them.
  3. Streams API – It allows an application to act as a stream processor that consumes an input stream from a topic and produces an output stream to a topic.
  4. Connector API – It helps in building reusable producers and consumers that connect Kafka topics to existing systems. For example, a connector can link a Kafka topic to an RDBMS.
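To make the four roles concrete, here is a minimal in-memory sketch. It does not use the real Kafka client libraries: the `topics` dictionary stands in for a cluster, every function name is hypothetical, and an in-memory SQLite database plays the part of the RDBMS.

```python
import sqlite3
from collections import defaultdict

# Toy stand-in for a Kafka cluster: topic name -> ordered list of records.
topics = defaultdict(list)


def produce(topic, record):
    """Producer role: publish a record to a topic."""
    topics[topic].append(record)


def consume(topic, offset=0):
    """Consumer role: read records from a topic, starting at an offset."""
    return topics[topic][offset:]


def stream_uppercase(input_topic, output_topic):
    """Streams role: consume an input topic, transform, produce to an output topic."""
    for record in consume(input_topic):
        produce(output_topic, record.upper())


def sink_to_db(topic, connection):
    """Connector role: copy a topic's records into a relational table."""
    connection.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT)")
    connection.executemany(
        "INSERT INTO events (payload) VALUES (?)",
        [(record,) for record in consume(topic)],
    )
    connection.commit()


produce("greetings", "hello")
produce("greetings", "kafka")
stream_uppercase("greetings", "greetings-upper")

db = sqlite3.connect(":memory:")
sink_to_db("greetings-upper", db)
rows = [row[0] for row in db.execute("SELECT payload FROM events ORDER BY rowid")]
print(rows)
```

In a real deployment each role would be a separate application talking to the cluster over the network. The sketch only shows how the responsibilities are divided.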