Apache Kafka Overview (Windows)

Updated: Aug 22, 2020

Apache Kafka is a middleware solution for enterprise applications. It originated at LinkedIn, in an effort led by engineers including Neha Narkhede and Jun Rao. It was initially designed for monitoring and activity tracking, and later became one of the leading Apache projects.

Why Use Kafka?

  • Multiple producers

  • Multiple consumers

  • Disk based persistence

  • Highly scalable

  • High performance

  • Offline messaging

  • Message replay

Kafka Use Cases

1. Enterprise messaging system

  • Kafka implements messaging around topics. One or more consumers can read the messages on a topic and commit their position as the application requires.

  • Suitable for both online and offline message consumers.

2. Message Store with playback capability

  • Kafka retains messages on the topic. The retention period can be configured for a specified duration.

  • Each message is persisted to disk and can be replicated across the brokers in the cluster.

  • Supports storage sizes from 50K to 50 TB.
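
The relationship between retention and playback can be sketched in a few lines of Python. This is an illustration only, not Kafka's storage engine: the `RetainedLog` class and the 60-second retention value are assumptions made for the example. In a real broker, retention is controlled through settings such as `log.retention.hours`.

```python
class RetainedLog:
    """Toy log with time-based retention (illustrative, not Kafka's storage engine)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []      # list of (offset, timestamp, message)
        self.next_offset = 0

    def append(self, timestamp, message):
        """Store a message with the next free offset."""
        self.records.append((self.next_offset, timestamp, message))
        self.next_offset += 1

    def expire(self, now):
        """Drop records older than the retention period; offsets are never reused."""
        cutoff = now - self.retention_seconds
        self.records = [r for r in self.records if r[1] >= cutoff]

    def replay(self, from_offset=0):
        """Re-read every retained message at or after from_offset."""
        return [(off, msg) for off, _, msg in self.records if off >= from_offset]


log = RetainedLog(retention_seconds=60)
log.append(timestamp=0, message="a")
log.append(timestamp=30, message="b")
log.append(timestamp=70, message="c")

first_read = log.replay()     # all three messages
second_read = log.replay()    # reading does not remove anything
log.expire(now=80)            # only "a" is older than 60 seconds
after_expiry = log.replay()
```

Reading never deletes a message; only the retention rule does. That is what makes replay possible for as long as the message is still inside the retention window.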

3. Stream processing

  • Kafka can process messages in real time, either in batches or one message at a time. It also supports aggregating messages over a specified time window.
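
As a rough illustration of aggregating over a time window, the following plain-Python sketch counts events per key in fixed, non-overlapping windows. It does not use the Kafka Streams API; the function name, the event tuples and the 10-second window are assumptions made for the example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns {(window_start, key): count}.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(1, "login"), (4, "login"), (9, "click"), (12, "login"), (19, "click")]
per_window = tumbling_window_counts(events, window_seconds=10)
```

Here the first window (0 to 9 seconds) holds two "login" events and one "click", and the second window (10 to 19 seconds) holds one of each.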

Download and Install Kafka

Kafka requires a JRE and ZooKeeper. Download and install the components below.

  1. JRE : http://www.oracle.com/technetwork/java/javase/downloads/jre8-downloads-2133155.html

  2. ZooKeeper : http://zookeeper.apache.org/releases.html

  3. Kafka : http://kafka.apache.org/downloads.html

Installation (on Windows)

1. JDK Setup

  1. Set the JAVA_HOME variable under the system environment variables via Control Panel -> System -> Advanced system settings -> Environment Variables.

  2. In the “Environment Variables” dialog box you just opened, find the PATH variable in the “System Variables” section.

  3. Edit the PATH variable and append “;%JAVA_HOME%\bin”.

  4. To confirm the Java installation, open cmd and type “java -version”. You should see the version of Java you just installed.
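
If "java -version" is not recognised, the PATH entry is the usual suspect. The sketch below is a small, hypothetical helper (not part of Java or Kafka) that checks whether the bin folder under JAVA_HOME appears in PATH. It uses Python's `ntpath` module so that Windows path rules apply; the install path in the example is made up.

```python
import ntpath


def java_bin_on_path(env):
    """Return True if <JAVA_HOME>\\bin is one of the entries in PATH.

    env: a mapping of environment variables, for example os.environ.
    """
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False

    def canon(path):
        # Ignore case, slash direction and trailing separators when comparing.
        return ntpath.normcase(ntpath.normpath(path))

    expected = canon(ntpath.join(java_home, "bin"))
    entries = [entry for entry in env.get("PATH", "").split(";") if entry]
    return any(canon(entry) == expected for entry in entries)


# Hypothetical environment, for illustration only.
example_env = {
    "JAVA_HOME": r"C:\Java\jre8",
    "PATH": r"C:\Windows\system32;C:\Java\jre8\bin",
}
found = java_bin_on_path(example_env)
```

On a real machine the same function can be called as `java_bin_on_path(os.environ)` after importing `os`.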

2. ZooKeeper Installation:

  1. Go to your ZooKeeper config directory, which sits under the ZooKeeper home directory (e.g. c:\zookeeper-3.4.10\conf).

  2. Rename the file "zoo_sample.cfg" to "zoo.cfg".

  3. Open zoo.cfg in any text editor, find the line dataDir=/tmp/zookeeper and change its value to :\zookeeper-3.4.10\data.

  4. Add a system environment variable for ZooKeeper, as we did for Java.

  5. Under System Variables, add ZOOKEEPER_HOME = C:\zookeeper-3.4.10

  6. Edit the system variable named "PATH" and append ;%ZOOKEEPER_HOME%\bin;

  7. If needed, you can change the default ZooKeeper port (2181) in the zoo.cfg file.

  8. Run ZooKeeper by opening a new cmd window and typing zkserver.
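
The dataDir edit in step 3 can also be scripted. The helper below is a hypothetical sketch that rewrites one key in properties-style text. The sample text is an abbreviated stand-in for zoo.cfg, and the new value assumes the C:\zookeeper-3.4.10 home directory from step 5. Forward slashes are used on the assumption that the file is read as a Java properties file, where a single backslash acts as an escape character.

```python
def set_property(config_text, key, value):
    """Return config_text with key=value, replacing an existing line or appending one."""
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without "=" untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Abbreviated sample of a zoo.cfg file, for illustration only.
sample = (
    "tickTime=2000\n"
    "# the directory where the snapshot is stored.\n"
    "dataDir=/tmp/zookeeper\n"
    "clientPort=2181\n"
)
updated = set_property(sample, "dataDir", "C:/zookeeper-3.4.10/data")
```

The same helper could change `clientPort` if you want ZooKeeper on a port other than 2181.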

3. Kafka Setup:

  1. Go to your Kafka config directory. For me it's C:\kafka_2.10-\config.

  2. Edit the file "server.properties": find the line "log.dirs=/tmp/kafka-logs" and change it to "log.dirs=C:\kafka_2.10-\kafka-logs".

  3. If your ZooKeeper is running on another machine or cluster, change "zookeeper.connect=localhost:2181" to your custom IP and port.

  4. Go to the Kafka installation folder and run the command below from a command line: .\bin\windows\kafka-server-start.bat .\config\server.properties

  5. Kafka will run on the default port 9092 and connect to ZooKeeper’s default port, 2181.
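
One quick way to confirm that both services came up is to check whether their ports accept TCP connections. The sketch below uses only Python's standard `socket` module; the function names are hypothetical, and the host and ports are the defaults mentioned above. An open port only shows that something is listening there, not that it is necessarily Kafka or ZooKeeper.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host="localhost", ports=(("ZooKeeper", 2181), ("Kafka", 9092))):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in ports}


if __name__ == "__main__":
    for name, is_up in report().items():
        print(f"{name}: {'listening' if is_up else 'not reachable'}")
```

If you changed the ZooKeeper port in zoo.cfg or pointed zookeeper.connect at another machine, pass the matching host and ports to `report`.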

Testing Kafka

Creating Topics

  • Now create a topic named “test.topic” with a replication factor of 1, since only one Kafka server is running (standalone setup).

  • If you have a cluster with more than one Kafka server running, you can increase the replication factor accordingly, which improves data availability and fault tolerance.

  • Open a new command prompt in C:\kafka_2.11-\bin\windows, type the following command and hit Enter.

kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test.topic
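
The link between replication factor and the number of running servers can be captured in a small helper. The function below is a hypothetical sketch that assembles the same command as above and refuses a replication factor larger than the broker count, since each replica needs its own broker. The `broker_count` parameter is an assumption added for illustration. The flags mirror the command in this post; newer Kafka releases replace `--zookeeper` with `--bootstrap-server`.

```python
def create_topic_command(topic, broker_count=1, replication_factor=1,
                         partitions=1, zookeeper="localhost:2181"):
    """Build the kafka-topics.bat argument list used in this post."""
    if replication_factor < 1 or partitions < 1:
        raise ValueError("replication factor and partitions must be at least 1")
    if replication_factor > broker_count:
        # Each replica must live on a different broker.
        raise ValueError("replication factor cannot exceed the number of brokers")
    return [
        "kafka-topics.bat", "--create",
        "--zookeeper", zookeeper,
        "--replication-factor", str(replication_factor),
        "--partitions", str(partitions),
        "--topic", topic,
    ]


command = create_topic_command("test.topic")
command_line = " ".join(command)
```

Joining the list with spaces reproduces the command shown above for the standalone setup.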

Creating a Producer

  • Open a new command prompt in the location C:\kafka_2.11-\bin\windows.