I. About Kafka
Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the activity stream data of a consumer-scale website. This activity (page views, searches, and other user actions) is a key ingredient of many social features on the modern web. Because of the throughput requirements, such data is usually handled through logging and log aggregation. For systems that, like Hadoop, deal with log data and offline analysis but also face real-time processing constraints, Kafka is a viable solution. Kafka's aim is to unify online and offline message processing through Hadoop's parallel loading mechanism, and also to provide real-time consumption across a cluster of machines.
II. Preparation
1. Configure each host's IP address. Give every host a static IP (make sure the hosts can reach one another; to avoid unnecessary network traffic, keeping them in the same network segment is recommended).
2. Change the host name of each machine. This must be done on every host in the Kafka cluster.
3. Configure host mapping. Edit the hosts file and add an IP-to-host-name mapping for every host.
4. Open the required ports. The ports configured later in this document must be open (or the firewall turned off); this requires root privileges.
5. Make sure the Zookeeper cluster service is working. In fact, if the Zookeeper cluster has been deployed successfully, the preparation above is essentially already done.
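Steps 3 and 5 above can be sketched in shell roughly as follows. The IP addresses 192.168.1.101 to 192.168.1.103 are placeholders and not taken from this setup; node1 to node3 are the host names used in the test commands later in this document. The Zookeeper check assumes that `nc` is installed and that the server answers the `ruok` four-letter command, so treat it as an illustration rather than a guaranteed health check.

```shell
#!/bin/sh
# Print /etc/hosts mappings for the cluster.
# The IP addresses are placeholders; substitute the real static IPs.
hosts_entries() {
    cat <<'EOF'
192.168.1.101 node1
192.168.1.102 node2
192.168.1.103 node3
EOF
}

# Ask a Zookeeper server whether it is running; a healthy server
# answers "imok". $1 = host, $2 = client port (2181 if omitted).
# Assumes nc is installed and four-letter commands are enabled.
zk_is_ok() {
    [ "$(echo ruok | nc -w 2 "$1" "${2:-2181}" 2>/dev/null)" = "imok" ]
}

# Typical use, as root on every host:
#   hosts_entries >> /etc/hosts
#   for h in node1 node2 node3; do zk_is_ok "$h" && echo "$h: ok"; done
```

The functions only print or test; nothing is written to /etc/hosts until the output is redirected there explicitly.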
III. Installing Kafka
1. Download the Kafka installation package. Visit Kafka's official website and download the required version. This document uses version 2.9.2-0.8.1.1 (Kafka 0.8.1.1 built for Scala 2.9.2).
2. Use the following command to extract the installation package:
tar -zxvf kafka_2.9.2-0.8.1.1.tgz
3. Modify the configuration file. For a simple setup, only config/server.properties needs to be changed.
vim config/server.properties
The following settings need to be modified:
broker.id (the ID of the current server within the cluster, starting from 0); port; host.name (the host name of the current server); zookeeper.connect (the Zookeeper cluster to connect to); log.dirs (the directory where the logs are stored, which must be created in advance).
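As an illustration of the five settings listed above, the following sketch prints one possible set of values for a broker. Port 9092 and the Zookeeper addresses are taken from the test commands later in this document; the log directory /data/kafka-logs and the function name are assumptions made for this example only.

```shell
#!/bin/sh
# Print the server.properties entries that must be set for one broker.
# $1 = broker.id (unique per broker, starting from 0), $2 = host name.
# /data/kafka-logs is a hypothetical path; create the real directory
# in advance, for example with: mkdir -p /data/kafka-logs
render_broker_settings() {
    cat <<EOF
broker.id=$1
port=9092
host.name=$2
zookeeper.connect=node1:2181,node2:2181,node3:2181
log.dirs=/data/kafka-logs
EOF
}

# Example: settings for the first broker.
render_broker_settings 0 node1
```

The printed lines can be compared with, or copied into, config/server.properties on the corresponding node.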
4. Upload the configured Kafka directory to the other nodes:
scp -r kafka node2:/usr/
Note: after uploading, do not forget to change broker.id, host.name, and any other node-specific settings on each node.
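The per-node changes mentioned in the note can be scripted. The sketch below is one possible approach, not part of Kafka itself: the function name is made up for this example, and it only rewrites broker.id and host.name lines that are already present and uncommented in the file.

```shell
#!/bin/sh
# Rewrite broker.id and host.name in a server.properties file.
# $1 = path to the file, $2 = new broker.id, $3 = new host name.
# Lines that are missing or commented out are left untouched.
set_broker_identity() {
    tmp="$1.tmp"
    sed -e "s/^broker\.id=.*/broker.id=$2/" \
        -e "s/^host\.name=.*/host.name=$3/" "$1" > "$tmp" && mv "$tmp" "$1"
}

# Typical use on node2 after the copy (path follows the scp target above):
#   set_broker_identity /usr/kafka/config/server.properties 1 node2
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between platforms.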
IV. Start and test Kafka
1. Start Zookeeper first, then start Kafka with the following command. A message is printed once startup succeeds.
./bin/kafka-server-start.sh config/server.properties &
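Because the broker is started in the background, it can help to confirm that it is actually accepting connections before moving on to the tests. The following is a minimal sketch of such a check; the function name is made up for this example, port 9092 is the broker port used in the test commands below, and the `/dev/tcp` redirection it relies on is a bash feature, so it needs bash rather than plain sh.

```shell
#!/bin/bash
# Return 0 as soon as host:port accepts a TCP connection, or 1 after
# the given number of attempts (one second apart, 30 by default).
# $1 = host, $2 = port, $3 = number of attempts.
wait_for_port() {
    local host=$1 port=$2 tries=${3:-30} i=0
    while [ "$i" -lt "$tries" ]; do
        i=$((i + 1))
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        if [ "$i" -lt "$tries" ]; then
            sleep 1
        fi
    done
    return 1
}

# Typical use after starting the broker on node1:
#   wait_for_port node1 9092 && echo "broker is listening"
```

An open port only shows that the process is listening; the topic test in the next step is still the real functional check.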
2. Test Kafka. Create a topic, a producer, and a consumer, preferably on different nodes. Type messages into the producer console and check that they appear in the consumer console.
Create a topic:
./bin/kafka-topics.sh --zookeeper node1:2181,node2:2181,node3:2181 --topic test --replication-factor 2 --partitions 3 --create
List the topics:
./bin/kafka-topics.sh --zookeeper node1:2181,node2:2181,node3:2181 --list
Create a producer:
./bin/kafka-console-producer.sh --broker-list node1:9092,node2:9092,node3:9092 --topic test
Create a consumer:
./bin/kafka-console-consumer.sh --zookeeper node1:2181,node2:2181,node3:2181 --from-beginning --topic test
Type messages into the producer console and check whether they appear in the consumer console.
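The manual check can also be run non-interactively. The sketch below is an assumption-laden illustration: the helper names are made up, it expects to be run from the Kafka installation directory against the node1 to node3 cluster used above, and it assumes that the console consumer in this version accepts `--max-messages`, which should be confirmed against the tool's help output before relying on it.

```shell
#!/bin/sh
# Join host names with a port into a comma-separated list,
# for example "node1:2181,node2:2181".
# $1 = port, remaining arguments = host names.
join_hosts() {
    port=$1
    shift
    list=""
    for h in "$@"; do
        list="${list:+$list,}$h:$port"
    done
    echo "$list"
}

# Send one message to the "test" topic, then read one message back.
# With --from-beginning the message printed is an early one from the
# topic, not necessarily the one just sent.
smoke_test() {
    zk=$(join_hosts 2181 node1 node2 node3)
    brokers=$(join_hosts 9092 node1 node2 node3)
    echo "hello kafka" | ./bin/kafka-console-producer.sh \
        --broker-list "$brokers" --topic test
    ./bin/kafka-console-consumer.sh --zookeeper "$zk" --topic test \
        --from-beginning --max-messages 1
}

# Typical use, from the Kafka installation directory:
#   smoke_test
```

Building the address lists in one place keeps the Zookeeper and broker lists consistent across the commands.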
After the configuration and testing above, Kafka is initially deployed; from here you can configure and operate it according to your specific needs. For more details on operating and using Kafka, please refer to the official documentation: https://cwiki.apache.org/confluence/display/KAFKA/Index