Oct 5, 2018

Kafka Connect: Setup ElasticSearch Sink Connector to transfer Kafka topic data to ElasticSearch in distributed mode

Elasticsearch is a distributed search and analytics engine built on Apache Lucene, a high-performance indexing and full-text search library.
In previous posts we used Kafka source connectors (FileSourceConnector in standalone and distributed mode, and TwitterSourceConnector) to move data from a data source into a Kafka topic. In this post we will use the ElasticsearchSinkConnector to transfer data from a Kafka topic to Elasticsearch.

Prerequisite: Docker and Docker Compose installed.

Setup the Elasticsearch sink connector
1. Start Docker.
2. Start the Kafka cluster and Elasticsearch on Docker (Landoop fast-data-dev) using docker-compose
➜  Kafka-connect docker-compose up kafka-cluster elasticsearch
Creating network "code_default" with the default driver
Pulling kafka-cluster (landoop/fast-data-dev:cp3.3.0)...
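
The two service names above come from the project's docker-compose.yml. For orientation, the relevant services look roughly like this; this is a sketch, and the image tags, environment variables, and port mappings are assumptions rather than the exact file contents:

```yaml
# Sketch of the relevant docker-compose services (ports and tags are assumptions)
version: '2'
services:
  kafka-cluster:
    image: landoop/fast-data-dev:cp3.3.0   # Kafka, ZooKeeper, Kafka Connect, web UIs
    environment:
      ADV_HOST: 127.0.0.1                  # advertise localhost so clients can connect
    ports:
      - "3030:3030"                        # Landoop web UI (includes the Connect UI)
      - "8083:8083"                        # Kafka Connect REST API
      - "9092:9092"                        # Kafka broker
  elasticsearch:
    image: elasticsearch:5.6               # Elasticsearch version is an assumption
    ports:
      - "9200:9200"                        # Elasticsearch HTTP API
```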

3. Elasticsearch connector config: Go to http://127.0.0.1:3030 and create an Elasticsearch sink connector with the following configuration.
name=sink-elastic-twitter-distributed
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=2
topics=kafka-connect-distributed-twitter
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
connection.url=http://elasticsearch:9200
type.name=kafka-connect
key.ignore=true
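
Instead of the web UI on port 3030, the same connector can also be registered through the Kafka Connect REST API, which fast-data-dev exposes on port 8083. A sketch, assuming the cluster from step 2 is running locally (the file name is just an illustration):

```shell
#!/bin/sh
# Write the same connector properties as above in the JSON shape
# the Kafka Connect REST API expects.
cat > sink-elastic-twitter.json <<'EOF'
{
  "name": "sink-elastic-twitter-distributed",
  "config": {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "2",
    "topics": "kafka-connect-distributed-twitter",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "true",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "true",
    "connection.url": "http://elasticsearch:9200",
    "type.name": "kafka-connect",
    "key.ignore": "true"
  }
}
EOF

# Submit the connector (assumes Kafka Connect is reachable on 127.0.0.1:8083)
curl -X POST -H "Content-Type: application/json" \
     --data @sink-elastic-twitter.json \
     http://127.0.0.1:8083/connectors \
  || echo "Kafka Connect not reachable; start the cluster first"
```

`tasks.max=2` lets Connect run two sink tasks in parallel across the distributed workers; `key.ignore=true` tells the connector to generate document IDs itself rather than use the Kafka record keys.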

4. Topology: the distributed Twitter source connector writes tweets into the Kafka topic, and the sink-elastic-twitter-distributed connector reads them from the topic into Elasticsearch.
5. Monitor the topic kafka-connect-distributed-twitter and check the indexed message count at http://127.0.0.1:9200/kafka-connect-distributed-twitter/_count
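
The _count endpoint returns a small JSON document. A sketch of pulling the count out from the shell; the sample response below is illustrative, not captured from a real run:

```shell
#!/bin/sh
# Against a live cluster you would fetch the response with:
#   response=$(curl -s http://127.0.0.1:9200/kafka-connect-distributed-twitter/_count)
# Sample response shape (the count value here is made up for illustration):
response='{"count":1234,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0}}'

# Extract the "count" field with sed, so no extra tooling (e.g. jq) is needed
count=$(printf '%s' "$response" | sed -n 's/.*"count":\([0-9]*\).*/\1/p')
echo "indexed documents: $count"
```

Re-running the query while the source connector is active should show the count growing as new tweets are indexed.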

