Streaming Loading

ByteHouse lets you stream data directly from Kafka and Kinesis.

A Kafka loading job is long-running and keeps reading messages from your topic. ByteHouse Kafka streaming provides exactly-once delivery, and your data is accessible as soon as it is consumed. You can stop a job at any time to reduce resource usage and resume it whenever necessary. ByteHouse stores the offset internally so that no data is lost across the stop/resume cycle.
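As a rough illustration of why a stored offset makes stop/resume safe, here is a minimal sketch in Python. It is a toy model, not ByteHouse's implementation: the `StreamingJob` class, its fields, and the in-memory list standing in for a Kafka topic are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingJob:
    """Toy model of a pausable loading job (hypothetical, for illustration only)."""
    topic: list                      # stand-in for an ordered Kafka topic partition
    committed_offset: int = 0        # next offset to read; survives pause/resume
    running: bool = False
    table: list = field(default_factory=list)  # stand-in for the target table

    def start(self):
        self.running = True

    def pause(self):
        self.running = False

    def poll(self, max_messages):
        """Consume up to max_messages, advancing the offset with each stored row."""
        if not self.running:
            return 0
        end = self.committed_offset + max_messages
        batch = self.topic[self.committed_offset:end]
        for message in batch:
            self.table.append(message)
            self.committed_offset += 1
        return len(batch)


topic = [f"msg-{i}" for i in range(5)]
job = StreamingJob(topic)
job.start()
job.poll(2)                        # consumes msg-0 and msg-1
job.pause()
topic.extend(["msg-5", "msg-6"])   # producers keep writing while the job is paused
job.poll(10)                       # a paused job reads nothing
job.start()
job.poll(10)                       # resumes at offset 2 and catches up
```

Because the job always resumes from `committed_offset`, messages written while it was paused are picked up on restart, and nothing already loaded is read a second time.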

Two Kafka payload types are currently supported:

  • JSON
  • Protobuf

Supported Kafka/Confluent versions: 0.10+

Create a streaming job

  1. To create a Kafka loading job, go to Import New Data and select Kafka.
  2. Select a saved connection. If no connection exists, create a new source connection. The Kafka source currently supports 4 authentication modes:
  • NONE
  • PLAIN (supports SSL encryption)
  • SCRAM-SHA-256 (supports SSL encryption)
  • SCRAM-SHA-512 (supports SSL encryption)

Enter any easy-to-reference name as the Source name and provide a list of broker addresses, separated by commas (","). If your Kafka brokers require authentication, select an Auth Mode and provide your credentials.

  3. Select a connection, then choose the topic for the job to load. You can optionally create a consumer group for that topic. Then specify the payload format.

  4. Analyze the source topic schema:

  • For a JSON payload, use the Analyze from Kafka feature and specify the row delimiter.
  • For a Protobuf payload, you can optionally upload a JSON-based schema file.

  5. Select the table that this topic will be loaded into. For a first-time load, you can create a new table from the message schema.

  6. Name the job and add a description.

  7. A newly created job is in the paused state. You can now start operating it.
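
Two of the inputs in the steps above, the comma-separated broker list and the row delimiter for JSON payloads, can be sketched in Python as follows. This is only an illustration under stated assumptions: `parse_brokers` and `infer_schema` are hypothetical helpers, not ByteHouse functions, the `host:port` address form is assumed, and the type names are Python's rather than ByteHouse column types.

```python
import json


def parse_brokers(broker_list):
    """Split a comma-separated broker string into (host, port) pairs.

    Assumes each address has the form host:port (an assumption of this sketch).
    """
    brokers = []
    for entry in broker_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers


def infer_schema(payload, row_delimiter="\n"):
    """Split a JSON payload into rows and map each field to a Python type name."""
    schema = {}
    for row in payload.split(row_delimiter):
        if not row.strip():
            continue
        for key, value in json.loads(row).items():
            # Keep the type seen first for each field.
            schema.setdefault(key, type(value).__name__)
    return schema


brokers = parse_brokers("broker-1.example.com:9092, broker-2.example.com:9092")
schema = infer_schema('{"id": 1, "name": "a"}\n{"id": 2, "name": "b", "score": 0.5}')
```

Here the row delimiter is a newline, so the sample payload holds two JSON rows. A field that appears only in later rows, such as `score`, is still added to the inferred schema.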

View streaming jobs

On the data import landing page, you can see all import jobs of every type. You can filter the list by job type.


Operate a streaming job

  1. To start a streaming job, open the streaming job page and click Start.
  2. To stop the streaming job, open the streaming job page and click Pause.
  3. All job history is saved on the job detail page.