Apache Kafka: relation between partitions and streams


  1. What is the relation between the partitions of a topic and the streams created using the Java API?

  2. What is the rationale behind having an API like

     Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = _consumerConnector.createMessageStreams(topicVsPartitionCountMap);

    instead of something like

    List<KafkaStream<byte[], byte[]>> consumerStreams = _consumerConnector.createMessageStreams(partitionCountForTopic);

  1. It is not a map of topic name to partition count, but rather of topic name to the number of streams you would like. Each stream provides an endless iterator and therefore consumes one thread in its intended use (you can combine two streams on one thread, but then you might as well create fewer streams to begin with).

    The relationship to partitions is that one partition always provides its data to the same stream. Therefore, within a stream/thread you are given the same ordering guarantees that are modeled within a Kafka partition. So, if event A happens before event B in partition X, and partition X and partition Y both feed stream 1, then event A is guaranteed to be processed before event B, provided stream 1 is processed in order.

  2. I think this is simply a design decision they made (the two forms are equivalent; you can achieve the same with both). The only other thing I can think of is that every call to createMessageStreams reaches out to ZooKeeper and Kafka to set up the streams, regardless of how many streams you fetch, so there is less overhead when you fetch several in the same call.
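
To make the ordering argument in point 1 concrete, here is a toy model in plain Java. It does not use Kafka at all: the `Event` class, the `streamFor` rule (partition modulo stream count), and the `route` method are hypothetical names invented for this sketch, and the assignment rule is not Kafka's real one. The only property it is meant to show is that when each partition is pinned to exactly one stream, events from the same partition keep their relative order inside that stream.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionStreamSketch {

    /** A message tagged with the partition it was written to. */
    static final class Event {
        final int partition;
        final String payload;

        Event(int partition, String payload) {
            this.partition = partition;
            this.payload = payload;
        }
    }

    // Hypothetical assignment rule for this toy model only (NOT Kafka's real
    // assignment logic): every partition is pinned to exactly one stream.
    static int streamFor(int partition, int streamCount) {
        return partition % streamCount;
    }

    // Route events to streams in arrival order. Because a partition always maps
    // to the same stream, the relative order of its events is preserved there.
    static List<List<String>> route(List<Event> events, int streamCount) {
        List<List<String>> streams = new ArrayList<>();
        for (int i = 0; i < streamCount; i++) {
            streams.add(new ArrayList<>());
        }
        for (Event e : events) {
            streams.get(streamFor(e.partition, streamCount)).add(e.payload);
        }
        return streams;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(0, "A"));   // partition 0: A happens before B
        events.add(new Event(2, "y1"));  // partition 2 shares stream 0 with partition 0
        events.add(new Event(1, "z1"));  // partition 1 goes to stream 1
        events.add(new Event(0, "B"));

        List<List<String>> streams = route(events, 2);
        for (int i = 0; i < streams.size(); i++) {
            System.out.println("stream " + i + ": " + streams.get(i));
        }
    }
}
```

With two streams, partitions 0 and 2 both land in stream 0, and A still comes before B there even though an event from partition 2 is interleaved between them. In a real consumer each of these lists would be an endless iterator drained by its own thread.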
