In a producer-consumer web application, what should be the thought process for creating a partition key for a Kinesis stream shard? Suppose I have a Kinesis stream with 16 shards; how many partition keys should I create? Is it dependent on the number of shards?
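Partition (or hash) key: the hash key space starts at 0 and goes up to 340282366920938463463374607431768211455 (that is, 2^128 - 1). Kinesis hashes each partition key with MD5 into this range to decide which shard receives the record. Let's say the upper bound is ~34020 * 10^34, and I will omit the 10^34 for ease...
If you have 30 shards, uniformly divided, each one should cover about 1134 * 10^34 hash keys. The coverage should look like this: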
shard-00: 0 - 1134
shard-01: 1135 - 2268
shard-03: 2269 - 3402
shard-04: 3403 - 4536
...
shard-28: 30619 - 31752
shard-29: 31753 - 32886
shard-30: 32887 - 34020
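The division above can be reproduced with exact numbers instead of the rounded ones. The following is a minimal sketch, assuming the stream splits the hash key space into equal-width ranges and that the last shard absorbs the remainder; the class and method names (UniformShardRanges, rangeOf) are made up for illustration and are not part of any AWS SDK.

```java
import java.math.BigInteger;

final class UniformShardRanges {
    // Number of hash keys in the space 0 .. 2^128 - 1.
    static final BigInteger KEY_SPACE = BigInteger.ONE.shiftLeft(128);

    // Returns {start, end} (both inclusive) for the shard at 0-based position `index`.
    // Assumption: equal-width ranges, with the last shard absorbing the remainder.
    static BigInteger[] rangeOf(int index, int shardCount) {
        if (shardCount <= 0 || index < 0 || index >= shardCount) {
            throw new IllegalArgumentException("index must be in [0, shardCount)");
        }
        BigInteger width = KEY_SPACE.divide(BigInteger.valueOf(shardCount));
        BigInteger start = width.multiply(BigInteger.valueOf(index));
        BigInteger end = (index == shardCount - 1)
                ? KEY_SPACE.subtract(BigInteger.ONE)
                : start.add(width).subtract(BigInteger.ONE);
        return new BigInteger[] {start, end};
    }

    public static void main(String[] args) {
        int shardCount = 30;
        for (int i = 0; i < shardCount; i++) {
            BigInteger[] range = rangeOf(i, shardCount);
            System.out.println("position " + i + ": " + range[0] + " - " + range[1]);
        }
    }
}
```

The positions printed here are simply 0 to 29 in hash key order; they are not real shard ids, which Kinesis assigns itself.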
And if you have 3 consumer instances (listening to these 30 shards), each one should listen to 10 shards (the optimum balance).
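The balancing arithmetic can be sketched as a simple round-robin deal. This is only an illustration of the 30 / 3 = 10 split; the names (ShardAssignment, assign) and the shard ids are placeholders, and a consumer library such as the Kinesis Client Library does its own balancing rather than using code like this.

```java
import java.util.ArrayList;
import java.util.List;

final class ShardAssignment {
    // Deals shards out round-robin, so the per-consumer counts differ by at most one.
    static List<List<String>> assign(List<String> shardIds, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("consumerCount must be positive");
        }
        List<List<String>> perConsumer = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            perConsumer.add(new ArrayList<>());
        }
        for (int i = 0; i < shardIds.size(); i++) {
            perConsumer.get(i % consumerCount).add(shardIds.get(i));
        }
        return perConsumer;
    }

    public static void main(String[] args) {
        List<String> shards = new ArrayList<>();
        for (int i = 0; i < 30; i++) {
            shards.add("shard-" + i); // placeholder ids, not real Kinesis shard ids
        }
        List<List<String>> result = assign(shards, 3);
        for (int c = 0; c < result.size(); c++) {
            System.out.println("consumer " + c + " -> " + result.get(c).size() + " shards");
        }
    }
}
```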
This also explains the merge and split operations on a stream.
- To merge 2 shards, they should cover adjacent hash key ranges. You cannot merge shard-03 and shard-29.
- You can split any shard. If you split shard-00 in the middle, the distribution will look like this:
shard-31: 0 - 567
shard-32: 568 - 1134
shard-01: 1135 - 2268
shard-03: 2269 - 3402
shard-04: 3403 - 4536
...
shard-28: 30619 - 31752
shard-29: 31753 - 32886
shard-30: 32887 - 34020
See, shard-00 no longer accepts new data. New records put into the Kinesis stream whose hash key falls in the old shard-00 range will be placed under shard-31 or shard-32.
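The merge and split rules can be checked on the simplified numbers used above. This is a minimal sketch, not the Kinesis API: HashRange, isAdjacentTo, splitAt and mergeWith are hypothetical names, and the ranges are the rounded values from this post with the 10^34 factor omitted.

```java
final class HashRange {
    final long start; // inclusive
    final long end;   // inclusive

    HashRange(long start, long end) {
        if (start > end) {
            throw new IllegalArgumentException("start must not exceed end");
        }
        this.start = start;
        this.end = end;
    }

    // Two ranges can be merged only if one begins right after the other ends.
    boolean isAdjacentTo(HashRange other) {
        return this.end + 1 == other.start || other.end + 1 == this.start;
    }

    // Splits into [start, newStart - 1] and [newStart, end].
    HashRange[] splitAt(long newStart) {
        if (newStart <= start || newStart > end) {
            throw new IllegalArgumentException("split point must lie inside the range");
        }
        return new HashRange[] {new HashRange(start, newStart - 1), new HashRange(newStart, end)};
    }

    // Merges two adjacent ranges into one range covering both.
    HashRange mergeWith(HashRange other) {
        if (!isAdjacentTo(other)) {
            throw new IllegalArgumentException("ranges are not adjacent");
        }
        return new HashRange(Math.min(start, other.start), Math.max(end, other.end));
    }

    @Override
    public String toString() {
        return start + " - " + end;
    }

    public static void main(String[] args) {
        HashRange shard00 = new HashRange(0, 1134);
        HashRange shard03 = new HashRange(2269, 3402);
        HashRange shard29 = new HashRange(31753, 32886);

        HashRange[] children = shard00.splitAt(568);
        System.out.println(children[0] + " and " + children[1]);
        System.out.println("shard-03 / shard-29 mergeable: " + shard03.isAdjacentTo(shard29));
    }
}
```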
While sending data to Kinesis (i.e. on the producer side), you should not worry about "which shard the data goes to". Sending a random number (or a UUID, or the current timestamp in millis) is best for scaling and for distributing data across the shards. Unless you are worried about the ordering of records within a single shard, it is best to choose a random or constantly changing partition key for each PutRecord request.
In Java, you can use "putRecordsRequestEntry.setPartitionKey(Long.toString(System.currentTimeMillis()))" or "putRecordRequest.setPartitionKey(Long.toString(System.currentTimeMillis()))" as examples.
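To see why a random key spreads the load, the mapping can be simulated locally. Kinesis turns the partition key into a 128-bit hash key with MD5; the rest of this sketch is an assumption for illustration: the shards are taken to be equal-width ranges, and the names (PartitionKeySpread, shardPosition, countPerShard) are hypothetical. It needs no AWS SDK and never talks to a stream.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

final class PartitionKeySpread {
    // Number of hash keys in the space 0 .. 2^128 - 1.
    static final BigInteger KEY_SPACE = BigInteger.ONE.shiftLeft(128);

    // MD5 digest of the partition key, read as an unsigned 128-bit integer.
    static BigInteger hashKey(String partitionKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // 0-based position of the range containing the key's hash.
    // Assumption: equal-width ranges, with the last one absorbing the remainder.
    static int shardPosition(String partitionKey, int shardCount) {
        BigInteger width = KEY_SPACE.divide(BigInteger.valueOf(shardCount));
        int position = hashKey(partitionKey).divide(width).intValue();
        return Math.min(position, shardCount - 1);
    }

    // Generates random UUID keys and counts how many land in each range.
    static int[] countPerShard(int recordCount, int shardCount) {
        int[] counts = new int[shardCount];
        for (int i = 0; i < recordCount; i++) {
            counts[shardPosition(UUID.randomUUID().toString(), shardCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countPerShard(30_000, 30);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("position " + i + ": " + counts[i] + " records");
        }
    }
}
```

A fixed key always lands in the same range, which is what you want only when the ordering of related records matters.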