hive - How can we build multi-tenancy on top of the Hadoop ecosystem?


We are trying to build multi-tenancy on top of the Hadoop ecosystem.

Our ecosystem typically comprises the Hadoop components HDFS, YARN, Hive, Oozie, and ZooKeeper.

So far, we have looked into concepts like:

  1. HDFS Federation

It federates the distributed storage (HDFS), with a separate NameNode for each federated HDFS partition.

Problem: if we have 2 tenants on a single cluster, and hence 2 NameNodes, then 2 NameNodes imply 2 Hive servers, 2 Oozie servers, and 2 of each of the other Hadoop components, each communicating with its respective NameNode and writing to its respective HDFS partition.

  1. Capacity Scheduler

Compute quotas are enforced for each tenant (say tenant 1 - 50%, tenant 2 - 50%).
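As a rough illustration of the 50/50 split above, the sketch below generates the Capacity Scheduler properties for two queues directly under `root`. The property names (`yarn.scheduler.capacity.root.queues`, `yarn.scheduler.capacity.root.<queue>.capacity`) are the standard ones for `capacity-scheduler.xml`; the queue names `tenant1` and `tenant2` and the helper functions are assumptions made for this example.

```python
from xml.sax.saxutils import escape


def capacity_scheduler_properties(tenant_shares):
    """Map {queue_name: percent} to Capacity Scheduler properties.

    Queues are placed directly under root; their percentages must sum to 100.
    """
    total = sum(tenant_shares.values())
    if total != 100:
        raise ValueError(f"queue capacities must sum to 100, got {total}")
    props = {"yarn.scheduler.capacity.root.queues": ",".join(tenant_shares)}
    for queue, percent in tenant_shares.items():
        props[f"yarn.scheduler.capacity.root.{queue}.capacity"] = str(percent)
    return props


def to_xml(props):
    """Render the properties in the Hadoop <configuration> XML layout."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical tenants, each given half of the cluster's capacity.
    print(to_xml(capacity_scheduler_properties({"tenant1": 50, "tenant2": 50})))
```

The generated fragment would go into `capacity-scheduler.xml`; per-queue access control and user limits are left out of this sketch.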

  1. MapR multi-tenancy features

I could not find how we can develop a similar multi-tenant environment on a Hadoop cluster.

What we are thinking is that each tenant (with its users) will have:

  1. A compute quota (through the Capacity Scheduler)
  2. An HDFS directory for each tenant in HDFS (like /usr/tenant1, /usr/tenant2)

All users belonging to the tenant1 group will have write access to their HDFS directory (/usr/tenant1/username).
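The directory layout described above could be set up roughly as follows. This sketch only builds the `hdfs dfs` command lines as strings and does not run them. It assumes that a Unix group named after the tenant exists and is visible to the NameNode, that the commands are run by an HDFS superuser, and that modes 750 (tenant directory) and 770 (user directory) are acceptable; the user names are hypothetical.

```python
def tenant_setup_commands(tenant, users, base="/usr"):
    """Build the hdfs commands that create one tenant directory and its user directories."""
    tenant_dir = f"{base}/{tenant}"
    commands = [
        f"hdfs dfs -mkdir -p {tenant_dir}",
        # Group members may list and traverse; other tenants get no access.
        f"hdfs dfs -chgrp {tenant} {tenant_dir}",
        f"hdfs dfs -chmod 750 {tenant_dir}",
    ]
    for user in users:
        home = f"{tenant_dir}/{user}"
        commands += [
            f"hdfs dfs -mkdir -p {home}",
            f"hdfs dfs -chown {user}:{tenant} {home}",
            # Owner and tenant group can write; everyone else is locked out.
            f"hdfs dfs -chmod 770 {home}",
        ]
    return commands


if __name__ == "__main__":
    # Hypothetical users of tenant1.
    for command in tenant_setup_commands("tenant1", ["alice", "bob"]):
        print(command)
```

A storage quota per tenant could be added with `hdfs dfsadmin -setSpaceQuota`, which is outside the scope of this sketch.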

The problem we faced was this: when we created a table in Hive as a tenant1 user, the table was created under /apps/hive/warehouse and /apps/oozie/data. We were expecting the table to be created in the user's HDFS home directory, so that tenant1 users would have access to it. That didn't happen.

Can anyone suggest how we should proceed with development to create multi-tenancy on the Hadoop ecosystem?
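For illustration only: Hive places tables under its default warehouse directory (`hive.metastore.warehouse.dir`) unless a location is given, which is presumably why the table ended up under /apps/hive/warehouse. Hive DDL accepts an explicit `LOCATION` clause on both `CREATE DATABASE` and `CREATE TABLE`. The sketch below builds such statements as strings; the `hive` subdirectory, the table name, and the columns are assumptions made for the example, and the identifiers are trusted rather than escaped.

```python
def tenant_database_ddl(tenant, base="/usr"):
    """DDL for a per-tenant database whose tables live under the tenant directory."""
    return f"CREATE DATABASE IF NOT EXISTS {tenant} LOCATION '{base}/{tenant}/hive'"


def tenant_table_ddl(tenant, user, table, columns, base="/usr"):
    """DDL for an external table stored inside one user's directory.

    `columns` is a list of (name, hive_type) pairs.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {tenant}.{table} ({cols}) "
        f"LOCATION '{base}/{tenant}/{user}/{table}'"
    )


if __name__ == "__main__":
    # Hypothetical tenant, user, and table.
    print(tenant_database_ddl("tenant1"))
    print(tenant_table_ddl("tenant1", "alice", "events",
                           [("id", "BIGINT"), ("payload", "STRING")]))
```

Whether tenant users can actually read and write these tables still depends on the HDFS permissions of the target directories and on how HiveServer2 impersonation is configured.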

You can implement multiple namespaces, which should satisfy your requirements to some extent.

I suggest you have a look at the Apache documentation on namespaces:

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/federation.html

You can implement multiple namespaces within a single cluster, each managed by its own NameNode.

Kind regards, Andrew

