We are trying to build multi-tenancy on top of the Hadoop ecosystem.
Our ecosystem typically comprises the Hadoop components HDFS, YARN, Hive, Oozie, and ZooKeeper.
So far, we have looked into concepts like HDFS federation.
It federates the distributed storage (HDFS), with a separate NameNode for each federated HDFS partition.
Problem: if we have 2 tenants on a single cluster, and hence 2 NameNodes, then 2 NameNodes imply 2 Hive servers, 2 Oozie servers, and 2 of each of the other Hadoop components, each communicating with its respective NameNode and writing to its respective HDFS partition.
Compute quotas are enforced for each tenant (say tenant 1 - 50%, tenant 2 - 50%).
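To make the 50/50 split concrete, here is a minimal sketch of how such shares might map onto YARN Capacity Scheduler queue properties. The queue names tenant1 and tenant2 and the helper functions are assumptions for illustration; only the yarn.scheduler.capacity.root.* property names come from the Capacity Scheduler itself.

```python
# Sketch: build Capacity Scheduler properties for a per-tenant compute split.
# Queue names ("tenant1", "tenant2") are illustrative assumptions.

def capacity_scheduler_properties(shares):
    """Map {queue_name: percent} to capacity-scheduler.xml property pairs."""
    if sum(shares.values()) != 100:
        raise ValueError("queue capacities under root must sum to 100")
    # List the child queues of root, then give each one its share.
    props = {"yarn.scheduler.capacity.root.queues": ",".join(shares)}
    for queue, percent in shares.items():
        props[f"yarn.scheduler.capacity.root.{queue}.capacity"] = str(percent)
    return props


def to_xml(props):
    """Render the properties as <property> elements."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append(
            f"  <property><name>{name}</name><value>{value}</value></property>"
        )
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_xml(capacity_scheduler_properties({"tenant1": 50, "tenant2": 50})))
```

The generated fragment would still need to be merged into the cluster's capacity-scheduler.xml and the queues refreshed; this only shows the shape of the configuration.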
I could not find out how we can develop a similar multi-tenant environment on a Hadoop cluster.
What we are thinking is that each tenant (with its users) would have:
- a compute quota (through the Capacity Scheduler)
- an HDFS directory for each tenant (like /usr/tenant1, /usr/tenant2)
All users belonging to the tenant1 group would have write access to their HDFS directory (/usr/tenant1/username).
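The directory layout above could be provisioned roughly as follows. This is a sketch that only generates the hdfs dfs commands rather than running them. The user names, the 750/770 modes, and the helper name are assumptions, not something we have settled on.

```python
# Sketch: generate `hdfs dfs` commands for per-tenant directories.
# Tenant, group, and user names are hypothetical; the modes are one possible choice.
import shlex


def tenant_dir_commands(tenant, group, users, base="/usr"):
    """Return the commands that create /usr/<tenant> and one directory per user."""
    tenant_dir = f"{base}/{tenant}"
    cmds = [
        ["hdfs", "dfs", "-mkdir", "-p", tenant_dir],
        ["hdfs", "dfs", "-chgrp", group, tenant_dir],
        # Group members may list and traverse the tenant directory; others may not.
        ["hdfs", "dfs", "-chmod", "750", tenant_dir],
    ]
    for user in users:
        user_dir = f"{tenant_dir}/{user}"
        cmds.append(["hdfs", "dfs", "-mkdir", "-p", user_dir])
        cmds.append(["hdfs", "dfs", "-chown", f"{user}:{group}", user_dir])
        # Owner and tenant group can write; everyone else is locked out.
        cmds.append(["hdfs", "dfs", "-chmod", "770", user_dir])
    return cmds


if __name__ == "__main__":
    for cmd in tenant_dir_commands("tenant1", "tenant1", ["alice", "bob"]):
        print(shlex.join(cmd))
```

The commands would have to be run by an HDFS superuser, since changing ownership requires it.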
The problem we faced was this: when we created a table in Hive as a tenant1 user, it was created under /apps/hive/warehouse and /apps/oozie/data. We were expecting the table to be created in the user's HDFS home directory, so that tenant1 users would have access to it. That didn't happen.
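This behaviour is consistent with Hive placing tables under its configured warehouse directory unless a location is given explicitly. One option we are considering is a per-tenant database, or external tables, with an explicit LOCATION inside the tenant directory. A minimal sketch that builds such HiveQL statements follows. The database name, table name, columns, and paths are hypothetical, and behaviour may differ between Hive versions.

```python
# Sketch: build HiveQL that pins tenant data under the tenant's HDFS directory.
# Database/table names, columns, and paths below are hypothetical examples.

def tenant_database_ddl(tenant, base="/usr"):
    """CREATE DATABASE with an explicit LOCATION under the tenant directory."""
    return (
        f"CREATE DATABASE IF NOT EXISTS {tenant}_db "
        f"LOCATION '{base}/{tenant}/hive'"
    )


def external_table_ddl(database, table, columns, location):
    """CREATE EXTERNAL TABLE whose data lives at the given HDFS location."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} "
        f"({cols}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    print(tenant_database_ddl("tenant1") + ";")
    print(
        external_table_ddl(
            "tenant1_db",
            "events",
            [("id", "BIGINT"), ("payload", "STRING")],
            "/usr/tenant1/alice/events",
        )
        + ";"
    )
```

The user running the statements still needs HDFS write permission on the chosen location, so this would go together with the directory permissions described earlier.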
Can anyone advise on how we should proceed with development to create multi-tenancy on the Hadoop ecosystem?
You can implement multiple namespaces, to the extent that they satisfy your requirements.
I would ask you to have a look at the Apache page on namespaces:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/federation.html
With federation, you can run multiple namespaces within one cluster, each managed by its own NameNode.
Kind regards,
Andrew