Couchbase Cluster Failover Architecture

I'm referring to the Couchbase Server in the application stack section of this document, which outlines the desired architecture of a Couchbase cluster.

I notice that each of the 5 Couchbase nodes in the diagram has a corresponding web server. I'm aware that Couchbase SDKs are designed to establish a connection to a single node and retain that connection for all requests, with the exception of failover events.

Firstly, I want to confirm that each of the 5 web servers in the diagram establishes a single connection to 1 of the 5 Couchbase nodes. I assume a 1:1 relationship as a result: each web server connects to its corresponding Couchbase node, such that no 2 web servers establish connections to the same Couchbase node.

If that is the case, then in the event of a Couchbase node failure, assuming the node is unrecoverable, should I remove the corresponding web server? This may seem unintuitive, but here is the situation as I understand it:

Couchbase node #1 dies.
Web server #1 (connected to Couchbase node #1) establishes a connection to the next available node, Couchbase node #2 (most SDKs handle this, AFAIK).
Couchbase node #2 now has 2 established connections: web server #2 (its corresponding server) and web server #1 (whose corresponding Couchbase node is dead).

My concern is that I have noticed ephemeral port exhaustion issues with Couchbase Server when establishing more than 1 connection to a single node. This results in client timeouts:

get http://0.0.0.0:8091/pools: dial tcp 0.0.0.0:8091: operation timed out

Again, based on this, should I remove the corresponding web server when a Couchbase node dies, to avoid multiple connections to the same Couchbase node and potential ephemeral port exhaustion?

There is not a 1:1 relationship between a web server and a Couchbase node. Each web server has connections to each Couchbase node. In Couchbase, each node of the cluster holds a percentage of the entire data set as active data, not a full copy. Couchbase automatically shards the data, and these shards (vBuckets) are spread evenly across the entire cluster.

So when a web server or app server is going to read or write an object, it goes to the node in the cluster that owns the vBucket where that object lives. In the Couchbase SDKs there is a consistent hash that takes each object's ID, and the output of that hash is one of 1024 vBucket IDs (numbered 0 to 1023). There are 1024 active vBuckets, and each replica has 1024 as well. The output of the consistent hash is the ID of the vBucket that the object will live in. Make sense? The SDK then looks in its copy of the cluster map (which is updated any time there is a cluster topology change) to find which node of the cluster that shard lives on, and goes to interact with that specific node directly for that object.
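To make the key-to-node lookup described above concrete, here is a minimal Python sketch. It is not SDK code: the CRC32 fold, the node names, and the round-robin cluster map are assumptions chosen for illustration, and a real SDK receives its map from the cluster.

```python
import zlib

NUM_VBUCKETS = 1024
# Hypothetical node names for a 5-node cluster.
NODES = ["node1", "node2", "node3", "node4", "node5"]


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Hash a document ID to a vBucket ID in the range 0..num_vbuckets-1.

    Assumption: a CRC32 of the key, folded down to the vBucket count.
    """
    crc = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


def build_cluster_map(nodes):
    """Toy cluster map: index is the vBucket ID, value is the owning node.

    Assumption: simple round-robin placement, only to spread vBuckets evenly.
    """
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def node_for_key(key: str, cluster_map) -> str:
    """Return the node that holds the active vBucket for this key."""
    return cluster_map[vbucket_for_key(key)]


cluster_map = build_cluster_map(NODES)
```

Calling node_for_key("user::1001", cluster_map) always returns the same node for the same map, which is why every web server needs to be able to reach every Couchbase node: different keys land on different nodes.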

So your failure scenario is not quite correct. If a node of a Couchbase cluster fails, only the vBuckets on that node are unavailable. That is a percentage of the entire data set. If you have auto-failover turned on (it is off by default), then after the timeout set in the cluster, the cluster will automatically fail out the node that is timing out and promote replica vBuckets to active, getting you back to a 100% active data set. The cluster basically sacrifices replica vBuckets to do this. Since this is a topology change, a new cluster map is sent to the client applications' SDKs and life moves on. Also, you then need to rebalance the cluster to regenerate the missing replica vBuckets and get back to normal.
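The failover step can be sketched as a small simulation. This is a toy model only: the map layout, the single replica per vBucket, and the placement rule are assumptions, and the real cluster decides placement itself.

```python
NUM_VBUCKETS = 1024
# Hypothetical node names for a 5-node cluster.
NODES = ["node1", "node2", "node3", "node4", "node5"]


def build_vbucket_map(nodes):
    """Toy map with one replica per vBucket.

    Assumption: active copy on nodes[vb % n], replica on the next node.
    """
    n = len(nodes)
    return [
        {"active": nodes[vb % n], "replica": nodes[(vb + 1) % n]}
        for vb in range(NUM_VBUCKETS)
    ]


def fail_over(vbucket_map, failed_node):
    """Return a new map with failed_node removed.

    Replicas of vBuckets that were active on the failed node are promoted
    to active. Any replica slot lost in the process is left as None until
    a rebalance recreates it.
    """
    new_map = []
    for entry in vbucket_map:
        active, replica = entry["active"], entry["replica"]
        if active == failed_node:
            active, replica = replica, None  # promote replica, lose the spare
        elif replica == failed_node:
            replica = None  # active copy is fine, but its replica is gone
        new_map.append({"active": active, "replica": replica})
    return new_map


before = build_vbucket_map(NODES)
after = fail_over(before, "node1")
missing_replicas = sum(1 for e in after if e["replica"] is None)
```

After fail_over, every vBucket still has an active copy on a surviving node, so the data set is 100% available again. The missing_replicas count is the work left for the rebalance.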

As for the ephemeral port exhaustion, how are you managing your connections to the cluster? Are you reusing connections, or opening new connections each time and not closing them? You want to be opening connections and reusing them, not opening new ones over and over. If you open new ones each time and do not clean up, you will exhaust ports and file descriptors. As I said, reuse them.
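A minimal sketch of the "open once, reuse" pattern follows. ClientHolder and FakeClient are hypothetical names invented for this example. In a real application the factory would be whatever call your SDK uses to open its cluster connection.

```python
import threading


class ClientHolder:
    """Lazily create one client per process and hand the same one to every caller."""

    def __init__(self, factory):
        self._factory = factory  # callable that opens a new client/connection
        self._client = None
        self._lock = threading.Lock()

    def get(self):
        # Open the connection on first use only, then keep reusing it.
        with self._lock:
            if self._client is None:
                self._client = self._factory()
            return self._client

    def close(self):
        # Release sockets and file descriptors at shutdown.
        with self._lock:
            if self._client is not None:
                self._client.close()
                self._client = None


class FakeClient:
    """Stand-in for a real SDK client; it only counts how many were opened."""

    opened = 0

    def __init__(self):
        FakeClient.opened += 1
        self.closed = False

    def close(self):
        self.closed = True
```

Each request handler calls holder.get() instead of opening its own connection, so the number of open sockets stays constant no matter how many requests are served.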
