i'm trying move away sqs rabbitmq messaging service. i'm looking build stable high availability queuing service. i'm going cluster.
current implementation , have 3 ec2 machines rabbitmq management plugin installed in ami , , explicitly go each of machine , add
sudo rabbitmqctl join_cluster rabbit@<hostnameofparentmachine> with ha property set , synchronization works. , load balancer on top dns assigned. far thing works.
expected implementation: create autoscaling clustered environment machines go up/down has join/remove cluster dynamically. best way achieve this? please help.
i had similar configuration 2 years ago.
i decided use amazon vpc, default design had 2 rabbitmq instances running, , configured in cluster (called master-nodes). rabbitmq cluster behind internal amazon load balancer.
i created ami rabbitmq , management plug-in configured (called “master-ami”), , configured autoscaling rules.
if autoscaling alarm raised new master-ami launched. ami executes follow script first time executed:
#!/usr/bin/env python import json import urllib2,base64 if __name__ == '__main__': prefix ='' subprocess import call call(["rabbitmqctl", "stop_app"]) call(["rabbitmqctl", "reset"]) try: _url = 'http://internal-myloadbalamcer-xxx.com:15672/api/nodes' print prefix + 'get json info ..' + _url request = urllib2.request(_url) base64string = base64.encodestring('%s:%s' % ('guest', 'guest')).replace('\n', '') request.add_header("authorization", "basic %s" % base64string) data = json.load(urllib2.urlopen(request)) ##if script got error here can assume it's first machine , ## exit without controll error. remember add new machine balancer print prefix + 'request ok... finding running node' r in data: if r.get('running'): print prefix + 'found running node bind..' print prefix + 'node name: '+ r.get('name') +'- running:' + str(r.get('running')) subprocess import call call(["rabbitmqctl", "join_cluster",r.get('name')]) break; pass except exception, e: print prefix + 'error during add node' finally: subprocess import call call(["rabbitmqctl", "start_app"]) pass the scripts uses http api “http://internal-myloadbalamcer-xxx.com:15672/api/nodes” find nodes, choose 1 , binds new ami cluster.
as ha policy decided use this:
rabbitmqctl set_policy ha-two "^two\." ^ "{""ha-mode"":""exactly"",""ha-params"":2,"ha-sync-mode":"automatic"}" well, join “quite” easy, problem decide when can remove node cluster.
you can’t remove node based on autoscaling rule, because can have messages queues have consume.
i decided execute script periodically running 2 master-node instances that:
- checks messages count through api http://node:15672/api/queues
- if messages count queue zero, can remove instance load balancer , rabbitmq cluster.
this broadly did, hope helps.
Comments
Post a Comment