amazon web services - How to set up autoscaling RabbitMQ Cluster AWS -


i'm trying move away sqs rabbitmq messaging service. i'm looking build stable high availability queuing service. i'm going cluster.

current implementation , have 3 ec2 machines rabbitmq management plugin installed in ami , , explicitly go each of machine , add

sudo rabbitmqctl join_cluster rabbit@<hostnameofparentmachine> 

with ha property set , synchronization works. , load balancer on top dns assigned. far thing works.

expected implementation: create autoscaling clustered environment machines go up/down has join/remove cluster dynamically. best way achieve this? please help.

i had similar configuration 2 years ago.

i decided use amazon vpc, default design had 2 rabbitmq instances running, , configured in cluster (called master-nodes). rabbitmq cluster behind internal amazon load balancer.

i created ami rabbitmq , management plug-in configured (called “master-ami”), , configured autoscaling rules.

if autoscaling alarm raised new master-ami launched. ami executes follow script first time executed:

#!/usr/bin/env python import json import urllib2,base64  if __name__ == '__main__':     prefix =''     subprocess import call     call(["rabbitmqctl", "stop_app"])     call(["rabbitmqctl", "reset"])     try:         _url = 'http://internal-myloadbalamcer-xxx.com:15672/api/nodes'         print prefix + 'get json info ..' + _url         request = urllib2.request(_url)          base64string = base64.encodestring('%s:%s' % ('guest', 'guest')).replace('\n', '')         request.add_header("authorization", "basic %s" % base64string)         data = json.load(urllib2.urlopen(request))         ##if script got error here can assume it's first machine ,          ## exit without controll error. remember add new machine balancer         print prefix + 'request ok... finding running node'           r in data:             if r.get('running'):                 print prefix + 'found running node bind..'                 print prefix + 'node name: '+ r.get('name') +'- running:' + str(r.get('running'))                 subprocess import call                 call(["rabbitmqctl", "join_cluster",r.get('name')])                 break;         pass     except exception, e:         print prefix + 'error during add node'     finally:         subprocess import call         call(["rabbitmqctl", "start_app"])       pass 

the scripts uses http api “http://internal-myloadbalamcer-xxx.com:15672/api/nodes” find nodes, choose 1 , binds new ami cluster.

as ha policy decided use this:

rabbitmqctl set_policy ha-two "^two\." ^    "{""ha-mode"":""exactly"",""ha-params"":2,"ha-sync-mode":"automatic"}" 

well, join “quite” easy, problem decide when can remove node cluster.

you can’t remove node based on autoscaling rule, because can have messages queues have consume.

i decided execute script periodically running 2 master-node instances that:

  • checks messages count through api http://node:15672/api/queues
  • if messages count queue zero, can remove instance load balancer , rabbitmq cluster.

this broadly did, hope helps.


Comments