admin (Administrator)
Posted 2015-11-03 17:59

corosync + pacemaker + OpenStack Icehouse + RabbitMQ server cluster: adding a RabbitMQ node back into the cluster

Note: this applies to Corosync 2.3.3 + Pacemaker + RabbitMQ cluster + OpenStack Icehouse environments.
When creating a virtual machine (or some other operation) fails in the OpenStack environment, and the logs show the cause is a RabbitMQ connection timeout, the following steps may resolve the problem:
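Before touching the cluster, it is worth confirming that the failures really are AMQP timeouts. A minimal log check (the log paths and message patterns below are typical oslo.messaging errors, not guaranteed strings; adjust for your deployment):

# Search a controller's service logs for AMQP connectivity errors
grep -iE 'AMQP server .* unreachable|Timed out waiting for a reply' \
    /var/log/nova/*.log /var/log/neutron/*.log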
1. Confirm the RabbitMQ status. Connect to any controller node:
1) Run the following command to view the Pacemaker resource status, and confirm that every clone set's Started: list in the output below includes all of the controller nodes (a scripted check is sketched after the output):
[root@node-3 ~](controller)# pcs resource
vip__public    (ocf::mirantis:ns_IPaddr2):    Started
Clone Set: clone_ping_vip__public [ping_vip__public]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
vip__management    (ocf::mirantis:ns_IPaddr2):    Started
Clone Set: clone_p_openstack-heat-engine [p_openstack-heat-engine]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
p_openstack-ceilometer-central    (ocf::mirantis:ceilometer-agent-central):    Started
p_openstack-ceilometer-alarm-evaluator    (ocf::mirantis:ceilometer-alarm-evaluator):    Started
Clone Set: clone_p_neutron-openvswitch-agent [p_neutron-openvswitch-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
p_neutron-dhcp-agent    (ocf::mirantis:neutron-agent-dhcp):    Started
Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_neutron-l3-agent [p_neutron-l3-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_mysql [p_mysql]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_rabbitmq-server [p_rabbitmq-server]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_haproxy [p_haproxy]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
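Eyeballing every Started: line is error-prone on a larger cluster. As mentioned above, here is a quick scripted check (the node names match this example environment; substitute your own):

# Warn if any expected controller is missing from any clone set's Started: list
for n in node-1.abc.com node-2.abc.com node-3.abc.com; do
    pcs resource | grep 'Started:' | grep -vq "$n" \
        && echo "WARNING: $n missing from at least one clone set"
done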
2) If the above looks normal, log in to each controller node in turn and run the following command to check the RabbitMQ cluster status. Output similar to the following indicates the RabbitMQ cluster has a problem:
[root@node-1 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2']}]},
{running_nodes,['rabbit@node-1','rabbit@node-2']},
{cluster_name,<<"rabbit@node-1.abc.com">>},
{partitions,[]}]
...done.
[root@node-2 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-2' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2']}]},
{running_nodes,['rabbit@node-1','rabbit@node-2']},
{cluster_name,<<"rabbit@node-1.abc.com">>},
{partitions,[]}]
...done.
[root@node-3 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-3' ...
[{nodes,[{disc,['rabbit@node-3']}]},
{running_nodes,['rabbit@node-3']},
{cluster_name,<<"rabbit@node-3.abc.com">>},
{partitions,[]}]
...done.
Output like the above means two RabbitMQ clusters have appeared in the environment, {cluster_name,<<"rabbit@node-1.abc.com">>} and {cluster_name,<<"rabbit@node-3.abc.com">>}.
Put another way, node-3 has not joined the {cluster_name,<<"rabbit@node-1.abc.com">>} cluster.
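To detect this split without comparing three outputs by hand, one option is to collect cluster_name from every controller. A rough sketch, assuming root SSH access between controllers (which your environment may not allow):

# Print the distinct cluster_name values across all controllers;
# more than one line of output means the brokers have split
for h in node-1.abc.com node-2.abc.com node-3.abc.com; do
    ssh "$h" rabbitmqctl cluster_status | grep cluster_name
done | sort -u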
2. Fix the problem
From the analysis above, we only need to get node-3 to join the {cluster_name,<<"rabbit@node-1.abc.com">>} cluster. Log in to any controller node (node-3 in this example) and run the following commands:
[root@node-3 ~](controller)# pcs resource ban p_rabbitmq-server node-3.abc.com
[root@node-3 ~](controller)# pcs resource clear p_rabbitmq-server node-3.abc.com
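What these commands do: pcs resource ban places a -INFINITY location constraint that stops p_rabbitmq-server on node-3, and pcs resource clear removes that constraint so Pacemaker restarts the resource there; on start, the ocf::mirantis agent should rejoin the node to the running cluster (it may help to wait until pcs status shows the resource stopped on node-3 before running clear). If you ever need to rejoin a node by hand instead, the standard rabbitmqctl sequence, run on node-3, is sketched below; note this bypasses Pacemaker, so the resource should be unmanaged while you do it:

# Manual rejoin, run on node-3 (alternative to the ban/clear above)
rabbitmqctl stop_app
# rabbitmqctl reset        # only if node-3 holds stale cluster state
rabbitmqctl join_cluster rabbit@node-1
rabbitmqctl start_app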
3. Check the status again and confirm the problem is resolved
[root@node-3 ~](controller)# pcs resource
vip__public    (ocf::mirantis:ns_IPaddr2):    Started
Clone Set: clone_ping_vip__public [ping_vip__public]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
vip__management    (ocf::mirantis:ns_IPaddr2):    Started
Clone Set: clone_p_openstack-heat-engine [p_openstack-heat-engine]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
p_openstack-ceilometer-central    (ocf::mirantis:ceilometer-agent-central):    Started
p_openstack-ceilometer-alarm-evaluator    (ocf::mirantis:ceilometer-alarm-evaluator):    Started
Clone Set: clone_p_neutron-openvswitch-agent [p_neutron-openvswitch-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
p_neutron-dhcp-agent    (ocf::mirantis:neutron-agent-dhcp):    Started
Clone Set: clone_p_neutron-metadata-agent [p_neutron-metadata-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_neutron-l3-agent [p_neutron-l3-agent]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_mysql [p_mysql]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_rabbitmq-server [p_rabbitmq-server]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]
Clone Set: clone_p_haproxy [p_haproxy]
     Started: [ node-1.abc.com node-2.abc.com node-3.abc.com ]

[root@node-1 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
{running_nodes,['rabbit@node-3','rabbit@node-2','rabbit@node-1']},
{cluster_name,<<"rabbit@node-1.abc.com">>},
{partitions,[]}]
...done.
[root@node-2 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-2' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
{running_nodes,['rabbit@node-3','rabbit@node-1','rabbit@node-2']},
{cluster_name,<<"rabbit@node-1.abc.com">>},
{partitions,[]}]
...done.
[root@node-3 ~](controller)# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-3' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-2','rabbit@node-3']}]},
{running_nodes,['rabbit@node-1','rabbit@node-2','rabbit@node-3']},
{cluster_name,<<"rabbit@node-1.abc.com">>},
{partitions,[]}]
...done.
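If node-3 does not show up in running_nodes right away, give it a minute: the restart triggered by ban/clear is not instantaneous, and p_rabbitmq-server may show as Stopped on node-3 briefly before it rejoins. One way to watch the cluster converge:

# Re-check the member list every 5 seconds until all three nodes appear
watch -n 5 'rabbitmqctl cluster_status | grep running_nodes'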