Hi there,
I'm working on setting up ejabberd clustering, and here are the steps I followed:
1. Install ejabberd on the master and the slaves.
2. Configure the master Erlang node name as ejabberd@ejabberddemo.localdomain
3. Configure the slave Erlang node name as ejabberd@ejabberddemo01.localdomain
4. Copy the data from the primary:
cd /opt/ejabberd-15.11/
/opt/ejabberd-15.11/bin/ejabberdctl stop
sleep 10
erl -noshell +v -name ejabberd@ejabberddemo01.localdomain \
-mnesia extra_db_nodes "['ejabberd@ejabberddemo.localdomain']" \
-s mnesia -setcookie $COOKIE \
-eval "mnesia:change_table_copy_type(schema, node(), disc_copies)" \
-eval "mnesia:add_table_copy(acl, node(), disc_copies)" \
-eval "mnesia:add_table_copy(captcha, node(), ram_copies)" \
-eval "mnesia:add_table_copy(config, node(), disc_copies)" \
-eval "mnesia:add_table_copy(http_bind, node(), ram_copies)" \
-eval "mnesia:add_table_copy(iq_response, node(), ram_copies)" \
-eval "mnesia:add_table_copy(local_config, node(), disc_copies)" \
-eval "mnesia:add_table_copy(mod_register_ip, node(), ram_copies)" \
-eval "mnesia:add_table_copy(motd, node(), disc_copies)" \
-eval "mnesia:add_table_copy(motd_users, node(), disc_copies)" \
-eval "mnesia:add_table_copy(muc_online_room, node(), ram_copies)" \
-eval "mnesia:add_table_copy(muc_registered, node(), disc_copies)" \
-eval "mnesia:add_table_copy(muc_room, node(), disc_copies)" \
-eval "mnesia:add_table_copy(route, node(), ram_copies)" \
-eval "mnesia:add_table_copy(s2s, node(), ram_copies)" \
-eval "mnesia:add_table_copy(session, node(), ram_copies)" \
-eval "mnesia:add_table_copy(session_counter, node(), ram_copies)" \
-eval "mnesia:add_table_copy(temporarily_blocked, node(), ram_copies)" \
-eval "init:stop()"
rm -f /opt/ejabberd-15.11/database/$LOCAL_NODE/*
mv /opt/ejabberd-15.11/Mnesia.$LOCAL_NODE/* /opt/ejabberd-15.11/database/$LOCAL_NODE/
5. Start the slave node.
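For anyone reproducing this, the long run of -eval arguments in step 4 could be generated from a list instead of typed by hand, which makes it easier to spot a missing or mistyped table. This is only a rough sketch: the build_evals helper and the TABLES variable are names made up for illustration, and only three of the tables from step 4 are listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of ejabberd): print one Mnesia expression
# per "table:copy_type" pair given as an argument.
# Table names and copy types below are taken from the command in step 4.
TABLES="acl:disc_copies captcha:ram_copies config:disc_copies"

build_evals() {
    for entry in "$@"; do
        table=${entry%%:*}   # text before the first colon
        type=${entry#*:}     # text after the first colon
        printf '%s\n' "mnesia:add_table_copy($table, node(), $type)"
    done
}

# Print the expressions; each printed line is what would follow -eval
# in the erl command from step 4.
build_evals $TABLES
```

Each printed line corresponds to one -eval argument of the erl command, so the output can be compared directly against the command in step 4.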
Here comes the weird behaviour: when the slave node starts, the master node starts throwing errors like
[error] <0.1062.0>@ejabberd_s2s:route:97 {badarg,[{ets,lookup,[local_config,{route_subdomains,<<"ejabberddemo.localdomain">>}],[]},{ejabberd_config,get_option,3,[{file,"src/ejabberd_config.erl"},{line,749}]},{ejabberd_s2s,is_service,2,[{file,"src/ejabberd_s2s.erl"},{line,461}]},{ejabberd_s2s,find_connection,2,[{file,"src/ejabberd_s2s.erl"},{line,351}]},{ejabberd_s2s,do_route,3,[{file,"src/ejabberd_s2s.erl"},{line,308}]},{ejabberd_s2s,route,3,[{file,"src/ejabberd_s2s.erl"},{line,95}]},{ejabberd_router,route,3,[{file,"src/ejabberd_router.erl"},{line,75}]},{ejabberd_c2s,check_privacy_route,5,[{file,"src/ejabberd_c2s.erl"},{line,2130}]}]}
when processing: {{jid,<<"aze_admin">>,<<"ejabberddemo.localdomain">>,<<"exmpp#1462956344339266">>,<<"genius_admin">>,<<"ejabberddemo.localdomain">>,<<"exmpp#1462956344339266">>},{jid,<<>>,<<"conference.xmpp.ejabberddemo.localdomain">>,<<>>,<<>>,<<"conference.xmpp.ejabberddemo.localdomain">>,<<>>},{xmlel,<<"iq">>,[{<<"type">>,<<"get">>},{<<"to">>,<<"conference.xmpp.ejabberddemo.localdomain">>},{<<"id">>,<<"iq-1905181425">>}],[{xmlel,<<"query">>,[{<<"xmlns">>,<<"http://jabber.org/protocol/disco#items">>}],[]}]}
Meanwhile, the master's web admin console is unavailable, but in the slave's console you can see that the primary node is running, and mnesia info on both nodes shows two running db nodes. When I tried to bring a 3rd node into the cluster, after copying the data from the primary and starting the 3rd node, both of the existing nodes started throwing those errors and their admin consoles went down. However, the 3rd node's console works, just as the previous slave node's did.
Does anyone have any ideas about this?