Environment:
-CentOS 5.4 x86_64
-Two servers ( jabber111 & jabber112 )
-ejabberd 2.1.6, installed from binary
-Verified that I can telnet between the machines, and from outside the firewall to ports 5280 & 5222. Internally I can also telnet to port 4369
-Cookie set and chmod 600 in /root
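The checks in the list above can be scripted roughly as follows. This is a minimal sketch, not part of ejabberd: the helper names port_open and cookie_ok are hypothetical, and it assumes bash (for /dev/tcp) plus a GNU coreutils recent enough to provide timeout and stat -c. The hostnames in the usage comment are placeholders for the real fully qualified names.

```shell
# Hypothetical helpers for repeating the connectivity and cookie checks.

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
port_open() {
    host=$1
    port=$2
    # bash is invoked explicitly because /dev/tcp is a bash feature.
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Return 0 if the cookie file exists and has mode 600.
# Defaults to ~/.erlang.cookie when no path is given.
cookie_ok() {
    file=${1:-$HOME/.erlang.cookie}
    [ -f "$file" ] && [ "$(stat -c %a "$file")" = "600" ]
}

# Example usage (placeholder hostnames, ports taken from the list above):
#   for h in jabber111 jabber112; do
#       for p in 4369 5222 5280; do
#           port_open "$h" "$p" && echo "$h:$p open" || echo "$h:$p CLOSED"
#       done
#   done
#   cookie_ok /root/.erlang.cookie && echo "cookie permissions ok"
```

Running md5sum /root/.erlang.cookie on both machines and comparing the output is a quick way to confirm that the cookie contents match, since Erlang nodes with different cookies refuse to connect to each other.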
I start both instances, and they work independently, but the database is not syncing. I have kind of been basing my install on the guide posted
To start ejabberd I am running the following command /opt/ejabberd-2.1.6/bin/ejabberdctl on a vanilla clustered install. The ps output is below, which may answer any environment questions specific to ejabberd:
root 5951 0.0 0.8 197632 51716 pts/0 Sl 19:19 0:01 /opt/ejabberd-2.1.6/bin/beam.smp -K true -P 250000 -- -root /opt/ejabberd-2.1.6 -progname /opt/ejabberd-2.1.6/bin/erl -- -home /root -name ejabberd@jabber112.orl.___.net -smp auto -noshell -noinput -noshell -noinput -mnesia dir "/opt/ejabberd-2.1.6/database/ejabberd@jabber112.orl.____.net" -s ejabberd -ejabberd config "/opt/ejabberd-2.1.6/conf/ejabberd.cfg" log_path "/opt/ejabberd-2.1.6/logs/ejabberd.log" -sasl sasl_error_logger {file,"/opt/ejabberd-2.1.6/logs/erlang.log"}
However, I cannot get both instances to start after syncing the databases.
If I start one, I get this in erlang.log:
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.246.0>},
{name,ejabberd_sm},
{mfa,{ejabberd_sm,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.253.0>},
{name,ejabberd_s2s},
{mfa,{ejabberd_s2s,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.256.0>},
{name,ejabberd_local},
{mfa,{ejabberd_local,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.259.0>},
{name,ejabberd_captcha},
{mfa,{ejabberd_captcha,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.262.0>},
{name,ejabberd_receiver_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_receiver_sup,ejabberd_receiver]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.263.0>},
{name,ejabberd_c2s_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_c2s_sup,ejabberd_c2s]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.264.0>},
{name,ejabberd_s2s_in_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_s2s_in_sup,ejabberd_s2s_in]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.265.0>},
{name,ejabberd_s2s_out_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_s2s_out_sup,ejabberd_s2s_out]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.266.0>},
{name,ejabberd_service_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_service_sup,ejabberd_service]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.267.0>},
{name,ejabberd_http_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_http_sup,ejabberd_http]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.268.0>},
{name,ejabberd_http_poll_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_http_poll_sup,ejabberd_http_poll]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.269.0>},
{name,ejabberd_iq_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_iq_sup,gen_iq_handler]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.270.0>},
{name,ejabberd_stun_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_stun_sup,ejabberd_stun]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.271.0>},
{name,ejabberd_frontend_socket_sup},
{mfa,
{ejabberd_tmp_sup,start_link,
[ejabberd_frontend_socket_sup,
ejabberd_frontend_socket]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.272.0>},
{name,cache_tab_sup},
{mfa,{cache_tab_sup,start_link,[]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.273.0>},
{name,ejabberd_listener},
{mfa,{ejabberd_listener,start_link,[]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.284.0>},
{name,{caps_features,cache_tab_caps_features_1}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_1,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.285.0>},
{name,{caps_features,cache_tab_caps_features_2}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_2,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.286.0>},
{name,{caps_features,cache_tab_caps_features_3}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_3,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.287.0>},
{name,{caps_features,cache_tab_caps_features_4}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_4,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.288.0>},
{name,{caps_features,cache_tab_caps_features_5}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_5,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.289.0>},
{name,{caps_features,cache_tab_caps_features_6}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_6,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.290.0>},
{name,{caps_features,cache_tab_caps_features_7}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_7,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,cache_tab_sup}
started: [{pid,<0.291.0>},
{name,{caps_features,cache_tab_caps_features_8}},
{mfa,
{cache_tab,start_link,
[cache_tab_caps_features_8,caps_features,
[{max_size,1000},{life_time,86400}],
<0.281.0>]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.281.0>},
{name,'ejabberd_mod_caps_xmpp._____.com'},
{mfa,{mod_caps,start_link,["xmpp._____.com",[]]}},
{restart_type,transient},
{shutdown,1000},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.297.0>},
{name,'ejabberd_mod_http_bind_xmpp._____.com'},
{mfa,
{ejabberd_tmp_sup,start_link,
['ejabberd_mod_http_bind_xmpp._____.com',
ejabberd_http_bind]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.298.0>},
{name,'ejabberd_mod_irc_sup_xmpp._____.com'},
{mfa,
{ejabberd_tmp_sup,start_link,
['ejabberd_mod_irc_sup_xmpp._____.com',
mod_irc_connection]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.299.0>},
{name,'ejabberd_mod_irc_xmpp._____.com'},
{mfa,{mod_irc,start_link,["xmpp._____.com",[]]}},
{restart_type,temporary},
{shutdown,1000},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.305.0>},
{name,'ejabberd_mod_muc_sup_xmpp._____.com'},
{mfa,
{ejabberd_tmp_sup,start_link,
['ejabberd_mod_muc_sup_xmpp._____.com',
mod_muc_room]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.306.0>},
{name,'ejabberd_mod_muc_xmpp._____.com'},
{mfa,{mod_muc,start_link,
["xmpp._____.com",
[{access,muc},
{access_create,muc_create},
{access_persistent,muc_create},
{access_admin,muc_admin}]]}},
{restart_type,temporary},
{shutdown,1000},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_sup}
started: [{pid,<0.318.0>},
{name,'ejabberd_mod_pubsub_xmpp._____.com'},
{mfa,
{mod_pubsub,start_link,
["xmpp._____.com",
[{access_createnode,pubsub_createnode},
{ignore_pep_from_offline,true},
{last_item_cache,false},
{plugins,["flat","hometree","pep"]}]]}},
{restart_type,transient},
{shutdown,1000},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_listeners}
started: [{pid,<0.370.0>},
{name,{5222,{0,0,0,0},tcp}},
{mfa,
{ejabberd_listener,start,
[{5222,{0,0,0,0},tcp},
ejabberd_c2s,
[{certfile,
"/opt/ejabberd-2.1.6/conf/server.pem"},
starttls,
{access,c2s},
{shaper,c2s_shaper},
{max_stanza_size,65536}]]}},
{restart_type,transient},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_listeners}
started: [{pid,<0.371.0>},
{name,{5269,{0,0,0,0},tcp}},
{mfa,
{ejabberd_listener,start,
[{5269,{0,0,0,0},tcp},
ejabberd_s2s_in,
[{shaper,s2s_shaper},
{max_stanza_size,131072}]]}},
{restart_type,transient},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
supervisor: {local,ejabberd_listeners}
started: [{pid,<0.372.0>},
{name,{5280,{0,0,0,0},tcp}},
{mfa,
{ejabberd_listener,start,
[{5280,{0,0,0,0},tcp},
ejabberd_http,
[captcha,http_bind,http_poll,web_admin]]}},
{restart_type,transient},
{shutdown,brutal_kill},
{child_type,worker}]
=PROGRESS REPORT==== 15-Apr-2011::20:20:22 ===
application: ejabberd
started_at: 'ejabberd@ejabber112.orl._____.net'
And when I go to start the failing node I get the following:
tail: /opt/ejabberd-2.1.6/logs/erlang.log: file truncated
=SUPERVISOR REPORT==== 15-Apr-2011::20:19:02 ===
Supervisor: {local,mnesia_sup}
Context: start_error
Reason: killed
Offender: [{pid,undefined},
{name,mnesia_kernel_sup},
{mfa,{mnesia_kernel_sup,start,[]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=CRASH REPORT==== 15-Apr-2011::20:19:02 ===
crasher:
pid: <0.65.0>
registered_name: mnesia_recover
exception exit: killed
in function gen_server:terminate/6
initial call: gen:init_it(gen_server,<0.61.0>,<0.61.0>,
{local,mnesia_recover},
mnesia_recover,
[<0.61.0>],
[{timeout,infinity}])
ancestors: [mnesia_kernel_sup,mnesia_sup,<0.58.0>]
messages: []
links: [<0.89.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 2584
stack_size: 23
reductions: 3394
neighbours:
=CRASH REPORT==== 15-Apr-2011::20:19:02 ===
crasher:
pid: <0.57.0>
registered_name: []
exception exit: {shutdown,{mnesia_sup,start,[normal,[]]}}
in function application_master:init/4
initial call: application_master:init(<0.5.0>,<0.56.0>,
{appl_data,mnesia,
[mnesia_dumper_load_regulator,
mnesia_event,mnesia_fallback,
mnesia_controller,
mnesia_kernel_sup,
mnesia_late_loader,mnesia_locker,
mnesia_monitor,mnesia_recover,
mnesia_substr,mnesia_sup,
mnesia_tm],
undefined,
{mnesia_sup,[]},
[mnesia,mnesia_backup,mnesia_bup,
mnesia_checkpoint,
mnesia_checkpoint_sup,
mnesia_controller,mnesia_dumper,
mnesia_event,mnesia_frag,
mnesia_frag_hash,
mnesia_frag_old_hash,mnesia_index,
mnesia_kernel_sup,
mnesia_late_loader,mnesia_lib,
mnesia_loader,mnesia_locker,
mnesia_log,mnesia_monitor,
mnesia_recover,mnesia_registry,
mnesia_schema,mnesia_snmp_hook,
mnesia_snmp_sup,mnesia_subscr,
mnesia_sup,mnesia_sp,mnesia_text,
mnesia_tm],
[],infinity,infinity},
normal)
ancestors: [<0.56.0>]
messages: [{'EXIT',<0.58.0>,normal}]
links: [<0.56.0>,<0.5.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 23
reductions: 117
neighbours:
It doesn't matter what order I start the nodes in; I can only get the first one to start. In the admin panel I see one node listed as "started" and the second as "stopped", so it would seem that they are aware of each other...
Now here is the real kicker... I can get them both to show up if I run the following command:
erl -name
However, as soon as I exit, the admin panel reports the second instance as down again. If I try to sync the database from the Erlang console, I get:
Erlang (BEAM) emulator version 5.6.5 [source] [64-bit] [smp:8] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.6.5 (abort with ^G)
(ejabberd@jabber111.orl.______.net )1> mnesia:change_table_copy_type(schema, node(), disc_copies).
{aborted,{already_exists,schema,
'ejabberd@jabber111.orl._____.net',disc_copies}}
Any suggestions? I have been racking my brain on this for two days, and am at my wits' end!@# :)
thanks!
John