ejabberd not starting after filesystem issue

Hello,

First of all, my system specifications:
- Linux 2.6 Kernel
- Debian Sid Packages of ejabberd

Recently my server had a filesystem issue.
Since then, ejabberd no longer starts.

The only processes I can see listening on a socket are:
0.0.0.0:4369 (TCP) - epmd
0.0.0.0:XXXXX (TCP) - beam

Here is the crash report (/var/log/ejabberd/sasl.log):

=CRASH REPORT==== 31-Oct-2009::12:06:12 ===
  crasher:
    pid: <0.35.0>
    registered_name: []
    exception exit: {bad_return,
                        {{ejabberd_app,start,[normal,[]]},
                         {'EXIT',
                             {aborted,
                                 {node_not_running,ejabberd@server42}}}}}
      in function  application_master:init/4
    initial call: application_master:init(<0.5.0>,<0.34.0>,
                                          {appl_data,ejabberd,
                                           [ejabberd,ejabberd_sup,
                                            ejabberd_auth,ejabberd_router,
                                            ejabberd_sm,ejabberd_s2s,
                                            ejabberd_local,ejabberd_listeners,
                                            ejabberd_iq_sup,
                                            ejabberd_service_sup,
                                            ejabberd_s2s_out_sup,
                                            ejabberd_s2s_in_sup,
                                            ejabberd_c2s_sup,
                                            ejabberd_mod_roster,
                                            ejabberd_mod_echo,
                                            ejabberd_mod_pubsub,
                                            ejabberd_mod_irc,ejabberd_mod_muc,
                                            ejabberd_offline,random_generator],
                                           undefined,
                                           {ejabberd_app,[]},
                                           [acl,adhoc,configure,
                                            cyrsasl_anonymous,cyrsasl,
                                            cyrsasl_digest,cyrsasl_plain,
                                            ejabberd_admin,ejabberd_app,
                                            ejabberd_auth_anonymous,
                                            ejabberd_auth,
                                            ejabberd_auth_external,
                                            ejabberd_auth_internal,
                                            ejabberd_auth_ldap,
                                            ejabberd_auth_odbc,
                                            ejabberd_auth_pam,ejabberd,
                                            ejabberd_c2s,ejabberd_c2s_config,
                                            ejabberd_config,ejabberd_ctl,
                                            ejabberd_frontend_socket,
                                            ejabberd_hooks,ejabberd_http,
                                            ejabberd_http_bind,
                                            ejabberd_http_poll,
                                            ejabberd_listener,ejabberd_local,
                                            ejabberd_logger_h,
                                            ejabberd_loglevel,
                                            ejabberd_node_groups,
                                            ejabberd_rdbms,ejabberd_receiver,
                                            ejabberd_router,ejabberd_s2s,
                                            ejabberd_s2s_in,ejabberd_s2s_out,
                                            ejabberd_service,ejabberd_sm,
                                            ejabberd_socket,ejabberd_sup,
                                            ejabberd_system_monitor,
                                            ejabberd_tmp_sup,ejabberd_update,
                                            ejabberd_web_admin,ejabberd_web,
                                            ejabberd_zlib,ejd2odbc,eldap,
                                            eldap_filter,eldap_pool,
                                            eldap_utils,'ELDAPv3',extauth,
                                            gen_iq_handler,gen_mod,
                                            gen_pubsub_node,
                                            gen_pubsub_nodetree,iconv,idna,
                                            jd2ejd,jlib,mod_adhoc,
                                            mod_announce,mod_caps,
                                            mod_configure2,mod_configure,
                                            mod_ctlextra,mod_disco,mod_echo,
                                            mod_http_bind,mod_http_fileserver,
                                            mod_irc,mod_irc_connection,
                                            mod_last,mod_last_odbc,mod_muc,
                                            mod_muc_log,mod_muc_room,
                                            mod_offline,mod_offline_odbc,
                                            mod_privacy,mod_privacy_odbc,
                                            mod_private,mod_private_odbc,
                                            mod_proxy65,mod_proxy65_lib,
                                            mod_proxy65_service,
                                            mod_proxy65_sm,mod_proxy65_stream,
                                            mod_pubsub,mod_register,
                                            mod_roster,mod_roster_odbc,
                                            mod_service_log,mod_shared_roster,
                                            mod_stats,mod_time,mod_vcard,
                                            mod_vcard_ldap,mod_vcard_odbc,
                                            mod_version,node_buddy,node_club,
                                            node_default,node_dispatch,
                                            node_pep,node_private,node_public,
                                            nodetree_default,nodetree_virtual,
                                            p1_fsm,p1_mnesia,
                                            ram_file_io_server,randoms,sha,
                                            shaper,stringprep,stringprep_sup,
                                            tls,translate,xml,xml_stream,
                                            'XmppAddr'],
                                           [],infinity,infinity},
                                          normal)
    ancestors: [<0.34.0>]
    messages: [{'EXIT',<0.36.0>,normal}]
    links: [<0.34.0>,<0.5.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 987
    stack_size: 23
    reductions: 114
  neighbours:

ejabberd.log hasn't been updated since the crash.

And this is mnesia_lib:view("file").

***** logfile *****
----- logfile: "/var/lib/ejabberd/PREVIOUS.LOG" -----
*****  "/tmp/mnesia_vcore_elem.TMP" *****

=ERROR REPORT==== 31-Oct-2009::12:14:39 ===
Mnesia(nonode@nohost): ** ERROR ** Cannot open log "/tmp/mnesia_vcore_elem.TMP": {not_a_log_file,
                                                                                  "/tmp/mnesia_vcore_elem.TMP"}
----- logfile: "/var/lib/ejabberd/LATEST.LOG" -----
*****  "/tmp/mnesia_vcore_elem.TMP" *****
{log_header,trans_log,"4.3","4.4.11",ejabberd@panoptikum,
            {1256,569932,104883}}
----- logfile: "/var/lib/ejabberd/DECISION_TAB.LOG" -----
*****  "/tmp/mnesia_vcore_elem.TMP" *****
{log_header,dcl_log,"1.0","4.4.11",ejabberd@panoptikum,
            {1256,569572,105492}}

Could the solution be to touch /tmp/mnesia_vcore_elem.TMP?

Not fixed yet.

Hello,

Touching /tmp/mnesia_vcore_elem.TMP did not fix the problem.

Still no solution

Hmm, the problem still exists. Any more suggestions?

More verbose output from su - ejabberd -c /usr/sbin/ejabberd

{error_logger,{{2009,11,11},{17,17,5}},"Protocol: ~p: register error: ~p~n",["inet_tcp",{{badmatch,{error,duplicate_name}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p,5}]}]}
{error_logger,{{2009,11,11},{17,17,5}},crash_report,[[{pid,<0.20.0>},{registered_name,net_kernel},{error_info,{exit,{error,badarg},[{gen_server,init_it,6},{proc_lib,init_p,5}]}},{initial_call,{gen,init_it,[gen_server,<0.17.0>,<0.17.0>,{local,net_kernel},net_kernel,{ejabberd,shortnames,15000},[]]}},{ancestors,[net_sup,kernel_sup,<0.8.0>]},{messages,[]},{links,[#Port<0.7>,<0.17.0>]},{dictionary,[{longnames,false}]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,23},{reductions,453}],[]]}
{error_logger,{{2009,11,11},{17,17,5}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfa,{net_kernel,start_link,[[ejabberd,shortnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2009,11,11},{17,17,5}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfa,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2009,11,11},{17,17,5}},crash_report,[[{pid,<0.7.0>},{registered_name,[]},{error_info,{exit,{shutdown,{kernel,start,[normal,[]]}},[{application_master,init,4},{proc_lib,init_p,5}]}},{initial_call,{application_master,init,[<0.5.0>,<0.6.0>,{appl_data,kernel,[application_controller,erl_reply,auth,boot_server,code_server,disk_log_server,disk_log_sup,erl_prim_loader,error_logger,file_server_2,fixtable_server,global_group,global_name_server,heart,init,kernel_config,kernel_sup,net_kernel,net_sup,rex,user,os_server,ddll_server,erl_epmd,inet_db,pg2],undefined,{kernel,[]},[application,application_controller,application_master,application_starter,auth,code,code_aux,packages,code_server,dist_util,erl_boot_server,erl_distribution,erl_prim_loader,erl_reply,erlang,error_handler,error_logger,file,file_server,file_io_server,prim_file,global,global_group,global_search,group,heart,hipe_unified_loader,inet6_tcp,inet6_tcp_dist,inet6_udp,inet_config,inet_hosts,inet_gethost_native,inet_tcp_dist,init,kernel,kernel_config,net,net_adm,net_kernel,os,ram_file,rpc,user,user_drv,user_sup,disk_log,disk_log_1,disk_log_server,disk_log_sup,dist_ac,erl_ddll,erl_epmd,erts_debug,gen_tcp,gen_udp,gen_sctp,prim_inet,inet,inet_db,inet_dns,inet_parse,inet_res,inet_tcp,inet_udp,inet_sctp,pg2,seq_trace,wrap_log_reader,zlib,otp_ring0],[],infinity,infinity},normal]}},{ancestors,[<0.6.0>]},{messages,[{'EXIT',<0.8.0>,normal}]},{links,[<0.6.0>,<0.5.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,610},{stack_size,23},{reductions,127}],[]]}
{error_logger,{{2009,11,11},{17,17,5}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}

Crash dump was written to: /var/log/ejabberd/erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})

Second run

p:~# su - ejabberd -c /usr/sbin/ejabberd
Erlang (BEAM) emulator version 5.6.3 [source] [async-threads:0] [kernel-poll:false]

Eshell V5.6.3  (abort with ^G)
(ejabberd@p)1>
=ERROR REPORT==== 11-Nov-2009::17:18:40 ===
Mnesia(ejabberd@p): ** ERROR ** (core dumped to file: "/var/lib/ejabberd/MnesiaCore.ejabberd@p_1257_956320_152964")
** FATAL ** mnesia_recover crashed: {"Bad decision log item",
                                      {log_header,dcl_log,"1.0","4.4.11",
                                       ejabberd@p,
                                       {1256,569572,105492}},
                                      load_decision_tab} state: {state,
                                                                 <0.67.0>,
                                                                 undefined,
                                                                 undefined,
                                                                 undefined,0,
                                                                 false,[]}

=ERROR REPORT==== 11-Nov-2009::17:18:50 ===
** Generic server mnesia_monitor terminating
** Last message in was {'EXIT',<0.67.0>,killed}
** When Server state == {state,<0.67.0>,[],[],false,[],undefined,[]}
** Reason for termination ==
** killed

=ERROR REPORT==== 11-Nov-2009::17:18:50 ===
Mnesia(ejabberd@p): ** ERROR ** mnesia_event got unexpected event: {'EXIT',
                                                                             <0.69.0>,
                                                                             killed}

=INFO REPORT==== 11-Nov-2009::17:18:50 ===
    application: mnesia
    exited: {killed,{mnesia_sup,start,[normal,[]]}}
    type: temporary

=INFO REPORT==== 11-Nov-2009::17:18:50 ===
    application: ejabberd
    exited: {bad_return,
                {{ejabberd_app,start,[normal,[]]},
                 {'EXIT',{aborted,{node_not_running,ejabberd@p}}}}}
    type: temporary

For an explanation of this error message: ["inet_tcp",{{badmatch,{error,duplicate_name}},
see: error, duplicate_name
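
As far as I know, duplicate_name usually means that a node with the same name is still registered with epmd, for example because an old beam process is still running. Here is a minimal sketch for checking that. It assumes that "epmd -names" prints one line of the form "name ejabberd at port N" per registered node; the helper name node_registered is only made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the short node name given as $1
# appears in `epmd -names` output supplied on stdin.
node_registered() {
    grep -q "^name $1 at port "
}

# Usage on the affected host (not executed here):
#   if epmd -names | node_registered ejabberd; then
#       echo "a node named ejabberd is still registered; stop it first"
#   fi
```

If the name is still registered, stopping the leftover beam process before starting ejabberd again should make this particular error go away. It does not repair the Mnesia files.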

The simplest solution is to remove the spool files; when ejabberd starts, it will recreate them, empty. Of course, the problem in this case is that you lose all user accounts. You can then attempt to copy back the files of the tables you consider important (passwd.*, roster.*, ...). Maybe Mnesia accepts those old files and works correctly.

Another idea: maybe the problem is only with the vcard files? In that case, you can try to remove the vcard* files and restart ejabberd. Of course you lose the vCard information, but that's preferable to not having any data at all.
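
A rough sketch of how the spool files could be set aside instead of deleted, so that individual tables can be copied back afterwards. The function names and the backup location are assumptions for illustration; the spool directory in this thread is /var/lib/ejabberd. Stop ejabberd before running anything like this.

```shell
#!/bin/sh
# Hypothetical helper: move the regular files in the spool directory into a
# timestamped backup directory and print that directory's path.
# $1: spool directory, $2: parent directory for the backup.
# Dotfiles (not matched by *) are left in place.
set_aside_spool() {
    spool=$1
    dest=$2/ejabberd-spool-$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest" || return 1
    for f in "$spool"/*; do
        if [ -f "$f" ]; then
            mv "$f" "$dest"/ || return 1
        fi
    done
    echo "$dest"
}

# Hypothetical helper: copy the files of the named tables back.
# $1: backup directory, $2: spool directory, remaining args: table names.
restore_tables() {
    backup=$1
    spool=$2
    shift 2
    for table in "$@"; do
        for f in "$backup/$table".*; do
            if [ -f "$f" ]; then
                cp -p "$f" "$spool"/ || return 1
            fi
        done
    done
}

# Usage (as root, with ejabberd stopped; not executed here):
#   saved=$(set_aside_spool /var/lib/ejabberd /root)
#   ... start ejabberd once so it creates empty tables, then stop it ...
#   restore_tables "$saved" /var/lib/ejabberd passwd roster
```

Because the old files are only moved, the vcard idea can be tried the same way: move just the vcard* files aside and restart.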

Once this is solved, remember to write a script that makes daily, or at least weekly, backups to another machine.
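
Such a backup script could look roughly like the sketch below. It simply archives the spool directory with tar; the function name, the output directory, the remote host and the cron path are all placeholders, not anything from this setup. Note that an archive of the files of a running node may not be consistent, so stopping ejabberd first, or using the backup command of ejabberdctl instead, is probably safer.

```shell
#!/bin/sh
# Hypothetical helper: archive the spool directory into a dated tarball
# and print the archive's path.
# $1: spool directory, $2: directory that receives the archive.
backup_spool() {
    spool=$1
    outdir=$2
    archive=$outdir/ejabberd-$(date +%Y%m%d).tar.gz
    mkdir -p "$outdir" || return 1
    # Store paths relative to the spool's parent directory.
    tar -czf "$archive" -C "$(dirname "$spool")" "$(basename "$spool")" || return 1
    echo "$archive"
}

# Usage (not executed here; host and paths are placeholders):
#   archive=$(backup_spool /var/lib/ejabberd /var/backups/ejabberd)
#   scp "$archive" backup.example.org:/srv/backups/
#
# Example crontab line for a daily run at 03:00:
#   0 3 * * * /usr/local/sbin/ejabberd-backup.sh
```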
