Server crashes

The crashes started today. They happen when a large number of presences are sent to the server (that is the trigger, but the crash does not always occur). Here is a crash log:
=INFO REPORT==== 22-Jan-2007::15:56:50 ===
I(<0.2284.0>:ejabberd_s2s_out:660): terminated: normal
Mnesia(ejabberd@host): Data may be missing, Corrupt logfile deleted: "/var/lib/ejabberd/PREVIOUS.LOG", {file_error, "/var/lib/ejabberd/PREVIOUS.LOG", emfile}
=ERROR REPORT==== 22-Jan-2007::15:56:52 ===
Mnesia(ejabberd@host): ** ERROR ** (could not write core file: emfile)
** FATAL ** Cannot open log file "/var/lib/ejabberd/PREVIOUS.LOG": {file_error, "/var/lib/ejabberd/PREVIOUS.LOG", emfile}
=ERROR REPORT==== 22-Jan-2007::15:56:52 ===
** Generic server ejabberd_sm terminating
** Last message in was {mnesia_system_event,{mnesia_down,ejabberd@host}}
** When Server state == {state}
** Reason for termination ==
** {node_not_running,ejabberd@host}

I've talked with badlop, and the temporary solution is to create this file manually. This helped, but the file disappeared a while after the server started. What could be the problem (file limit is not possible, because ulimit is "unlimited")?

Small comment regarding ulimit -n

koniczynek wrote:

What could be the problem (file limit is not possible, because ulimit is "unlimited")?

To know the maximum number of allowed file descriptors, you must run 'ulimit -n':

$ ulimit -n
1024
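
A side note on the "unlimited" reading: in bash, `ulimit` with no option reports the file-size limit (`-f`), not the descriptor limit, which may be why it printed "unlimited". The limit that counts is also the one attached to the running ejabberd process, which can differ from the one in your login shell. Here is a minimal sketch for checking that, assuming a Linux /proc filesystem (the `/proc/<pid>/limits` file only exists on newer kernels). The helper name `fd_usage` is made up for this example:

```shell
# Print the open-descriptor count and the "Max open files" limit for a PID.
fd_usage() {
    pid=$1
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    limit=$(grep 'Max open files' "/proc/$pid/limits" 2>/dev/null)
    echo "pid $pid: $open descriptors open"
    echo "${limit:-limits file not available for pid $pid}"
}

# Demonstrated on the current shell. For ejabberd, pass the PID of the
# Erlang VM instead (assumed to be named "beam" or "beam.smp" on your system).
fd_usage $$
```

If the descriptor count is close to the limit when the presences arrive, that would be consistent with the `emfile` errors in the log.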

server crashes after migrating company to use shared roster

I am getting server crashes very often. From the logs:

=ERROR REPORT==== 2008-04-08 16:03:43 ===
Mnesia(ejabberd@debian): ** ERROR ** (could not write core file: emfile)
** FATAL ** Cannot open log file "/var/lib/ejabberd/roster.DCL": {file_error,
"/var/lib/ejabberd/roster.DCL",
emfile}

=ERROR REPORT==== 2008-04-08 16:03:43 ===
** Generic server ejabberd_sm terminating
** Last message in was {mnesia_system_event,{mnesia_down,ejabberd@debian}}
** When Server state == {state}
** Reason for termination ==
** {node_not_running,ejabberd@debian}

=INFO REPORT==== 2008-04-08 16:03:43 ===
application: mnesia
exited: shutdown
type: temporary

---------------

=ERROR REPORT==== 2008-04-09 12:49:53 ===
Mnesia(ejabberd@debian): ** ERROR ** (could not write core file: emfile)
** FATAL ** Cannot open log file "/var/lib/ejabberd/PREVIOUS.LOG": {file_error,
"/var/lib/ejabberd/PREVIOUS.LOG",
emfile}

=ERROR REPORT==== 2008-04-09 12:49:53 ===
** Generic server ejabberd_sm terminating
** Last message in was {mnesia_system_event,{mnesia_down,ejabberd@debian}}
** When Server state == {state}
** Reason for termination ==
** {node_not_running,ejabberd@debian}

=INFO REPORT==== 2008-04-09 12:49:53 ===
application: mnesia
exited: shutdown
type: temporary

------------
Until we migrated the company to shared roster, everything was OK.

Linux Debian Etch
# ulimit -n
1024

What should I do to stop the server crashes?

You are running out of file descriptors. Increase your ulimit.

--
Mickaël Rémond
Process-one
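
For reference, a minimal sketch of raising the limit from a shell. Keep in mind that `ulimit -n` only affects the current shell and the processes started from it, so a server that is already running, or one started by an init script, does not pick up the change. The helper name `set_nofile_soft` and the value 8192 are only illustrations:

```shell
# Set the soft open-file limit of the current shell, capped at the hard limit.
set_nofile_soft() {
    target=$1
    hard=$(ulimit -Hn)
    if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
        target=$hard
    fi
    ulimit -Sn "$target"
}

set_nofile_soft 8192
echo "soft limit: $(ulimit -Sn), hard limit: $(ulimit -Hn)"

# Start ejabberd from this same shell so that it inherits the limit.
# The exact start command depends on your installation, so it is left out here.
```

Raising the hard limit itself normally requires root, or an entry in limits.conf.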

I've increased ulimit to 2048, but the problem still exists. Then I increased it to 8192, but ejabberd still crashes.

# ulimit -n
8192

We are running Jabber at a company with about 50-60 employees.
Until the migration to shared roster, ejabberd never crashed.

=ERROR REPORT==== 2008-04-10 15:28:59 ===
Mnesia(ejabberd@debian): ** ERROR ** (could not write core file: emfile)
** FATAL ** Cannot open log file "/var/lib/ejabberd/PREVIOUS.LOG": {file_error,
"/var/lib/ejabberd/PREVIOUS.LOG",
emfile}

=ERROR REPORT==== 2008-04-10 15:28:59 ===
** Generic server ejabberd_sm terminating
** Last message in was {mnesia_system_event,{mnesia_down,ejabberd@debian}}
** When Server state == {state}
** Reason for termination ==
** {node_not_running,ejabberd@debian}

=INFO REPORT==== 2008-04-10 15:28:59 ===
application: mnesia
exited: shutdown
type: temporary

=INFO REPORT==== 2008-04-10 15:28:59 ===
application: ejabberd
exited: shutdown
type: temporary

Thanks in advance!!

system wide increase

Had some trouble with this, as PAM and root user restrictions obfuscated the setting. Here's the bug report:

https://bugs.launchpad.net/ubuntu/+source/pam/+bug/65244

The problem was that I set the value in limits.conf, but PAM was not honoring it for ssh sessions. Then I used the * wildcard, which is not applied to the root user. The ejabberd user did have the proper ulimit applied.
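
To double-check which nofile lines in limits.conf actually apply to a given user, something like the following rough sketch may help. It only handles plain user names and the `*` wildcard (no `@group` entries), the helper name `nofile_entries` is made up, and the sample values are illustrative rather than my real config:

```shell
# List nofile entries in a limits.conf-style file that apply to a given user.
# Wildcard entries are skipped for root, since pam_limits does not apply
# the wildcard to root.
nofile_entries() {
    file=$1
    user=$2
    awk -v u="$user" '
        /^[[:space:]]*#/ { next }              # skip comments
        $3 != "nofile"   { next }              # only open-file limits
        $1 == u          { print; next }       # explicit entry for the user
        $1 == "*" && u != "root" { print }     # wildcard, not used for root
    ' "$file"
}

# Temporary file standing in for /etc/security/limits.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# <domain> <type> <item> <value>
*         soft   nofile 8192
*         hard   nofile 8192
ejabberd  soft   nofile 8192
EOF

nofile_entries "$conf" ejabberd   # both wildcard lines plus the explicit entry
nofile_entries "$conf" root       # nothing: root needs its own explicit lines
rm -f "$conf"
```

After changing limits.conf, log in again as the user in question and run `ulimit -n` to confirm the value was picked up.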
