Erlang freezes our server

Hi all,

I am fighting with Erlang...

I run Debian Etch at home and everything works there.
But I have big trouble with Erlang on our production server at dedibox.fr, which runs Debian Sarge (kernel: Linux version 2.6.18.1dedibox_r6_final (gcc version 3.3.5 (Debian 1:3.3.5-13))) on an unusual hardware configuration:
VIA C7 2 GHz processor, CN700 chipset (northbridge) and VT8237R (southbridge), 1024 MB DDR2 RAM.

My first attempt was to install Erlang from the Debian Etch packages (apt-get install erlang, after adding the Etch package URL to sources.list).
This installation worked.
The Erlang emulator worked.
So I installed ejabberd (from svn). The "make" failed with an error related to the gcc version.
I ran "apt-get install binutils gcc-3.3".
After that, the ejabberd "make" worked, and I installed it along with the http_bind module (compiling it and copying its .beam to /var/lib/ejabberd).
Everything worked fine (Erlang, ejabberd, http_bind).
I was still testing, so I stopped ejabberd (/etc/init.d/ejabberd stop).

But some hours later (about 20 hours), our server froze!
A soft reboot had no effect; we had to do a hard reboot.
The logs were completely empty.

We left things as they were, and the server froze again about 24 hours later.
We killed the beam process, and then the server ran without crashing for several days.

Following the advice of a Linux developer, I tried yesterday to install Erlang from source instead of from packages (because our hardware configuration seems unusual).
I downloaded otp_src_R11B-4.tar.gz.
Configure...
And then, during the "make" of Erlang, our server froze again! => hard reboot.

I have tried to contact the Erlang and ejabberd teams, but without success.
I keep searching the web for solutions, but have found nothing so far.

Mickaël Rémond (process-one) has written (http://forum.jabberfr.org/viewtopic.php?pid=5315#p5315) that ejabberd runs without problems on Dedibox servers.

Any help or advice is very welcome.

thanks a lot

luc

this problem is maybe

This problem is maybe solved...
I have uninstalled Erlang/ejabberd and reinstalled Erlang from source...
Now I have to wait 1 or 2 days to confirm whether it is OK or not.

luc

bad luck... the Erlang

Bad luck... installing Erlang from source didn't solve the problem...
Our server froze again. Erlang/ejabberd must be the cause, but how can we debug without any logs (the frozen server doesn't write anything more to the logs)?
Everything seems OK once we kill the "beam" process...

Any help, solution, or just an idea for a test to run is very welcome!!!

thanks
luc

don't know if it help but

Don't know if it helps, but here is the script I use to launch ejabberd:

#!/bin/sh
# Init-style control script for ejabberd.
ERL=erl
NAME=ejabberd

case "$1" in
    start)
        echo "Starting $NAME."
        cd /usr/src/ejabberd/src/ || exit 1
        # Start the ejabberd node in the background (-detached),
        # monitored by the heart program (-heart).
        $ERL -pa /var/lib/ejabberd/ebin \
            -sname ejabberd \
            -s ejabberd \
            -ejabberd config \"/etc/ejabberd/ejabberd.cfg\" \
            log_path \"/var/log/ejabberd/ejabberd.log\" \
            -sasl sasl_error_logger \{file,\"/var/log/ejabberd/sasl.log\"\} \
            -mnesia dir \"/var/lib/ejabberd/spool\" \
            -heart \
            -detached
        ;;
    stop)
        echo "Stopping $NAME."
        # Ask the running node to shut down via an RPC from a temporary node.
        echo "rpc:call('ejabberd@$(hostname -s)', init, stop, [])." | $ERL -sname ejabberdstop
        ;;
    status)
        echo "Not implemented yet."
        ;;
    restart|reload)
        $0 stop
        sleep 3
        $0 start
        ;;
    *)
        echo "Usage: $0 {start|stop|status|restart|reload}"
        exit 1
        ;;
esac

I am looking for Erlang

I am looking for the Erlang logs... I don't know where they are.

Do you know?

thanks
luc

gcc-3.3

Why did you install gcc-3.3? Etch uses gcc-4.1.

If you want to install ejabberd on Etch, you can install it from apt: apt-get install ejabberd. apt will install all the dependencies.

Several comments, I hope

Several comments, I hope some of them help you.

  • '-heart' is not required to run ejabberd. Try removing it from your script.
  • Check that the disk has enough free space. Check that the log files are not too big (larger than 1 GB, for example).
  • The ejabberd logs record user logins and logouts. Comparing the last messages in the logs from several crashes may point you in a direction to look.
  • Logs should be stored in /var/log/ejabberd/ejabberd.log and /var/log/ejabberd/sasl.log
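
The disk-space and log-size checks above can be scripted. Here is a minimal sketch, assuming a POSIX shell; the function name large_files is made up for illustration, and the 1 GB threshold is only the example figure mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ejabberd): print every regular file
# under directory $1 whose size in bytes exceeds $2.
# Assumes file names do not contain newlines.
large_files() {
    dir=$1
    limit_bytes=$2
    find "$dir" -type f | while IFS= read -r f; do
        # wc -c reports the size in bytes; $(( )) strips any padding.
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$limit_bytes" ]; then
            echo "$f"
        fi
    done
}

# Example use on the server:
#   df -h /var/log                              # free space on the log partition
#   large_files /var/log/ejabberd 1073741824    # logs larger than 1 GB
```

If the second command prints nothing, no log file has passed the threshold.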

You want to know what action triggers the server crash. It may be one or several of these:

  • After 20-24 hours of running, even if nobody logs in at all
  • After several hundred user logins and logouts
  • When the server is overloaded by other processes and ejabberd also wants CPU
  • When a specific user logs in to his account, regardless of the client he uses
  • When a specific client with specific options is used to log in
  • When a user performs a specific action

To isolate the problem, you should determine whether the crash is triggered only by time or by a specific action.
Start ejabberd on a Friday night, and configure it to listen on unusual ports (9991 instead of 5222...) so NOBODY logs in to it. Wait 24 hours to see if it crashes.
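
Since a frozen machine leaves nothing in its logs, it may help to run a small heartbeat logger during this 24-hour test, so that the last line written shows roughly when the freeze happened. This is only a sketch: the function names and the log path are assumptions, and reading /proc/loadavg assumes Linux.

```shell
#!/bin/sh
# Hypothetical heartbeat logger: appends a timestamp and the load
# average to a file at a fixed interval. After a freeze, the last line
# written shows roughly when the machine stopped responding.

# Print one heartbeat line: UTC timestamp plus load average.
heartbeat_line() {
    # /proc/loadavg exists on Linux; fall back to "unknown" elsewhere.
    load=$(cat /proc/loadavg 2>/dev/null || echo unknown)
    echo "$(date -u '+%Y-%m-%d %H:%M:%S') load: $load"
}

# heartbeat_loop FILE INTERVAL_SECONDS COUNT
# Writes COUNT lines to FILE; a COUNT of 0 means run until killed.
heartbeat_loop() {
    file=$1
    interval=$2
    count=$3
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        heartbeat_line >> "$file"
        sync    # flush to disk so the line survives a hard reboot
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example use (the path is an assumption):
#   heartbeat_loop /var/log/heartbeat.log 60 0 &
```

Comparing the last heartbeat with the last entry in ejabberd.log gives a rough idea of whether ejabberd was doing anything just before the freeze.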

Another test: start ejabberd and let people log in. Then make the server compress files, so the CPU is overloaded.
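
One possible way to generate that load is sketched below. It compresses random data and throws the result away, so no real files are touched. The function name cpu_load and the duration are assumptions for illustration, and it keeps only one CPU busy.

```shell
#!/bin/sh
# Hypothetical helper: keep one CPU busy for $1 seconds by compressing
# random data and discarding the result.
cpu_load() {
    seconds=$1
    gzip -c < /dev/urandom > /dev/null &
    pid=$!
    sleep "$seconds"
    kill "$pid" 2>/dev/null
    # Reap the background job; its exit status is irrelevant here.
    wait "$pid" 2>/dev/null || true
}

# Example use: load the CPU for ten minutes while users are connected.
#   cpu_load 600
```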

If the crash is user-triggered, and it's always the same user, you can ask him which client he uses, and with what options.

Good luck :)
