We are very happy using ejabberd for our business, and it has been working fine except for the last couple of weeks. We are seeing ejabberd crash when the process memory (beam.smp) gets close to the 2GB limit. This happens exactly when an increase in chat traffic leads to 1500 concurrent online MUC rooms with close to 2000 concurrent user sessions.
Here is crash dump 1 (head -50 lines)
=erl_crash_dump:0.1
Mon Feb 1 01:58:24 2010
Slogan: eheap_alloc: Cannot allocate 3328160 bytes of memory (of type "old_heap").
System version: Erlang (BEAM) emulator version 5.6.5 [source] [smp:16] [async-threads:256] [hipe] [kernel-poll:true]
Compiled: Mon Jan 18 10:14:45 2010
Atoms: 13572
=memory
total: 2564168880
processes: 2270958140
processes_used: 2270922444
system: 293210740
atom: 621293
atom_used: 604146
binary: 23256960
code: 6167719
ets: 11424008
=hash_table:atom_tab
size: 9643
used: 7277
objs: 13572
depth: 7
=index_table:atom_tab
size: 14336
limit: 1048576
entries: 13572
=hash_table:module_code
size: 397
used: 179
objs: 247
... ...
Here is crash dump 2 (head -50 lines)
=erl_crash_dump:0.1
Thu Jan 28 00:39:13 2010
Slogan: eheap_alloc: Cannot allocate 5385076 bytes of memory (of type "old_heap").
System version: Erlang (BEAM) emulator version 5.6.5 [source] [smp:16] [async-threads:0] [hipe] [kernel-poll:true]
Compiled: Mon Jan 18 10:14:45 2010
Atoms: 13386
=memory
total: 2428284024
processes: 2370573312
processes_used: 2370563440
system: 57710712
atom: 616937
atom_used: 597215
binary: 26513024
code: 6020365
ets: 11399276
=hash_table:atom_tab
size: 9643
used: 7233
objs: 13386
depth: 7
=index_table:atom_tab
size: 14336
limit: 1048576
entries: 13386
=hash_table:module_code
size: 197
used: 151
objs: 240
depth: 5
=index_table:module_code
size: 1024
limit: 65536
entries: 240
=hash_table:export_list
size: 4813
used: 3261
objs: 5483
depth: 8
=index_table:export_list
size: 6144
limit: 65536
entries: 5483
=hash_table:secondary_export_table
size: 97
used: 0
objs: 0
depth: 0
=hash_table:process_reg
size: 97
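For anyone comparing dumps like these two, a small shell helper can pull out just the =memory section and print it in MiB, which makes it easier to see that "processes" accounts for nearly all of "total" in both cases. This is only a sketch: crash_dump_memory is a name made up for this example, the path in the usage comment is a placeholder, and the helper assumes the dump file has one "key: value" pair per line with each section introduced by a line starting with "=".

```shell
# Print each counter from the "=memory" section of an Erlang crash dump in MiB.
# Assumption: one "key: value" pair per line, sections start with a "=" line.
# crash_dump_memory is a hypothetical helper name, not part of Erlang or ejabberd.
crash_dump_memory() {
    awk '
        # A section header: remember whether the section is =memory.
        /^=/ { in_mem = ($0 == "=memory"); next }
        # Inside =memory, convert "name: bytes" to "name N MiB" (rounded down).
        in_mem && NF == 2 {
            sub(/:$/, "", $1)
            printf "%s %d MiB\n", $1, int($2 / 1048576)
        }
    ' "$1"
}

# Usage (placeholder path; use wherever your node writes its dump):
# crash_dump_memory /path/to/erl_crash.dump
```

Run against either dump, the "processes" line is the one to watch, since that is the Erlang process heap memory that eheap_alloc failed to grow.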
Here is our setup
- ejabberd 2.0.1 (built from source with the flash policy file serving patch)
- ODBC authentication support enabled
- Linux CentOS (kernel version 2.6.18-164.10.1.el5PAE)
- the following ejabberd modules are enabled: mod_muc_log, mod_http_bind, web_admin, mod_http_poll, mod_http_hello (custom module to support a heartbeat on monitoring port 5280 for the LVS load balancer)
- clustered setup running with two nodes
- two virtual hosts, one with auth_method odbc and the other with anonymous
- mod_muc with history_size set to 100
Here are the ejabberd startup parameter values
# define default configuration
POLL=true
SMP=auto
ERL_MAX_PORTS=1000000
ERL_PROCESSES=50000000
ERL_MAX_ETS_TABLES=140000
ERL_ASYNC_THREAD_CNT=256
Here is the mnesia db node status
use fallback at restart = false
running db nodes = ['ejabberd@first.abc.com','ejabberd@debug.abc.com']
stopped db nodes = ['ejabberd@third.abc.com','ejabberd@second.abc.com','ejabberd@fourth.abc.com']
master node tables = []
remote = [acl,anonymous,caps_features,config,disco_publish,
          http_bind,iq_response,irc_custom,last_activity,
          local_config,mod_register_ip,motd,motd_users,
          muc_online_room,muc_registered,muc_room,offline_msg,
          privacy,private_storage,pubsub_item,pubsub_node,
          pubsub_state,roster,route,s2s,session,sr_group,sr_user,
          user_caps,user_caps_resources,vcard,vcard_search]
ram_copies = []
disc_copies = [schema]
disc_only_copies = []
[] = [user_caps_resources,user_caps,local_config,mod_register_ip]
[{'ejabberd@debug.abc.com',disc_copies},
 {'ejabberd@first.abc.com',disc_copies}] = [schema]
[{'ejabberd@first.abc.com',disc_copies}] = [config,privacy,irc_custom,
                                            roster,sr_user,motd,acl,
                                            sr_group,vcard_search,
                                            motd_users,muc_room,
                                            pubsub_state,
                                            muc_registered,pubsub_node]
[{'ejabberd@first.abc.com',disc_only_copies}] = [last_activity,
                                                 offline_msg,
                                                 disco_publish,vcard,
                                                 private_storage,
                                                 pubsub_item]
[{'ejabberd@first.abc.com',ram_copies}] = [http_bind,route,s2s,
                                           anonymous,caps_features,
                                           session,iq_response,
                                           muc_online_room]
4 transactions committed, 0 aborted, 0 restarted, 1 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
Here are the Linux limits
[root@www35 ejabberd]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 212992
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 212992
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
The crash dump says that ejabberd tried to allocate about 5MB (of type old_heap) and the memory could not be found, but our Linux machine has 12GB of memory, with 8GB available at the time of the crash. This indicates that system resources are sufficient to handle thousands more concurrent connections. So my strong hypothesis is that some configuration limit is preventing ejabberd from using the available memory. I would be happy to solve this problem, or for someone to shed light on what might be going wrong on our end, so that we can make it 100% stable for our chat traffic.
Also, I don't have hands-on experience with the Erlang programming language to fiddle around with the code, but nothing stops me from learning it in the near future. If more data is needed, I can provide it.
Re: Ejabberd 2.0.1 crashes when process memory reach
$ erl
1> erlang:system_info(wordsize).
8
Here 8 means that 64-bit Erlang is in use. This matters because of the memory allocation limits on 32-bit platforms.
Thanks for response,
Thanks for the response, Zinid.
erlang:system_info(wordsize) returned 4.
We use a 32-bit Linux multi-core machine with ejabberd 2.0.1 and Erlang R12B-5 built from source.
Does Linux have a 2GB per-process memory limit, similar to Windows? Also, the message queue length is zero throughout the crash dump.
Can we run 64-bit Erlang on a 32-bit Linux machine?
Re: Ejabberd 2.0.1 crashes when process memory
Does Linux have a 2GB per-process memory limit, similar to Windows?
This is a limitation of Erlang: it is unable to allocate more than 2-2.5GB of heap on a 32-bit machine.
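As rough arithmetic, a 32-bit pointer can address 2^32 bytes in total, and on 32-bit Linux the kernel keeps part of that range for itself, so a single process gets less than 4GB of virtual address space no matter how much RAM is installed (a PAE kernel raises the physical memory the machine can use, not the address space of one process). The sketch below converts that ceiling and the "total" figures from the two crash dumps in this thread to MiB; to_mib is just a helper name made up for the example.

```shell
# Integer conversion from bytes to MiB (rounded down).
# to_mib is a hypothetical helper name used only for this illustration.
to_mib() { echo $(( $1 / 1048576 )); }

to_mib 4294967296    # 2^32 bytes, the whole 32-bit address space -> 4096
to_mib 2564168880    # "total" from the Feb 1 crash dump          -> 2445
to_mib 2428284024    # "total" from the Jan 28 crash dump         -> 2315
```

Both dumps sit inside the 2-2.5GB range, which fits the picture of the emulator running out of address space rather than the machine running out of RAM.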
Can we run 64-bit Erlang on a 32-bit Linux machine?
I don't think so.
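A quick way to see which case a given box falls into is to look at what the kernel reports with uname -m: a 32-bit x86 kernel (PAE builds included) reports something like i686 and can only run 32-bit binaries, while an x86_64 kernel can run a 64-bit Erlang. The helper below is only a sketch; kernel_bits is a made-up name and the list of machine strings is illustrative, not exhaustive.

```shell
# Map a `uname -m` machine string to the widest binaries that kernel can run.
# kernel_bits is a hypothetical helper; the architecture list is not exhaustive.
kernel_bits() {
    case "$1" in
        x86_64|amd64|aarch64|ppc64*|s390x) echo 64 ;;
        i?86|armv7l|ppc)                   echo 32 ;;
        *)                                 echo unknown ;;
    esac
}

# Report for the machine this is run on.
kernel_bits "$(uname -m)"
```

If this prints 32, moving to 64-bit Erlang means moving to a 64-bit kernel and userland first.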
Compiled Erlang on 64bit
I compiled Erlang on a 64-bit Linux machine and was not able to reproduce the issue using the jabsimul stress test tool. The memory consumption of the ejabberd process went above 2GB, sometimes even 3GB, under heavy load. This requires some more testing; I will update soon.
Thanks once again, Zinid
Erlang 2GB memory limit on 32bit processor
I'm wondering why this limitation is maintained in Erlang.
Why not relax it based on the OS? Since Linux supports more than 2GB of address space, shouldn't the Erlang runtime follow it?
Re: Erlang 2GB memory limit on 32bit processor
I'm wondering why this limitation is maintained in Erlang.
Why not relax it based on the OS? Since Linux supports more than 2GB of address space, shouldn't the Erlang runtime follow it?
No idea. You'd better ask the Erlang developers about that.