Hi. I have two ejabberd 1.1.2 servers clustered together, following the documentation on the website here and in the manual. Things work fine for a while, but it seems like every night (possibly when they're doing nightly maintenance) the servers disconnect from each other. The main one logs this:
=ERROR REPORT==== 2006-12-05 00:16:45 ===
** Node ejabberd@server2 not responding **
** Removing (timedout) connection **
and the backup logs this:
=ERROR REPORT==== 2006-12-05 00:16:54 ===
** Node ejabberd@server1 not responding **
** Removing (timedout) connection **
While it's possible the VPN between them is having a hiccup at that point, I know it's not going down for long, because no other processes complain at all. When I connect to the erl session on server2, it is able to ping server1 without a problem (and vice versa), and when I restart server2 everything is fine again for a while.
So, is there a way I can find out what's causing the problem in the first place?
And, is there a way I can get mnesia / ejabberd to continue to attempt the connection until it reconnects?
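
To make the second question concrete, what I have in mind is roughly the sketch below. It is untested, and the module name node_keepalive and the 10-second default interval are made up for illustration; they are not part of ejabberd. It only relies on net_adm:ping/1, which returns pong or pang and tries to set up the distribution connection as a side effect. I don't know whether mnesia copes cleanly once the link comes back this way, which is part of what I'm asking.

```erlang
%% Untested sketch: keep pinging a peer node so the Erlang distribution
%% link is set up again after it has timed out.
%% The module name and the default interval are hypothetical.
-module(node_keepalive).
-export([start/1, start/2]).

%% Start the keepalive loop with an assumed 10-second interval.
start(Peer) ->
    start(Peer, 10000).

%% Spawn a process that pings Peer every IntervalMs milliseconds.
start(Peer, IntervalMs) ->
    spawn(fun() -> loop(Peer, IntervalMs) end).

loop(Peer, IntervalMs) ->
    case net_adm:ping(Peer) of
        pong ->
            ok;
        pang ->
            error_logger:warning_msg("~p unreachable, will retry~n", [Peer])
    end,
    timer:sleep(IntervalMs),
    loop(Peer, IntervalMs).
```

On server1 this would be started from the erl session with something like node_keepalive:start('ejabberd@server2').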