Cluster startup problems

Hello,

I'm running ejabberd 2.0.0 on Linux (CentOS) and trying to set up a cluster of two ejabberd servers. I'm following the instructions at http://www.process-one.net/docs/ejabberd/guide_en.html#ejabberdctl, but I just cannot get the cluster working.

1. Do I need to do the Mnesia configuration from chapter 6.2 on both ejabberd servers? The chapter ends with the words "You can repeat these steps for other machines supposed to serve this domain.", but this doesn't tell me whether I need to do the configuration on both nodes or not. If I do the configuration on only one node, the Web interface of the second node doesn't show anything about the first node in its node list, so I presume the configuration needs to be done on both nodes, right? But if I do the configuration on both nodes, the second node crashes immediately when I start it up and creates some dump files in bin (prefixed by MnesiaCore.ejabberd@hostname).

2. Is there a step-by-step tutorial somewhere on how to set up the cluster? Step 4 of the manual suggests that you _can_ change the database table replication configuration, but it doesn't say anything concrete about whether anything _must_ be done in order to get the cluster working.

I can provide step-by-step instructions on how I've set up the ejabberd servers, if that would help. I've tried many times from scratch and the result is always the same.

r,
J

An ejabberd server is already the first node in a cluster

Any ejabberd server can be considered a cluster of just one node, right?

The section '6.2 Clustering Setup' in the ejabberd Guide explains how to add more nodes to that existing cluster. That is why the section starts by saying:

> Suppose you already configured ejabberd on one machine named (first),
> and you need to setup another one to make an ejabberd cluster.

With that in mind, you will agree that all the steps mentioned in the section are to be performed on the second node.

Finally, the Guide says:

> You can repeat these steps for other machines supposed to serve this domain.

This means that you can add a third, a fourth and more nodes to your existing cluster by following the same steps you followed to add the second node.

As you have already tried those steps several times, you have probably noticed that setting up an ejabberd cluster is just a matter of configuring the Mnesia database to replicate some tables across the nodes.

Cluster-nodes cannot see each other / node crashes at startup

badlop wrote:

> Any ejabberd server can be considered a cluster of just one node, right?

Makes sense. Thanks for the clarification.

badlop wrote:

> The section '6.2 Clustering Setup' in the ejabberd Guide explains how to add more nodes to that existing cluster. So, that section starts saying:
>
> > Suppose you already configured ejabberd on one machine named (first),
> > and you need to setup another one to make an ejabberd cluster.
>
> If you take that consideration into account, you will agree that all the steps mentioned in the section are clearly to be performed on the second node.
>
> Finally, the Guide says:
>
> > You can repeat these steps for other machines supposed to serve this domain.
>
> This means that you can add a third, a fourth and more nodes to your existing cluster, following the same steps you followed to add the second node.
>
> As you have already tried those steps several times, you have probably noticed that setting up an ejabberd cluster is just a matter of configuring the Mnesia database to replicate some tables across the nodes.

The problem is that I don't have experience with setting up Mnesia database replication.

I still have the problem of Mnesia crashing. Here's the detailed description of the installation process:

On first:

Install from ejabberd-2.0.0-linux-x86-installer.bin, no special settings.
Modify the file conf/ejabberdctl.cfg and set ERLANG_NODE=ejabberd@first
bin/ejabberdctl start

On second:

Install from ejabberd-2.0.0-linux-x86-installer.bin, the same settings as in first.
Modify the file conf/ejabberdctl.cfg and set ERLANG_NODE=ejabberd@second
./bin/erl -sname ejabberd -mnesia extra_db_nodes "['ejabberd@first']" -s mnesia
mnesia:info(). (shows both nodes)
mnesia:change_table_copy_type(schema, node(), disc_copies).
init:stop().
./bin/ejabberdctl start

Now on first I can see second in the nodes list, but as a stopped node. On second I can only see second (as a running node). If I access first during the steps described above, while the "./bin/erl -sname ejabberd -mnesia..." command is running, I can see second as a running node. If I stop both nodes and start the node "second" first and then the node "first", the node "first" crashes immediately, leaving a file bin/MnesiaCore.ejabberd@xyzzy (or similar).
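
One thing that may be worth checking at this point is which Mnesia artifacts have accumulated in the directory where the commands were run. The sketch below is only an illustration: list_mnesia_artifacts is a hypothetical helper, not part of ejabberd. It relies on the fact that Mnesia, when not given a directory, names its database directory Mnesia.<node> in the current working directory, and on the MnesiaCore. prefix of the dump files mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ejabberd): list Mnesia leftovers in a directory.
# $1: directory to inspect, e.g. the bin directory of the installation
list_mnesia_artifacts() {
    dir="$1"
    # Mnesia.<node>     : database directory created by a plain "erl ... -s mnesia"
    # MnesiaCore.<node>*: dump file written when Mnesia crashes
    for entry in "$dir"/Mnesia.* "$dir"/MnesiaCore.*; do
        # An unmatched glob stays literal, so test for existence before printing.
        if [ -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
    return 0
}
```

Running it as list_mnesia_artifacts bin from the installation directory prints one path per line, or nothing if the directory is clean.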

i just posted the _exact_

I just posted the _exact_ same problem here.
Unfortunately I have nothing to add toward a solution; I'm just pointing
you there in case someone gives an answer there...

Got it working

The fundamental problem is that when you run the 'erl' command, it creates a new (as far as I understand) Mnesia database in the working directory. The name of the database directory will be Mnesia.node@host. When you start ejabberd, it tries to use the database in its own database directory, which is not the synchronized database.

My suggestion for a fix to the installation instructions:

Do as told in the instructions, but after step 5 issue the following commands, where node@host is the name of the database directory you have.

rm -rf database/node@host
mv Mnesia.node@host database/node@host

The first command removes the non-synchronized database and the second command puts the synchronized database in its place.
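
The two commands can be wrapped in a small guard so that the old database is only removed when the synchronized copy really exists. This is a minimal sketch, not part of ejabberd: swap_in_synced_db is a hypothetical name, and the directory layout (a Mnesia.<node> directory next to where erl was run, and a <node> directory under the database directory) is assumed from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ejabberd) wrapping the rm/mv pair.
# $1: directory where erl was run (assumed to hold Mnesia.<node>)
# $2: ejabberd database directory (assumed to hold <node>)
# $3: node name, e.g. ejabberd@second
swap_in_synced_db() {
    work_dir="$1"
    db_dir="$2"
    node="$3"
    if [ -z "$node" ]; then
        echo "node name is required" >&2
        return 1
    fi
    synced="$work_dir/Mnesia.$node"
    # Refuse to delete anything unless the synchronized copy is present.
    if [ ! -d "$synced" ]; then
        echo "no synchronized database at $synced" >&2
        return 1
    fi
    rm -rf "$db_dir/$node"
    mv "$synced" "$db_dir/$node"
}
```

For example, swap_in_synced_db bin database ejabberd@second, run from the installation directory and presumably with ejabberd stopped on that node.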

The fix even makes sense. A nicer fix would naturally be for the 'erl' command not to create a new database but to update the existing one. No idea how to do that.
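
One possible way to do that is Mnesia's standard "dir" application parameter, which tells Mnesia where to keep its database instead of defaulting to Mnesia.<node> in the working directory. The sketch below only prints the command line rather than running it. erl_sync_command is a hypothetical helper name, the path in the usage note is an assumption about the installation layout, and this has not been verified against the 2.0.0 installer.

```shell
#!/bin/sh
# Hypothetical helper: print an erl command line that points Mnesia at an
# existing database directory instead of creating Mnesia.<node> in the cwd.
# $1: database directory ejabberd uses for this node (assumed path)
# $2: name of the node that is already running, e.g. ejabberd@first
erl_sync_command() {
    db_path="$1"
    first_node="$2"
    # The value of "dir" must reach Erlang as a string, hence the nested quotes.
    printf "erl -sname ejabberd -mnesia dir '\"%s\"' -mnesia extra_db_nodes \"['%s']\" -s mnesia\n" \
        "$db_path" "$first_node"
}
```

For example, erl_sync_command /opt/ejabberd/database/ejabberd@second ejabberd@first prints a command that can be reviewed and then run in place of the plain erl invocation from the instructions.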

Thanks to all for the help; I really spent a lot of time on this one. Hopefully the issue is now really fixed and no surprises turn up later.

fix for dc_im

See this blog to fix your issue. The fix is in the section "Set up the Second Node", step 7.

I fixed my own problem (had

I fixed my own problem (had the exact same one).

It seems that ejabberd does not find the appropriate database that
is created by the mnesia sync. Here is what I did:

1. cd /opt/ejabberd/bin/
rm -rf Mnesia*

2. cd ../databases/
rm -rf *

3. vim ../conf/ejabberdctl.cfg
search for "database" and put a "Mnesia." in front of $ERLANG_NODE so
it _exactly_ matches the name of your node (may have to edit some stuff
at the beginning of the file to make $ERLANG_NODE represent the name)

4. (still in databases)
../bin/erl -name ejabberd@second -mnesia extra_db_nodes "['ejabberd@first']" -s mnesia
mnesia:change_table_copy_type(schema, node(), disc_copies).
It will say something about "already exists"...ignore that.
q().

5. cd ../bin/
./start

works! (for me at least)

Couldn't reproduce the instructions

steam wrote:

> I fixed my own problem (had the exact same one).
>
> It seems that ejabberd does not find the appropriate database that
> is created by the mnesia sync. Here is what I did:

You mean, you did this _after_ performing the steps mentioned in the installation guide?

steam wrote:

> 1. cd /opt/ejabberd/bin/
> rm -rf Mnesia*
>
> 2. cd ../databases/
> rm -rf *
>
> 3. vim ../conf/ejabberdctl.cfg
> search for "database" and put a "Mnesia." in front of $ERLANG_NODE so
> it _exactly_ matches the name of your node (may have to edit some stuff
> at the beginning of the file to make $ERLANG_NODE represent the name)

You mean there's a string "database" in the file? I don't see that string there... And what exactly do you put in ERLANG_NODE? If the line were e.g. ejabberd@second, where exactly would you put "Mnesia."?

steam wrote:

> works! (for me at least)

I couldn't reproduce it, since I didn't understand all of your instructions. So please, if you can be more exact, we could see whether your solution also works for me. I guess if it helps me, it'll help a lot of other people too.

r,
J
