Replacing old sessions with the same resource and its impact on unavailable presence

* Here is the scenario, correct me if my assumptions are wrong:

* user1@d1 connects with resource r1, finishes resource bind, session bind, roster fetch and initial presence update. An entry for its c2s process is added to the "session" table during the session binding phase.

* user1@d1 connects from another client with the same resource r1, finishes resource binding and sends the session bind packet, then...

* "ejabberd_sm:open_session" gets called for the 2nd connection. Internally it looks up the session table, finds the 1st connection with the same resource and sends a "replace" message to it. It then acknowledges to the 2nd connection that resource binding succeeded.

* The problem here is that the "replace" message might not be processed by the 1st connection before the initial presence broadcast of the 2nd connection. And contacts of this user might incorrectly see him as offline.

* I have not seen any code that does explicit synchronization to avoid this problem. Is this a known issue?

P.S.: if the directed unavailable presence broadcast of the 1st connection happens after the 2nd connection joins a conference room, then the room might mistakenly log off the 2nd connection.

Let me know if further clarification is needed to understand the problem.
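The ordering problem described in the scenario can be modelled outside ejabberd. What follows is a minimal Python sketch, not ejabberd code: the `Session` class, the `run` function and the message names are all hypothetical stand-ins. It assumes only what the scenario states, namely that "replace" is sent to the old session asynchronously and that nothing orders it against the new session's initial presence.

```python
from collections import deque


class Session:
    """Hypothetical stand-in for one c2s process: a mailbox plus a shared wire."""

    def __init__(self, name, wire):
        self.name = name
        self.mailbox = deque()
        self.wire = wire  # stanzas in the order contacts would receive them

    def step(self):
        """Process one queued message, if there is one."""
        if not self.mailbox:
            return
        msg = self.mailbox.popleft()
        if msg == "replace":
            self.wire.append((self.name, "unavailable"))
        elif msg == "initial_presence":
            self.wire.append((self.name, "available"))


def run(schedule):
    """Replay the scenario with the given order of process scheduling."""
    wire = []
    sessions = {"old": Session("old", wire), "new": Session("new", wire)}
    # Opening the new session: fire-and-forget "replace" to the old one.
    sessions["old"].mailbox.append("replace")
    # The new client is acknowledged right away and sends initial presence.
    sessions["new"].mailbox.append("initial_presence")
    for who in schedule:
        sessions[who].step()
    return wire


print(run(["old", "new"]))  # old session handles "replace" first
print(run(["new", "old"]))  # new session broadcasts first: the race
```

In the second schedule the unavailable presence of the old session is the last stanza on the wire, which is the situation the question is about.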

Yes, problem exists

Jigar Gosar wrote:

The problem here is that the "replace" message might not be processed by the 1st connection before the initial presence broadcast of the 2nd connection.

I have not seen any code that does explicit synchronization to avoid this problem.

In general, that problem doesn't happen; otherwise it would already have been reported in the bug tracker. But in rare cases it might happen. I didn't find any code that ensures the presence ordering you refer to.

Jigar Gosar wrote:

And contacts of this user might incorrectly see him as offline.

P.S.: if the directed unavailable presence broadcast of the 1st connection happens after the 2nd connection joins a conference room, then the room might mistakenly log off the 2nd connection.

Right. To make the problem easier to reproduce and debug, I applied this patch, which delays the old session's unavailable broadcast by five seconds and logs both broadcasts:

--- a/src/ejabberd_c2s.erl
+++ b/src/ejabberd_c2s.erl
@@ -1373,6 +1373,8 @@ terminate(_Reason, StateName, StateData) ->
                      StateData#state.server,
                      StateData#state.resource,
                      "Replaced by new connection"),
+                   timer:sleep(5000),
+                   ?INFO_MSG("Broadcast unavailable presence of old~n~p", [StateData#state.socket]),
                    presence_broadcast(
                      StateData, From, StateData#state.pres_a, Packet),
                    presence_broadcast(
@@ -1836,6 +1838,7 @@ presence_broadcast_to_trusted(StateData, From, T, A, Packet) ->


 presence_broadcast_first(From, StateData, Packet) ->
+    ?INFO_MSG("Broadcast initial presence of new socket~n~p", [StateData#state.socket]),
     ?SETS:fold(fun(JID, X) ->
                       ejabberd_router:route(
                         From,

First I log in as user1@localhost/home and user2@localhost/work, which are mutual contacts. Then I log in with another client as user1@localhost/home.

This is logged by ejabberd:

=INFO REPORT==== 7-Apr-2010::23:53:04 ===
I(<0.319.0>:ejabberd_c2s:1841) : Broadcast initial presence of new socket
{socket_state,gen_tcp,#Port<0.3891>,<0.318.0>}

=INFO REPORT==== 7-Apr-2010::23:53:08 ===
I(<0.313.0>:ejabberd_c2s:1377) : Broadcast unavailable presence of old
{socket_state,gen_tcp,#Port<0.3853>,<0.312.0>}

This is what user2 receives:

<presence from='user1@localhost/home'
	to='user2@localhost/work'>
  <show>xa</show>
  <priority>8</priority>
</presence>

<presence from='user1@localhost/home'
	to='user2@localhost/work'
	type='unavailable'>
  <status>Replaced by new connection</status>
</presence>

Consequently, the user2 client shows user1 as offline, but in reality he is online!
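To see why the second stanza wins, here is a minimal Python sketch of how a receiving client might track presence. It is an assumption about typical client behaviour, not code from any particular client: `apply_presence` and `is_online` are hypothetical names, and stanzas are reduced to `(full JID, type)` pairs. Both sessions share the full JID user1@localhost/home, so the client has no way to tell them apart and simply applies the latest stanza.

```python
def apply_presence(roster, stanza):
    """Update the contact view; keyed by full JID, the latest stanza wins."""
    full_jid, presence_type = stanza
    if presence_type == "unavailable":
        roster.pop(full_jid, None)
    else:
        roster[full_jid] = presence_type
    return roster


def is_online(roster, full_jid):
    return full_jid in roster


jid = "user1@localhost/home"
roster = {jid: "available"}  # user2 already sees the first session online

# Order observed in the log above: initial presence first, unavailable second.
for stanza in [(jid, "available"), (jid, "unavailable")]:
    apply_presence(roster, stanza)

print(is_online(roster, jid))  # False, although the new session is connected
```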

Jigar Gosar wrote:

Is this a known issue?

I haven't found any ticket about this, so this is a newly found problem. If you have nothing more to add in the next few days, I'll submit a new ticket next week.

If implementing an order-verification feature is too difficult, a simple workaround would be to implement the proposed "Option to disallow new session if resource conflict", which doesn't have that problem, and to enable that option in the default ejabberd.cfg.
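The idea behind that workaround can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SessionTable`, its `on_conflict` parameter and the return values are hypothetical and do not correspond to any existing ejabberd option or API. The point is that with a "reject" policy the old session is never told to terminate, so it never broadcasts an unavailable presence that could race with anything.

```python
class SessionTable:
    """Hypothetical session table with a configurable resource-conflict policy."""

    def __init__(self, on_conflict="replace"):
        self.on_conflict = on_conflict  # "replace" or "reject" (made-up names)
        self.sessions = {}              # full JID -> connection id

    def open_session(self, full_jid, conn_id):
        holder = self.sessions.get(full_jid)
        if holder is not None and self.on_conflict == "reject":
            # The old session stays untouched; the new client gets an error.
            return ("error", "conflict")
        # "replace" policy: the caller must still notify `holder`, and that
        # asynchronous notification is where the race discussed above comes from.
        self.sessions[full_jid] = conn_id
        return ("ok", holder)


table = SessionTable(on_conflict="reject")
print(table.open_session("user1@localhost/home", "conn-1"))
print(table.open_session("user1@localhost/home", "conn-2"))
```

The trade-off is that a client whose old connection is dead but not yet detected as such would be refused until the stale session times out.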

Thanks for the detailed response. This was a great help in verifying that I do understand ejabberd code correctly.

Also, the workaround of keeping the older connection alive for a given resource is something I had never considered.

Thanks.