ejabberd-2.0.0, otp_src_R11B-5, Solaris 10: understanding the crypto_drv problem with crypto:start()

I'm trying to understand the problem:

(1)

> /opt/xmpp/bin/erl
Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V5.5.5  (abort with ^G)
1> crypto:start().
ok
2> sh: crypto_drv: not found
.....
2> q().
ok
3>

(2)

> sbin/ejabberdctl live
.....
=PROGRESS REPORT==== 23-Apr-2008::18:51:06 ===
          supervisor: {local,crypto_sup}
             started: [{pid,<0.183.0>},
                       {name,crypto_server},
                       {mfa,{crypto_server,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 23-Apr-2008::18:51:06 ===
         application: crypto
          started_at: ejabberd@localhost
sh: crypto_drv: not found

=ERROR REPORT==== 23-Apr-2008::18:51:06 ===
** Generic server crypto_server terminating
** Last message in was {'EXIT',#Port<0.297>,normal}
** When Server state == {#Port<0.297>,[]}
** Reason for termination ==
** {port_died,normal}

=CRASH REPORT==== 23-Apr-2008::18:51:06 ===
  crasher:
    pid: <0.183.0>
    registered_name: crypto_server
    error_info: {port_died,normal}
    initial_call: {gen,init_it,
                      [gen_server,
                       <0.182.0>,
                       <0.182.0>,
                       {local,crypto_server},
                       crypto_server,
                       [],
                       []]}
    ancestors: [crypto_sup,<0.181.0>]
    messages: []
    links: [<0.182.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 610
    stack_size: 21
    reductions: 594
  neighbours:

=SUPERVISOR REPORT==== 23-Apr-2008::18:51:06 ===
     Supervisor: {local,crypto_sup}
     Context:    child_terminated
     Reason:     {port_died,normal}
     Offender:   [{pid,<0.183.0>},
                  {name,crypto_server},
                  {mfa,{crypto_server,start_link,[]}},
                  {restart_type,permanent},
                  {shutdown,2000},
                  {child_type,worker}]

......
=INFO REPORT==== 23-Apr-2008::18:51:06 ===
    application: crypto
    exited: shutdown
    type: temporary
.....

(ejabberd@localhost)1>

(3) If I try to start it from the live session...

(ejabberd@localhost)1> crypto:start().
ok
(ejabberd@localhost)2>
=PROGRESS REPORT==== 23-Apr-2008::19:00:06 ===
          supervisor: {local,crypto_sup}
             started: [{pid,<0.359.0>},
                       {name,crypto_server},
                       {mfa,{crypto_server,start_link,[]}},
                       {restart_type,permanent},
                       {shutdown,2000},
                       {child_type,worker}]

=PROGRESS REPORT==== 23-Apr-2008::19:00:06 ===
         application: crypto
          started_at: ejabberd@localhost

(ejabberd@localhost)2>

Can somebody explain this kind of wonder? BTW, I traced the loading process with truss and found no error in the syscalls; crypto_drv.so is found and loaded.

It looks like an Erlang problem, but then why can ejabberd start the crypto server from a live or debug session? It's probably time to build a workaround, but I still hope to find a real solution.

rgds
Vladi

Is it really the same Erlang installation in all cases?

In case 1 you start erlang directly: /opt/xmpp/bin/erl

In case 2 you start Erlang with a system script? /sbin/ejabberdctl live
You should check which 'erl' that script starts. I guess it is the same as in 1).

In case 3 you don't say how you started Erlang, or which Erlang system is used in that case.

Do you have only one Erlang system installed on the machine?

How did you install the ejabberd that succeeds in case 3: from source code or with a binary installer?
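
One way to answer the "which erl" question for case 2 is to print the lines of the start script that mention erl. This is only a minimal sketch, assuming the script is a plain shell script; erl_lines is a hypothetical helper name, not part of ejabberd:

```shell
# Hypothetical helper (not part of ejabberd or Erlang/OTP):
# print, with line numbers, every line of a start script that
# contains "erl" as a whole word, e.g. /opt/xmpp/bin/erl.
erl_lines() {
    grep -n -w 'erl' "$1"
}

# Example call (path taken from case 2):
#   erl_lines sbin/ejabberdctl
```

If the path printed there differs from /opt/xmpp/bin/erl, cases 1 and 2 are not using the same installation.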

Yes, it is the same erlang installation

That was probably not explained very clearly. Erlang is installed only once.

Case (1): I start erl directly and the driver can't be loaded.
Case (2): I start erl with ejabberd through ejabberdctl live and the library is not loaded automatically; same error.
Case (3): ejabberd is started as in case (2), then I call crypto:start(). from that live session and it suddenly works!

I know there is a workaround: rebuild crypto_drv.so. The references I found are mainly about 64-bit Linux, but in my case it's not Linux and I have no problem with 64/32-bit libraries. Nevertheless, this workaround works for me too:

/usr/sfw/bin/gcc -shared -o ../priv/lib/sparc-sun-solaris2.10/crypto_drv.so ../priv/obj/sparc-sun-solaris2.10/crypto_drv.o /usr/sfw/lib/libcrypto.so

and this library can now be loaded in cases (1) and (2). I think it's a more generic error than a Solaris-specific one. I'm only trying to understand *WHY* (3) works. :-)
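
To compare the driver's shared-library dependencies before and after the relink above, ldd output can be filtered for unresolved entries. This is only a sketch: missing_libs is a hypothetical helper name, and nothing in this thread confirms that an unresolved libcrypto is the actual cause.

```shell
# Hypothetical helper: print the names of the shared libraries
# that ldd reports as unresolved ("not found").
# Reads ldd output on stdin, so it works from a pipe or a saved file.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Example call (path taken from the gcc command above):
#   ldd ../priv/lib/sparc-sun-solaris2.10/crypto_drv.so | missing_libs
```

Empty output means every dependency resolved in the environment where ldd was run, so it is worth running it both from a plain shell and from the environment that ejabberdctl sets up.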

CY,
Vladi

P.S. It's probably not a bad idea to mention this "behavior" in the README...
