Print

Print


El 16/06/17 a las 15:39, Matthew Vernon escribió:
> Hi,
>
> I've been bugging Jisc about this bug for ages, so thought I'd try
> looking at it myself. I didn't make much headway, but perhaps enough to
> let someone who knows what they're doing fix it :)
>
> The failure mode is that tids processes do not die, and instead sit
> around chewing 100% CPU - over time you have enough of these to bring
> your IdP to its knees.

So, does this only happen when you are shutting the tids process down? 
(e.g. Ctrl-C?).

> We've bodged round this by having a cron job do
> system tids restart ever 2 hours :(
>
> strace on a spinning tids produces no output (suggesting no system calls
> are being made), gdb (with moonshot-trust-router-dbg installed) always
> looks roughly like:
>
> Attaching to program: /usr/bin/tids, process 2374
> [New LWP 2375]
> [New LWP 2376]
> [New LWP 2377]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> 0x00007fac5b1804c2 in log4shib::Category::getChainedPriority() const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> (gdb) bt
> #0  0x00007fac5b1804c2 in log4shib::Category::getChainedPriority() const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #1  0x00007fac5b17fc79 in log4shib::Category::isPriorityEnabled(int)
> const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #2  0x00007fac5b1809fc in log4shib::Category::info(char const*, ...) ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #3  0x00007fac5d9a63c2 in shibsp::SPConfig::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibsp.so.6
> #4  0x00007fac5d9a6a88 in shibsp::SPInternalConfig::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibsp.so.6
> #5  0x00007fac5dd6fd7e in shibresolver::ShibbolethResolver::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibresolver.so.1
> #6  0x00007fac5e1c9189 in gssEapLocalAttrProviderFinalize ()
>     from /usr/lib/x86_64-linux-gnu/gss/mech_eap.so
> #7  0x00007fac5e1c1174 in ?? () from
> /usr/lib/x86_64-linux-gnu/gss/mech_eap.so
> #8  0x00007fac5f5ebff8 in __run_exit_handlers (status=status@entry=0,
>      listp=0x7fac5f9755f8 <__exit_funcs>,
>      run_list_atexit=run_list_atexit@entry=true) at exit.c:82
> #9  0x00007fac5f5ec045 in __GI_exit (status=status@entry=0) at exit.c:104
> #10 0x00000000004059b7 in tids_accept (tids=0x190e200, listen=<optimized
> out>)
>      at tid/tids.c:485
> #11 0x0000000000405dec in tids_start (tids=tids@entry=0x190e200,
>      req_handler=req_handler@entry=0x403dc0 <tids_req_handler>,
>      auth_handler=auth_handler@entry=0x403d70 <auth_handler>,
>      hostname=<optimized out>, port=port@entry=12309,
>      cookie=cookie@entry=0x1904dc0) at tid/tids.c:546
> #12 0x0000000000403a94 in main (argc=<optimized out>, argv=<optimized out>)
>      at tid/example/tids_main.c:389
>
> tid/tids.c:485 is
>      exit(0); /* exit to kill forked child process */
>
> ...so it appears to be a bug in something's exit handlers?
> getChainedPriority does have a loop in it:
>
>          const Category* c = this;
>          while(c->getPriority() >= Priority::NOTSET) {
>              c = c->getParent();
>          }
>
> ...which makes me wonder if something is being incorrectly initialised,
> but I'm rather clutching at straws here.
>
> Debian/Ubuntu don't ship a log4shib library with debugging symbols
> installed.
>
> I then installed moonshot-gss-eap-dbg and the problem seems much slower
> to recur (to the point that I'd thought it had caused the problem to
> entirely go away); now a bt looks like:
>
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> 0x00007fd74ece84c2 in log4shib::Category::getChainedPriority() const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> (gdb) bt
> #0  0x00007fd74ece84c2 in log4shib::Category::getChainedPriority() const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #1  0x00007fd74ece7c79 in log4shib::Category::isPriorityEnabled(int)
> const ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #2  0x00007fd74ece89fc in log4shib::Category::info(char const*, ...) ()
>     from /usr/lib/x86_64-linux-gnu/liblog4shib.so.1
> #3  0x00007fd75150e3c2 in shibsp::SPConfig::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibsp.so.6
> #4  0x00007fd75150ea88 in shibsp::SPInternalConfig::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibsp.so.6
> #5  0x00007fd7518d7d7e in shibresolver::ShibbolethResolver::term() ()
>     from /usr/lib/x86_64-linux-gnu/libshibresolver.so.1
> #6  0x00007fd751d310f7 in gss_eap_shib_attr_provider::finalize ()
>      at util_shib.cpp:481
> #7  0x00007fd751d31189 in gssEapLocalAttrProviderFinalize (
>      minor=minor@entry=0x7ffeac0d31f4) at util_shib.cpp:550
> #8  0x00007fd751d29174 in (anonymous
> namespace)::finalize_class::~finalize_class (this=<optimized out>,
> __in_chrg=<optimized out>) at util_attr.cpp:101
> #9  0x00007fd753153ff8 in __run_exit_handlers (status=status@entry=0,
>      listp=0x7fd7534dd5f8 <__exit_funcs>,
>      run_list_atexit=run_list_atexit@entry=true) at exit.c:82
> #10 0x00007fd753154045 in __GI_exit (status=status@entry=0) at exit.c:104
> #11 0x00000000004059b7 in tids_accept (tids=0x968200, listen=<optimized
> out>)
>      at tid/tids.c:485
> #12 0x0000000000405dec in tids_start (tids=tids@entry=0x968200,
>      req_handler=req_handler@entry=0x403dc0 <tids_req_handler>,
>      auth_handler=auth_handler@entry=0x403d70 <auth_handler>,
>      hostname=<optimized out>, port=port@entry=12309,
>      cookie=cookie@entry=0x95edc0) at tid/tids.c:546
> #13 0x0000000000403a94 in main (argc=<optimized out>, argv=<optimized out>)
>      at tid/example/tids_main.c:389
>
> This is an Ubuntu Xenial system, but I've seen the runaway-tids problem
> basically since I started looking at the moonshot pilot back at my
> previous job (where we were running Debian).

Which installation instructions have you followed? I failed to find 
official ones for Ubuntu Xenial.
Not saying that is the problem, but wondering whether incorrect 
libraries might be related.

Regards,
Alejandro

>
> Regards,
>
> Matthew
>
>