Dear Maarten,

Maarten Litmaath wrote:
> Hallo Christoph,
>
>> It seems that the WMS recovered itself (being in
>> drain mode) over the weekend. The WMS is full of Condor jobs in state
>> "H" (hold). Do they harm? Some are weeks old already.
>
> Normally held jobs do not harm, but the latest WMS version has an issue
> for which the admin may need to intervene occasionally:
>
> https://savannah.cern.ch/bugs/?69841

This looks rather similar to the picture that we saw on Friday
afternoon. Actually, I also removed a lot of held jobs. Perhaps that
did the trick to recover the WMS.

> A cleanup cron job for held jobs is included in this bug:
>
> https://savannah.cern.ch/bugs/?70401
>
> The grace period of 1 week probably should be lowered to 1 day,
> or even just a few hours...

We will try the cron job.

>> Another question, perhaps someone knows the answer. Trying to get some
>> understanding of the flow of a job through the WMS, I tried to follow a
>> job that goes to a CREAM-CE. Are those jobs supposed to show up in the
>> list of jobs listed with condor_q?
>
> No. On a WMS the jobs for CREAM are handled by ICE, while jobs sent to
> LCG-CE or ARC-CE instances are handled by Condor-G:
>
> https://twiki.cern.ch/twiki/bin/view/EGEE/EGEEgLiteJobSubmissionSchema
>
> To see ICE details one can use /opt/glite/bin/queryDb on the WMS.
> The "-h" option shows how.

Thanks for the hints. The picture is rather busy, but it is clear once
you know that ICE does not deal with Condor-G internally. (I did not
know that before...)

Best wishes,
Christoph