Hi
Ideally the job would report why it did not match, so that the site admin could act on it proactively if they wanted to. That would save some effort in the cases where the match should have worked. However, the current approach, where the user sends the requirements after the event via a GGUS ticket, is generally accepted.
Is there any wider support for Stuart to follow up with his wrapper script suggestion? In case you missed it:
"... it would be good if there was a way to get the WMS to describe the details of the matching process better, for diagnosing these things. It wouldn't be too tricky to write a wrapper around job-list-match, taking each component of the Requirements at a time, and counting the matching resources for each stanza in it, and that might give an idea of where the problem lies. If there's interest, I can see if I can whip something up?"
Jeremy
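
PS. To make Stuart's suggestion above more concrete, here is a rough sketch of how I read it. It is not his script. It splits a Requirements expression at each top-level "&&" and counts the resources matching each clause on its own. The function names, the example expression and the CE names are all made up for illustration. The "match" callable is a stand-in: a real wrapper would presumably write a JDL containing just that one clause, run job-list-match on it and parse the CE list, none of which is attempted here.

```python
from typing import Callable, Dict, List


def split_requirements(expr: str) -> List[str]:
    """Split a Requirements expression at each top-level '&&'.

    Parentheses and double-quoted strings are skipped over so that an
    '&&' inside them does not cause a split. Escaped quotes are NOT
    handled; this is a sketch, not a ClassAd parser.
    """
    clauses: List[str] = []
    depth, in_string, start, i = 0, False, 0, 0
    while i < len(expr):
        ch = expr[i]
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0 and expr.startswith("&&", i):
                clauses.append(expr[start:i].strip())
                i += 2
                start = i
                continue
        i += 1
    clauses.append(expr[start:].strip())
    return [c for c in clauses if c]


def count_matches(expr: str, match: Callable[[str], List[str]]) -> Dict[str, int]:
    """Return {clause: number of resources matching that clause alone}.

    'match' is a hypothetical hook that takes one clause and returns
    the list of matching CE identifiers.
    """
    return {clause: len(match(clause)) for clause in split_requirements(expr)}


if __name__ == "__main__":
    # Made-up expression and a fake matcher, purely to show the shape.
    requirements = (
        'other.GlueCEStateStatus == "Production" && '
        "(other.GlueCEPolicyMaxCPUTime > 1440 || "
        "other.GlueCEPolicyMaxWallClockTime > 2880) && "
        'Member("VO-example-1.0", '
        "other.GlueHostApplicationSoftwareRunTimeEnvironment)"
    )

    def fake_match(clause: str) -> List[str]:
        # Pretend that only the software-tag clause matches nothing.
        return [] if clause.startswith("Member(") else ["ce1.example.org", "ce2.example.org"]

    for clause, n in count_matches(requirements, fake_match).items():
        print(f"{n:4d}  {clause}")
```

A clause that matches zero resources by itself is the first place to look. A job can still fail to match when every clause matches somewhere individually, so this only narrows things down.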
On 19 May 2011, at 11:31, Stephen Burke wrote:
>> Testbed Support for GridPP member institutes, on behalf of Stephen Jones, said:
>> It would be nice to see the list of stuff a job needs, so that I can
>> provide the stuff on our CEs.
>
> This doesn't really seem to be the right way to look at it. There's a large number of CEs in the Grid and a large number of jobs being submitted; there is no reason to expect your CE to match all or even most of them. If you're supposed to provide something specific to a given user community, e.g. some piece of installed software, they should be asking you directly, e.g. via a GGUS ticket. If a user has a problem with a particular job that they expected to go to your site they should be able to give you the JDL as part of the GGUS ticket.
>
> For Nagios tests the situation is similar: the test should provide enough information to let sites figure out what went wrong. But in the vast majority of cases, if you get "no compatible resources" from a Nagios test or a simple test job, it will mean that your CE is missing completely from the BDII, so that's usually the first thing to check.
>
>> Test jobs directed at a site don't arrive due to "no compatible
>> resources". To make them arrive, I need to know why, but the info is
>> not available to me.
>
> Bear in mind that "no compatible resources" isn't per se an error, just a statement. If there are no CEs that happen to match the JDL then that's just the way it is. And the matching is in principle against every site in the Grid, so if you did regard it as an error it would be an error for every site! Conversely a job may happily match and run *somewhere* but still be failing to match your site due to some bug.
>
> The Nagios tests are different, because they are specifically intended to be a test for your site. If you feel that they don't provide enough information, you should feed that back to the test developers.
>
> Stephen