LHC Computer Grid - Rollout
Fotis Georgatos said:
> During the submissions, I managed to confirm that the RBs have a breaking
> point at around 32-64 concurrent submissions of jobs and a throughput of
> 25 jobs/minute.
Is that throughput the rate at which you can submit jobs, or the rate at
which they run? You can often submit jobs faster than they can be
processed, especially if there are complex input-file matches.
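
To make the distinction concrete, here is a minimal Python sketch, assuming
you have recorded a submit time and a finish time (in seconds) for each job.
The function name and the sample numbers are made up for illustration; they
are not measurements from any RB.

```python
def jobs_per_minute(timestamps):
    """Average event rate in jobs/minute over the span of the timestamps.

    `timestamps` are event times in seconds (submit times or finish times).
    The rate is (number of intervals) / (time spanned), scaled to minutes.
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    span = max(timestamps) - min(timestamps)
    if span <= 0:
        raise ValueError("timestamps must span a positive interval")
    return (len(timestamps) - 1) * 60.0 / span


# Hypothetical data: five jobs submitted 2 s apart, finishing 60 s apart.
submit_times = [0, 2, 4, 6, 8]
finish_times = [60, 120, 180, 240, 300]

submission_rate = jobs_per_minute(submit_times)   # how fast jobs go in
completion_rate = jobs_per_minute(finish_times)   # how fast jobs come out

print(f"submission rate: {submission_rate:.1f} jobs/minute")
print(f"completion rate: {completion_rate:.1f} jobs/minute")
```

With these made-up numbers the two rates differ by a factor of 30, so a
figure like "25 jobs/minute" only means something once you say which of the
two it is.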
> All of these tasks failed with "Job RetryCount (3) hit" error.
That isn't a real error; it means that the system tried to run the job
three times, maybe at three different places, and every attempt failed. You
can see the individual failure reasons by running
edg-job-get-logging-info -v 2 on the job in question. It may also be that
some of the jobs you are counting as OK actually failed somewhere and were
retried successfully. If you want to measure the underlying error rate it
may be worth turning off the retries.
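
As a rough illustration of how much retries can hide, here is a small Python
sketch. It assumes each attempt fails independently with the same
probability, which is a strong assumption on a grid (a job sent back to the
same broken site will probably fail again), and the function name and the
numbers are hypothetical.

```python
def per_attempt_failure_rate(jobs_total, jobs_hit_retry_limit, attempts=3):
    """Estimate the per-attempt failure probability p.

    Assumption: attempts fail independently with the same probability p, so a
    job hits the retry limit with probability p ** attempts. Inverting that
    gives p = (fraction of jobs that hit the limit) ** (1 / attempts).
    """
    if jobs_total <= 0:
        raise ValueError("jobs_total must be positive")
    if not 0 <= jobs_hit_retry_limit <= jobs_total:
        raise ValueError("jobs_hit_retry_limit must be between 0 and jobs_total")
    fraction = jobs_hit_retry_limit / jobs_total
    return fraction ** (1.0 / attempts)


# Hypothetical example: 8 jobs out of 1000 hit the three-attempt limit.
p = per_attempt_failure_rate(1000, 8, attempts=3)
print(f"visible job failure rate:   {8 / 1000:.1%}")
print(f"estimated per-attempt rate: {p:.1%}")
```

Under that assumption, a visible failure rate below 1% would correspond to
roughly one attempt in five failing. A run with retries turned off measures
this directly rather than relying on the independence assumption.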
Stephen