Hi all,
I'm draining one of our CREAM CEs (gLite 3.2).
I use glite_cream_load_monitor --show to see how many pending jobs it
has, and since yesterday the numbers have stopped decreasing:
/opt/glite/bin/glite_cream_load_monitor --show
Threshold for Load Average(1 min): 40 => Detected value for Load Average(1 min): 0.57
Threshold for Load Average(5 min): 40 => Detected value for Load Average(5 min): 0.74
Threshold for Load Average(15 min): 20 => Detected value for Load Average(15 min): 0.91
Threshold for Memory Usage: 95 => Detected value for Memory Usage: 77.29%
Threshold for Swap Usage: 95 => Detected value for Swap Usage: 0.00%
Threshold for Free FD: 500 => Detected value for Free FD: 382296
Threshold for tomcat FD: 800 => Detected value for Tomcat FD: 293
Threshold for FTP Connection: 60 => Detected value for FTP Connection: 1
Threshold for Number of active jobs: -1 => Detected value for Number of active jobs: 98
Threshold for Number of pending commands: -1 => Detected value for Number of pending commands: 106
I've checked the qstat output and no jobs from that CE are still in the
queue, so where are those jobs?
Looking at the JOB_MANAGEMENT table in creamDB, I can see how many
pending commands remain:
select * from JOB_MANAGEMENT;
+---------+-----------+-------------+---------------+----------------+---------------+
| id | commandId | isScheduled | priorityLevel | commandGroupId | executionMode |
+---------+-----------+-------------+---------------+----------------+---------------+
| 205666 | 205666 | 1 | 0 | CREAM780291039 | S |
[...]
| 1048671 | 1048671 | 0 | 0 | CREAM780291039 | S |
| 1049362 | 1049362 | 0 | 0 | CREAM780291039 | S |
+---------+-----------+-------------+---------------+----------------+---------------+
109 rows in set (0.01 sec)
I guess that id in JOB_MANAGEMENT and id in command are related keys,
so I can look up which pending commands are there:
select * from command where id='1022895';
+---------+-----------+----------------+---------------------------------+------------+---------------+-----------------+--------+---------------------+---------------------+------------------------+---------------------+
| id | name | category | description | statusType | failureReason | cmdExecutorName | userId | startSchedulingTime | startProcessingTime | executionCompletedTime | creationTime |
+---------+-----------+----------------+---------------------------------+------------+---------------+-----------------+--------+---------------------+---------------------+------------------------+---------------------+
| 1022895 | JOB_PURGE | JOB_MANAGEMENT | Cancelled by CREAM's job purger | 2 | NULL | BLAHExecutor | ADMIN | 2012-08-13 07:35:05 | NULL | NULL | 2012-08-13 07:35:05 |
+---------+-----------+----------------+---------------------------------+------------+---------------+-----------------+--------+---------------------+---------------------+------------------------+---------------------+
1 row in set (0.00 sec)
All the jobs except 2 have that description, and their dates are between
2012-07-25 and 2012-08-16 (today). There are only 2 distinct
commandGroupId values.
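For reference, a summary like this can be pulled in one query instead of
checking ids one at a time. This is only a sketch: it assumes my guess is
right that id in JOB_MANAGEMENT matches id in command, and the aliases
(pending, oldest, newest) are just names I picked.

```sql
-- Sketch only: assumes JOB_MANAGEMENT.id and command.id identify the same command.
-- Groups the pending commands by type, description, status and command group,
-- and shows how many there are and the date range they cover.
SELECT c.name,
       c.description,
       c.statusType,
       jm.commandGroupId,
       COUNT(*)            AS pending,
       MIN(c.creationTime) AS oldest,
       MAX(c.creationTime) AS newest
FROM JOB_MANAGEMENT jm
JOIN command c ON c.id = jm.id
GROUP BY c.name, c.description, c.statusType, jm.commandGroupId;
```

If the join returns fewer rows than JOB_MANAGEMENT holds, the two id
columns are probably not the matching keys after all.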
So, I have some questions:
Are all those jobs related to a single CREAM ID? (In other words,
is commandGroupId a CREAM ID?)
How can I drain those pending commands?
If commandGroupId is not a CREAM ID, how can I find the CREAM ID for
each job?
TIA,
Arnau