
The grep:

(of course this example does go one step further...)
/opt/glite/var/log/glite-ce-cream.log:13 Oct 2010 16:13:51,234 INFO  org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor (AbstractJobExecutor.java:826) - (Worker Thread 11) REMOTE_REQUEST_ADDRESS=145.100.5.194; USER_DN=/O=dutchgrid/O=users/O=sara/CN=Maarten Hendrik van Ingen; USER_FQAN={ /pvier/Role=NULL/Capability=NULL; /pvier/infra/Role=NULL/Capability=NULL; }; CMD_NAME=JOB_START; CMD_CATEGORY=JOB_MANAGEMENT; CMD_STATUS=PROCESSING; commandName=JOB_START; cmdExecutorName=BLAHExecutor; userId=_O_dutchgrid_O_users_O_sara_CN_Maarten_Hendrik_van_Ingen_pvier_Role_NULL_Capability_NULL; jobId=CREAM598295305; status=PROCESSING;
/opt/glite/var/log/glite-ce-cream.log:13 Oct 2010 16:13:51,287 INFO  org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor (AbstractJobExecutor.java:2094) - (Worker Thread 11) JOB CREAM598295305 STATUS CHANGED: REGISTERED => PENDING [localUser=pvi032] [delegationId=ce2ca4874b98dd5f6b55c9e6b3b4a4a1f852d36c]
/opt/glite/var/log/glite-ce-cream.log.1:13 Oct 2010 15:47:27,553 INFO  org.glite.ce.cream.jobmanagement.db.table.JobTable (JobTable.java:232) - (http-8443-Processor19) Job inserted. JobId = CREAM598295305
/opt/glite/var/log/glite-ce-cream.log.1:13 Oct 2010 15:47:27,661 INFO  org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor (AbstractJobExecutor.java:2094) - (http-8443-Processor19) JOB CREAM598295305 STATUS CHANGED: -- => REGISTERED [localUser=pvi032] [delegationId=ce2ca4874b98dd5f6b55c9e6b3b4a4a1f852d36c]
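To put a number on how long a job sits in one state, the "STATUS CHANGED" entries from the grep can be parsed with a small script. This is only a rough sketch, not a CREAM tool: it assumes the grep output has one log entry per line (as in the log files themselves, not as wrapped in this mail), that it has already been filtered to a single job, and that month names are in English. The names `transitions` and `seconds_in_state` are made up for this example.

```python
import re
from datetime import datetime

# Timestamp plus the "JOB <id> STATUS CHANGED: <old> => <new>" part of one entry.
TRANSITION = re.compile(
    r"(?P<ts>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}),\d{3}.*?"
    r"JOB\s+(?P<job>CREAM\d+)\s+STATUS\s+CHANGED:\s+(?P<old>\S+)\s+=>\s+(?P<new>\S+)"
)


def transitions(lines):
    """Return (timestamp, job, old_state, new_state) tuples sorted by time."""
    found = []
    for line in lines:
        m = TRANSITION.search(line)
        if m:
            # Assumption: English month abbreviations ("Oct"), i.e. C/English locale.
            ts = datetime.strptime(m.group("ts"), "%d %b %Y %H:%M:%S")
            found.append((ts, m.group("job"), m.group("old"), m.group("new")))
    return sorted(found)


def seconds_in_state(lines, state):
    """Seconds between entering and leaving `state`; None if either is missing.

    Assumption: `lines` were already filtered to one job id by grep.
    """
    entered = left = None
    for ts, _job, old, new in transitions(lines):
        if new == state:
            entered = ts
        elif old == state:
            left = ts
    if entered is None or left is None:
        return None
    return (left - entered).total_seconds()
```

Fed the two STATUS CHANGED entries above, `seconds_in_state(lines, "REGISTERED")` gives 1584.0, i.e. the job spent a bit over 26 minutes in REGISTERED before going to PENDING.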



/opt/glite/bin/glite_cream_load_monitor --show:


Threshold for Load Average(1 min): 40 => Detected value for Load Average(1 min):  1.06
Threshold for Load Average(5 min): 40 => Detected value for Load Average(5 min):  0.97
Threshold for Load Average(15 min): 20 => Detected value for Load Average(15 min):  0.69
Threshold for Memory Usage: 95 => Detected value for Memory Usage: 17.57%
Threshold for Swap Usage: 95 => Detected value for Swap Usage: 0.00%
Threshold for Free FD: 500 => Detected value for Free FD: 2386973
Threshold for tomcat FD: 800 => Detected value for Tomcat FD: 269
Threshold for FTP Connection: 30 => Detected value for FTP Connection: 1
Threshold for Number of active jobs: -1 => Detected value for Number of active jobs: 5866
Threshold for Number of pending commands: -1 => Detected value for Number of pending commands: 431
Threshold for Disk Usage: 95% => Detected value for Partition / : 35%
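In case it helps to watch the pending-command count over time, the "Threshold ... => Detected value ..." lines can be turned into numbers with a few lines of Python. Again only a sketch: it assumes one reading per line (unwrapped), the function names are invented here, and treating a threshold of -1 as "no limit configured" is my guess, not something I have checked in the CREAM documentation.

```python
import re

# One "Threshold for <name>: <limit> => Detected value for <...>: <value>" row.
ROW = re.compile(
    r"Threshold for (?P<name>.+?):\s*(?P<limit>-?[\d.]+)%?\s*=>\s*"
    r"Detected value for .+?:\s*(?P<value>-?[\d.]+)%?\s*$"
)


def parse_load_monitor(lines):
    """Map each threshold name to a (threshold, detected_value) pair of floats."""
    readings = {}
    for line in lines:
        m = ROW.search(line.strip())
        if m:
            readings[m.group("name")] = (
                float(m.group("limit")),
                float(m.group("value")),
            )
    return readings


def without_limit(readings):
    """Names whose threshold is -1 (assumed here to mean 'no limit set')."""
    return sorted(name for name, (limit, _value) in readings.items() if limit == -1)
```

For the output above, `parse_load_monitor(lines)["Number of pending commands"]` is `(-1.0, 431.0)`. I deliberately do not compare value against threshold in the sketch, since for a reading like Free FD the threshold looks like a minimum rather than a maximum.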


SQL:

mysql> select c.name, c.creationTime from JOB_MANAGEMENT jm, command c where 
    -> jm.commandId =c.id order by c.creationTime limit 20;
+----------------+---------------------+
| name           | creationTime        |
+----------------+---------------------+
| SET_JOB_STATUS | 2010-10-13 14:33:02 | 
| SET_JOB_STATUS | 2010-10-13 14:33:09 | 
| SET_JOB_STATUS | 2010-10-13 14:33:59 | 
| SET_JOB_STATUS | 2010-10-13 14:34:00 | 
| SET_JOB_STATUS | 2010-10-13 14:35:54 | 
| SET_JOB_STATUS | 2010-10-13 14:37:11 | 
| SET_JOB_STATUS | 2010-10-13 14:39:57 | 
| PROXY_RENEW    | 2010-10-13 14:40:41 | 
| SET_JOB_STATUS | 2010-10-13 14:45:10 | 
| SET_JOB_STATUS | 2010-10-13 14:46:11 | 
| SET_JOB_STATUS | 2010-10-13 14:46:13 | 
| SET_JOB_STATUS | 2010-10-13 14:46:13 | 
| SET_JOB_STATUS | 2010-10-13 14:46:14 | 
| SET_JOB_STATUS | 2010-10-13 14:46:17 | 
| SET_JOB_STATUS | 2010-10-13 14:48:11 | 
| SET_JOB_STATUS | 2010-10-13 14:48:11 | 
| SET_JOB_STATUS | 2010-10-13 14:49:15 | 
| SET_JOB_STATUS | 2010-10-13 14:49:16 | 
| SET_JOB_STATUS | 2010-10-13 14:50:19 | 
| JOB_START      | 2010-10-13 14:53:37 | 
+----------------+---------------------+
20 rows in set (0.00 sec)


mysql> select c.name, count(c.name) from JOB_MANAGEMENT jm, command c where
    -> jm.commandId =c.id group by c.name;
+---------------------------+---------------+
| name                      | count(c.name) |
+---------------------------+---------------+
| COPY_NEW_PROXY_TO_SANDBOX |             3 | 
| JOB_PURGE                 |            61 | 
| JOB_START                 |           190 | 
| PROXY_RENEW               |           539 | 
| SET_JOB_STATUS            |           195 | 
+---------------------------+---------------+
5 rows in set (0.00 sec)
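For anyone who wants to try the per-command count without touching a live creamdb, here is the same query written with an explicit JOIN and run against a throwaway in-memory SQLite database. This is a stand-in, not the real schema: only the table and column names that appear in the queries above (JOB_MANAGEMENT.commandId, command.id, command.name, command.creationTime) are taken from them, the demo rows are invented, and `count_by_command` and `demo_db` are names made up for the example.

```python
import sqlite3


def count_by_command(conn):
    """Count queued commands per name (same join as the mysql query above)."""
    rows = conn.execute(
        "SELECT c.name, COUNT(*) FROM JOB_MANAGEMENT jm "
        "JOIN command c ON jm.commandId = c.id "
        "GROUP BY c.name ORDER BY c.name"
    )
    return dict(rows.fetchall())


def demo_db():
    """Build a tiny in-memory stand-in with invented rows (not real creamdb data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE command (id INTEGER PRIMARY KEY, name TEXT, creationTime TEXT)"
    )
    conn.execute("CREATE TABLE JOB_MANAGEMENT (commandId INTEGER)")
    names = ["SET_JOB_STATUS", "SET_JOB_STATUS", "PROXY_RENEW", "JOB_START"]
    for i, name in enumerate(names, start=1):
        conn.execute(
            "INSERT INTO command VALUES (?, ?, ?)", (i, name, "2010-10-13 14:33:02")
        )
        conn.execute("INSERT INTO JOB_MANAGEMENT VALUES (?)", (i,))
    # A command with no JOB_MANAGEMENT row: the join leaves it out of the counts.
    conn.execute(
        "INSERT INTO command VALUES (?, ?, ?)", (99, "JOB_PURGE", "2010-10-13 14:40:00")
    )
    return conn
```

Calling `count_by_command(demo_db())` returns a dict of name to count for the invented rows; only commands that have a matching JOB_MANAGEMENT row are counted.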


Cheers,
Maarten

On Wednesday 13 October 2010 16:14:49 Massimo Sgaravatto - INFN Padova wrote:
> What does:
> 
> grep -i 598295305 /opt/glite/var/log/glite-ce-cream.log*
> 
> report ?
> 
> Can you please issue this command on the CREAM CE as user tomcat:
> 
> /opt/glite/bin/glite_cream_load_monitor --show
> 
> ?
> 
> Is there a huge number of "Detected value for Number of pending
> commands" ?
> If so, can you please issue these mysql commands ?
> 
> use creamdb;
> select c.name, c.creationTime from JOB_MANAGEMENT jm, command c where
> jm.commandId =c.id order by c.creationTime limit 20;
> 
> select c.name, count(c.name) from JOB_MANAGEMENT jm, command c where
> jm.commandId =c.id group by c.name;
> 
>  			Cheers, Massimo
> 
> On Wed, 13 Oct 2010, Maarten van Ingen wrote:
> > Hi,
> > 
> > One of our CREAM CEs keeps jobs in the REGISTERED state, and many never
> > leave it.
> > Some do get through eventually, but that can take hours.
> > 
> > For example this job:
> > maarten$ glite-ce-job-submit -a -r creamce.gina.sara.nl:8443/cream-pbs-infra ./gina
> > 2010-10-13 15:47:25,246 WARN - No configuration file suitable for loading. Using built-in configuration
> > https://creamce.gina.sara.nl:8443/CREAM598295305
> > 
> > 
> > 
> > maarten$ glite-ce-job-status https://creamce.gina.sara.nl:8443/CREAM598295305
> > 2010-10-13 15:49:12,791 WARN - No configuration file suitable for loading. Using built-in configuration
> > 
> > ******  JobID=[https://creamce.gina.sara.nl:8443/CREAM598295305]
> > 
> >        Status        = [REGISTERED]
> > 
> > When I look in the log, all I can find is this:
> > root# grep 598295305 glite-ce-cream.log
> > 13 Oct 2010 15:47:27,553 INFO
> > org.glite.ce.cream.jobmanagement.db.table.JobTable (JobTable.java:232) -
> > (http-8443-Processor19) Job inserted. JobId = CREAM598295305
> > 13 Oct 2010 15:47:27,661 INFO
> > org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor
> > (AbstractJobExecutor.java:2094) - (http-8443-Processor19) JOB
> > CREAM598295305 STATUS CHANGED: -- => REGISTERED [localUser=pvi032]
> > [delegationId=ce2ca4874b98dd5f6b55c9e6b3b4a4a1f852d36c]
> > 
> > 
> > The JDL used is the same one I use to submit to a WMS (hence the
> > "Requirements" part):
> > 
> > Executable = "/bin/env";
> > Arguments = "| /bin/mail -s $(hostname) [log in to unmask]";
> > Stdoutput = "message.txt";
> > StdError = "stderror";
> > Requirements = other.GlueCEUniqueID == "creamce.gina.sara.nl:8443/cream-pbs-infra";
> > RetryCount=0;
> > ShallowRetryCount=0;
> > 
> > 
> > Also, when I use bogus information for the requested queue, the job stays
> > in the REGISTERED state:
> > 
> > maarten$ glite-ce-job-submit -a -r creamce.gina.sara.nl:8443/cream-pbs-thisisbogus ./gina
> > 2010-10-13 15:57:55,017 WARN - No configuration file suitable for loading. Using built-in configuration
> > https://creamce.gina.sara.nl:8443/CREAM392820764
> > 
> > maarten$ glite-ce-job-status https://creamce.gina.sara.nl:8443/CREAM392820764
> > 2010-10-13 15:58:08,130 WARN - No configuration file suitable for loading. Using built-in configuration
> > 
> > ******  JobID=[https://creamce.gina.sara.nl:8443/CREAM392820764]
> > 
> >        Status        = [REGISTERED]
> > 
> > Does anyone have an idea what's going on?
> > I have the feeling this is something small I am overlooking :-) but it
> > keeps me busy.
> > 
> > Cheers,
> > Maarten
> > 
> > 
> > SARA Computing and Networking Services
> > PO Box 94613
> > 1090 GP Amsterdam, Netherlands
> > 
> > Tel: +31 (0)20 592 3000
> > Fax: +31 (0)20 668 3167
> 
>                     \|||/
> -----------0oo----( o o )----oo0-------------------
>                      (_)
> INFN Sezione di Padova
> Via Marzolo, 8
> 35131 Padova - Italy    E-mail: massimo.sgaravatto [at] pd.infn.it
> Tel: ++39 0498275908    Skype: massimo.sgaravatto
> Fax: ++39 0498275952

-- 
ing. M.H. van Ingen, HPC&V Systems Programmer

SARA Computing and Networking Services
PO Box 94613
1090 GP Amsterdam, Netherlands

Tel: +31 (0)20 592 3000
Fax: +31 (0)20 668 3167