On 08/21/2012 03:54 PM, Govind Songara wrote:
Hi Lisa,

Please have a look at the tomcat log file:
http://www.pp.rhul.ac.uk/~gsongara/cream2/catalina.out

I am using EMI-1.

There is no 1.14 available.

Sorry, I mean EMI-2.

Cheers,
Lisa

Installed Packages
glite-ce-cream.noarch                    1.13.4-1.sl5                     installed   
Available Packages
glite-ce-cream.noarch                    1.13.5-1.sl5                     EMI-1-updates

Cheers
Govind



On Tue, Aug 21, 2012 at 2:48 PM, Lisa Zangrando <[log in to unmask]> wrote:
Hi Govind,

Unfortunately, the exception related to the FillUpQueueThread is not in the provided log file.

Just a question: are you using CREAM version 1.13? If so, I suggest you upgrade it to 1.14.

Thanks a lot.
Cheers,
Lisa


On 08/21/2012 03:31 PM, Govind Songara wrote:
Hi,

My CE is still failing tests.
There are some messages in the tomcat log:
- Initializing VOMS certificate store from directory: /etc/grid-security/vomsdir
- VOMS store initialized
AbandonedObjectPool is used (org.apache.commons.dbcp.AbandonedObjectPool@44c6f734)
   LogAbandoned: false
   RemoveAbandoned: true
   RemoveAbandonedTimeout: 30
Exception in thread "FillUpQueueThread" java.lang.NullPointerException
        at org.glite.ce.cream.cmdmanagement.queue.db.CommandQueueDBManager$FillUpQueueThread.run(CommandQueueDBManager.java:243)
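
To check whether a given log file actually contains this exception, a search along these lines may help. It is only a sketch: `find_thread_exceptions` is an illustrative helper name, not part of CREAM or tomcat, and it assumes GNU grep.

```shell
# Print every "Exception in thread" line with its line number, plus up to
# five following lines (usually the start of the stack trace).
# Returns non-zero when the log contains no such line.
find_thread_exceptions() {
    grep -n -A 5 -e 'Exception in thread' -- "$1"
}
```

For example, `find_thread_exceptions catalina.out` would be run against a local copy of the tomcat log.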

Here is the cream log, in case it helps to find the problem:
http://www.pp.rhul.ac.uk/~gsongara/cream2/glite-ce-cream.log

Thanks
Govind


On Mon, Aug 20, 2012 at 4:10 PM, Govind Songara <[log in to unmask]> wrote:
I think this morning I screwed up the permissions while rsyncing the gridmapdir and software tags dir from the old CE. That is now reverted and the permissions are fixed; I will plan it for some time later.

There are no errors in the cream log as far as I can see,
but nagios is still failing the test with this error:

Logged Reason(s):
- Transfer to CREAM failed due to exception: CREAM Register raised std::exception Connection to service [https://ce3.ppgrid1.rhul.ac.uk:8443/ce-cream/services/CREAM2] failed: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]
- Transfer to CREAM failed due to exception: CREAM Register raised std::exception Connection to service [https://ce3.ppgrid1.rhul.ac.uk:8443/ce-cream/services/CREAM2] failed: FaultString=[connection error] - FaultCode=[SOAP-ENV:Client] - FaultSubCode=[SOAP-ENV:Client] - FaultDetail=[Connection refused]
Status Reason: hit job shallow retry count (1)


CRITICAL: [REGISTERED->Cancelled/Purged [timeout/dropped]]
CRITICAL: [REGISTERED->Cancelled/Purged [timeout/dropped]]
2 min timeout in 'REGISTERED' exceeded. Cancelling the job.
>>> Job's logging info:
glite-ce-job-status -L 1 https://ce3.ppgrid1.rhul.ac.uk:8443/CREAM711472699

>>> Discard the job: Cancel/Purge from CE, delete from UI.


Googling old emails shows there was an issue where the UI/WMS rejects jobs; I am not sure if it is related to that.
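
"Connection refused" usually means nothing is accepting connections on that port, so one quick check is whether port 8443 on the CE can be opened at all. A minimal sketch, assuming bash and a coreutils that ships `timeout` (older releases may not have it); `port_open` is an illustrative name:

```shell
# Return 0 if a TCP connection to host ($1) and port ($2) can be opened
# within 5 seconds, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Usage would be something like `port_open ce3.ppgrid1.rhul.ac.uk 8443 && echo open || echo closed`, run from the nagios/UI host.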




On Mon, Aug 20, 2012 at 3:30 PM, Stephen Jones <[log in to unmask]> wrote:
On 08/20/2012 03:06 PM, Govind Songara wrote:

RE: CREAM


Hi Lisa,

It is
drwxrwx--- 5 root edguser 196608 Aug 20 12:24 /etc/grid-security/

I have corrected the permissions. Does the ownership also need to be changed from root:edguser to root:root?

Perhaps not - you have set it so any user can access that dir.  But on my EMI2 cream, both dir and file are root:root.

[root@hepgrid5 grid-security]# ls -lrtd admin-list
-rw-r--r-- 1 root root 154 Aug 20 14:39 admin-list
[root@hepgrid5 grid-security]# ls -lrtd .
drwxr-xr-x 5 root root 4096 Aug 20 14:39 .
[root@hepgrid5 grid-security]#

An easy way to check is to make tomcat loginable (is that a word?) and check that you can su to it and read the file.
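
To see which directory on the way to the file might be blocking tomcat, it can help to list the permissions of every path component. A rough sketch, with `show_path_perms` as an illustrative helper name:

```shell
# Print owner, group and mode for the given path and each of its parent
# directories, so a component the tomcat user cannot enter stands out.
show_path_perms() {
    p=$1
    while :; do
        ls -ld -- "$p"
        # Stop at the filesystem root, or at "." for a relative path.
        case $p in /|.) break ;; esac
        p=$(dirname -- "$p")
    done
}
```

For example, `show_path_perms /etc/grid-security/admin-list`. Assuming the service account is really named tomcat, `su -s /bin/sh tomcat -c 'cat /etc/grid-security/admin-list'` run as root should do the read test without changing the account's login shell.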

RE: SQL


20 Aug 2012 14:56:05,195 ERROR org.glite.ce.creamapi.jobmanagement.cmdexecutor.JobStatusEventManager (JobStatusEventManager.java:100) - (TP-Processor25) Error retrieving events fom database: Can't create/write to file '/tmp/#sql_e28_0.MYI' (Errcode: 13)
20 Aug 2012 14:56:05,195 ERROR org.glite.ce.cream.cmdmanagement.CommandManager (CommandManager.java:383) - (TP-Processor25) CommandManager insertCommand: QUERY_EVENT error Error retrieving events fom database: Can't create/write to file '/tmp/#sql_e28_0.MYI' (Errcode: 13)
I can write to tmp and there is enough space (14 GB free).


Dunno. Never seen it.  By the look and feel of this (I'm more of an Oracle man) I'd suggest that the sql_e28_0.MYI file is a "temporary sort run" from a SQL statement.
You can write to tmp and there is enough space, but can the owner of the database write to tmp? Are the permissions of /tmp root:root 777?

This fella sees the same thing: http://xenforo.com/community/threads/cant-create-write-to-file-errcode-13.29469/

And this guy talks about the sticky bit: http://wordpress.org/support/topic/infamous-cant-createwrite-to-file-errcode13-problem
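
Errcode 13 is the OS "Permission denied" error, so the mode of /tmp is worth checking directly. On Linux, /tmp is normally mode 1777 (world-writable plus the sticky bit). A small sketch, assuming GNU stat; `has_tmp_mode` is an illustrative name:

```shell
# Succeed only when the directory has mode 1777
# (rwx for everyone plus the sticky bit), the usual setting for /tmp.
has_tmp_mode() {
    [ "$(stat -c '%a' -- "$1")" = "1777" ]
}
```

For example, `has_tmp_mode /tmp || ls -ld /tmp`. If the mode looks right, a write test as the database account may narrow it down further, e.g. `su -s /bin/sh mysql -c 'touch /tmp/mysql-write-test'` as root, assuming that account is named mysql.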

Steve






--
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/