On 17/12/12 17:28, Andrew Washbrook wrote:
> Hi,
>
> We have had some recent CREAM-related issues at our site which I thought I would publicise in case other sites have seen (or will see) similar symptoms:
>
> 1 - https://ggus.eu/tech/ticket_show.php?ticket=89211
> - A memory leak was observed on the host running CREAM software
> - Approximately 2 hours after a service restart (the exact time depends on available host memory), OOM errors appear in the Catalina logs and connections to the CREAM service are refused
> - This is related to the changes in VOMS API Java 2.0.9 which were released in EMI 2.5
> - A fix will be available in the next release
>
> 2 - https://ggus.eu/tech/ticket_show.php?ticket=88284
> - Problem polling qstat/qconf for sites using Gridengine
> - The SGE_ROOT environment variable is not exported correctly to the blah daemons
> - All jobs appear to fail with a "Cancelled by CE admin" message, but in fact the jobs are submitted and terminate successfully on the batch system
> - Savannah bug created: https://savannah.cern.ch/bugs/index.php?99351
> - As an interim patch, put "export SGE_ROOT=path_to_sge" in blah.config
That would appear to be a duplicate of:
https://ggus.eu/ws/ticket_info.php?ticket=79923
As it clearly isn't just me, the documentation needs to be
improved.
>
> Please feel free to get in touch if you would like further information on these issues.
>
On CREAM issues, QMUL seems to be hitting the maximum GridFTP load limits
described in the thread on lcg-rollout:
[LCG-ROLLOUT] Bug in glite-ce-cream-utils-1.2.1-0.sl6.x86_64 ???
See
https://wiki.italiangrid.it/twiki/bin/view/CREAM/SystemAdministratorGuideForEMI2#3_14_Self_limiting_CREAM_behavio
for how to change them.
Chris