On Sat, 20 Aug 2005, [log in to unmask] wrote:
>> A little correction to this: I have replaced qmgr and pbsnodes binaries
>> with my wrappers to catch the commandline and log the output that is
>> passed to the lcg-info-* script by these PBS commands. They seemed perfect
>> even at the times when incorrect values were published.
> You are looking at cached information.
> Let the wrappers also log the time taken by the real commands.
> The algorithm of the generic information provider is as follows:
> if the cached information is older than 20 seconds, the dynamic plug-in
> is run and its output will be used if it comes within 5 seconds,
> otherwise the cached values will be used, unless the file is more than
> 10 minutes old, in which case the static defaults are used.
> The dynamic plug-in may continue in the background to refresh the cached
> information, for up to 10 minutes, after which it will be killed.
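A wrapper that also records the run time could look roughly like the sketch below. The function name timed_run, the log file, and the path in the usage line are made up for illustration, and "date +%s" assumes GNU or BSD date.

```shell
# Hypothetical helper: run the real command and append its run time to a log.
# $1 = log file, remaining arguments = the real command and its arguments
timed_run() {
    logfile=$1
    shift
    start=$(date +%s)
    rc=0
    "$@" || rc=$?          # run the real command, remember its exit status
    end=$(date +%s)
    echo "$(date '+%Y-%m-%d %H:%M:%S') $* took $((end - start))s, exit $rc" >> "$logfile"
    return "$rc"           # pass the exit status through to the caller
}

# Usage (hypothetical paths):
#   timed_run /tmp/pbsnodes-timing.log /usr/bin/pbsnodes.real -a
```

A run time above 5 seconds in that log would mean the provider fell back to the cached values for that query.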
Hello Maarten,
thanks for your reply. In the meantime I have dug into the Perl information
provider scripts and found the information you provided. I have then
tracked the problem down to the fact that from time to time the file
/opt/lcg/var/gip/tmp/lcg-info-dynamic-ce.ldif.<number> gets truncated to
zero length, and the corrupted cached information is therefore
published.
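For reference, the fallback rule quoted above can be sketched as a small shell function. It follows the description only and is not code taken from the provider scripts; the function name and its two arguments are hypothetical.

```shell
# Hypothetical helper: which source would be published, per the quoted rule.
# $1 = age of the cache file in seconds
# $2 = seconds the dynamic plug-in needs to answer
choose_source() {
    cache_age=$1
    plugin_secs=$2
    if [ "$cache_age" -le 20 ]; then
        echo cache      # cache is recent enough, plug-in is not run
    elif [ "$plugin_secs" -le 5 ]; then
        echo plugin     # plug-in answered within the 5 second limit
    elif [ "$cache_age" -le 600 ]; then
        echo cache      # plug-in too slow, cache not older than 10 minutes
    else
        echo static     # plug-in too slow and cache too old
    fi
}

# Usage:
#   choose_source 30 9    # slow plug-in, cache 30 seconds old
```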
I can easily fix this by running lcg-info-wrapper as the edginfo user,
which regenerates the file with the cached information, and correct data
are provided again.
However, I have not yet found out why the file gets truncated. It seems
to get truncated every minute or so:
while true; do ll /opt/lcg/var/gip/tmp/; sleep 5; done
total 8
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:19 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:19 lcg-info-dynamic-software.ldif.7652
total 8
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:19 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:19 lcg-info-dynamic-software.ldif.7652
total 8
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:19 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:19 lcg-info-dynamic-software.ldif.7652
total 4
-rw-r--r-- 1 edginfo edginfo 0 Aug 20 00:20 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:20 lcg-info-dynamic-software.ldif.7652
We can see that it got truncated exactly on the minute boundary. Now
I regenerate it:
./lcg-info-wrapper >/dev/null
and start the watching loop again. It gets truncated at the next minute
boundary again:
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:21 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:21 lcg-info-dynamic-software.ldif.7652
total 8
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:21 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:21 lcg-info-dynamic-software.ldif.7652
total 8
-rw-r--r-- 1 edginfo edginfo 1754 Aug 20 00:21 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:21 lcg-info-dynamic-software.ldif.7652
total 4
-rw-r--r-- 1 edginfo edginfo 0 Aug 20 00:22 lcg-info-dynamic-ce.ldif.3383
-rw-r--r-- 1 edginfo edginfo 843 Aug 20 00:22 lcg-info-dynamic-software.ldif.7652
I have not yet tracked down the reason for this truncation. If you
have any idea, I would very much appreciate it.
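To pin down the exact moment, the watching loop could poll more often and print a timestamp as soon as the file is seen empty. The sketch below is only an illustration: both function names are made up, and the poll count and interval in the usage line are arbitrary.

```shell
# Succeeds if the file exists and has zero length.
is_truncated() {
    [ -e "$1" ] && [ ! -s "$1" ]
}

# Poll file $1 every $2 seconds, at most $3 times. Print a timestamped
# line the first time the file is seen empty, then stop.
watch_truncation() {
    file=$1
    interval=$2
    polls=$3
    i=0
    while [ "$i" -lt "$polls" ]; do
        if is_truncated "$file"; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') truncated: $file"
            return 0
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    return 1    # never seen empty within the given number of polls
}

# Usage:
#   watch_truncation /opt/lcg/var/gip/tmp/lcg-info-dynamic-ce.ldif.3383 1 120
```

The printed time can then be compared with whatever else runs on the machine at that second.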
Thanks,
--
Jiri Kosina
Institute of Physics, Academy of Sciences of the Czech Republic