Emyr,
(Note: do you run SGE and CREAM on the same server?)
The SGE_* environment variables (SGE_ROOT etc.) come from GridEngine.
Usually they are set by some script that you source before running the
grid engine tools. We don't know for sure whether they are set properly
at the time you run YAIM, so it might be worth checking:
# echo $SGE_ROOT
Find out what SGE_ROOT is set to, then (in site-info.def) set
BATCH_LOG_DIR=/abc/def/default/spool
where /abc/def stands for whatever value SGE_ROOT has.
I.e. hardcode the SGE_ROOT, just in case.
Also, make sure logs actually show up in BATCH_LOG_DIR, and check
/etc/glite-apel-pbs/parser-config-yaim.xml so that it says something
like this (the path below is from my PBS setup):
<Logs searchSubDirs="yes" reprocess="no">
<Dir>/var/torque/server_priv/accounting</Dir>
</Logs>
So, in short: hardcode BATCH_LOG_DIR, check that the directory actually
has logs and that they update when work gets done, check that APEL is
pointed at that dir, and check that it has permission to see and open
the files.
If all that is done, then it has no excuse not to work!!!
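As a rough sketch of the directory checks above: the function name
check_batch_log_dir and the 24 hour default are made up for illustration
and are not part of YAIM, APEL or SGE. It only looks at the directory
itself, not at the APEL config. Ideally run it as the user the APEL
parser runs as, since root can read everything.

```shell
#!/bin/sh
# Hypothetical helper, not part of YAIM, APEL or SGE.
# Usage: check_batch_log_dir DIR [MINUTES]
check_batch_log_dir() {
    dir=$1
    minutes=${2:-1440}   # assumption: "recent" means the last 24 hours

    # 1. Does the directory exist at all (e.g. is the NFS share mounted)?
    if [ ! -d "$dir" ]; then
        echo "MISSING: $dir is not a directory"
        return 1
    fi

    # 2. Can the current user list and enter it?
    if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        echo "DENIED: cannot read or enter $dir as $(id -un)"
        return 2
    fi

    # 3. Has any regular file under it been modified within the window?
    recent=$(find "$dir" -type f -mmin "-$minutes" 2>/dev/null | wc -l | tr -d ' ')
    if [ "$recent" -eq 0 ]; then
        echo "STALE: no files in $dir changed in the last $minutes minutes"
        return 3
    fi

    echo "OK: $recent file(s) in $dir changed in the last $minutes minutes"
    return 0
}
```

For example, check_batch_log_dir /abc/def/default/spool (using the same
path you put in BATCH_LOG_DIR). Run it once, let a job finish, then run
it again with a small window such as 10 to see whether the logs update.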
Good luck,
Steve
On 10/30/2012 10:36 AM, emyr.james wrote:
> Hi,
>
> Thanks for the info Steve. I think the problem is the setting for
> BATCH_LOG_DIR.
> I have it set to...
>
> $SGE_ROOT/default/spool
>
> ...Is this what other SGE users have it set to ? If not, where do you
> point BATCH_LOG_DIR at for SGE ?
>
> Regards,
>
> Emyr
>
> On 30/10/12 09:29, Stephen Jones wrote:
>> Hi,
>>
>> The key difference between my output and yours is this:
>>
>> --- mine -
>>
>> > apel-pbs-log-parser - **** Updating PBS end event table (EventRecords) ****
>> > apel-pbs-log-parser - Processing batch log file: hepgrid5.ph.liv.ac.uk /var/torque/server_priv/accounting/20121029
>> > apel-pbs-log-parser - Event records inserted: 2566
>>
>> --- yours -
>>
>> apel-sge-log-parser - **** Updating SGE end event table (EventRecords) ****
>> apel-sge-log-parser - Event records inserted: 0
>>
>> ----------
>>
>> Basically, I think John had it right - APEL found no accounting records.
>> I think blahd records are data that describe submissions and job
>> transitions from one state to another as they pass through the batch
>> system, while event records are data that describe actual work done.
>>
>> Obviously, they must tally, in due course.
>>
>> But in your system, the APEL in your CREAM CE is not finding any SGE
>> accounting records (that's the theory, anyway).
>>
>> I don't know if (A) CREAM runs on the SGE head node or (B) you
>> have different servers for CREAM and SGE.
>>
>> If B, you need to NFS share (export) the account data from SGE
>> to CE somehow. If A, you just have to tell APEL where the
>> accounting data is.
>>
>> On my CE (which uses option B with PBS), this is done in
>> /etc/glite-apel-pbs/parser-config-yaim.xml
>> i.e.
>> <Logs searchSubDirs="yes" reprocess="no">
>> <Dir>/var/torque/server_priv/accounting</Dir>
>> </Logs>
>>
>> Obviously, permissions must be right, the directory must be
>> available (whether in option A or B). Check all that, then
>> see if you get any Event Records.
>>
>> If so, that part of the data flow is correct and we can move on
>> to the next tests.
>>
>>
>> Steve
--
Steve Jones
System Administrator             office: 220
High Energy Physics Division     tel (int): 42334
Oliver Lodge Laboratory          tel (ext): +44 (0)151 794 2334
University of Liverpool          http://www.liv.ac.uk/physics/hep/