Hi,
I have another issue to report regarding CREAM on EMI-2 with SGE:
the cluster status is stored in /etc/lrms/cluster.state.
But changing it, for example to "draining", has no effect on the output of
ldapsearch run from a remote host:
ldapsearch -x -LLL -h cream-ge-3-kit:2170 -b o=grid | grep Status
GlueCEStateStatus: Production
GlueCEStateStatus: Production
GlueServiceStatus: OK
...
But on the host itself:
[root@cream-ge-3-kit ~]# /var/lib/bdii/gip/plugin/glite-info-dynamic-ce | grep Status
Use of uninitialized value in printf at /usr/libexec/glite-info-dynamic-sge line 428.
GlueCEStateStatus: Draining
GlueCEStateStatus: Draining
So it seems the .ldif files served via LDAP are not being updated properly. Why?
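
To make this kind of mismatch easier to spot, the plugin output and the
ldapsearch output can be compared per DN. The following is only a rough
sketch, not part of any gLite or BDII tool: the function names parse_ldif and
diff_entries are my own, and the DN in the sample strings (queue name
"cream-sge-example") is made up for illustration. It unfolds wrapped LDIF
lines, treats DNs case-insensitively, and reports attributes that both sides
publish with different values.

```python
from collections import defaultdict


def parse_ldif(text):
    """Parse LDIF-style text into {lowercased dn: {attribute: [values]}}."""
    # Unfold LDIF continuation lines: a line starting with a single space
    # continues the previous line (ldapsearch wraps long DNs this way).
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines and lines[-1]:
            lines[-1] += raw[1:]
        else:
            lines.append(raw.rstrip())

    entries = {}
    current = None
    for line in lines:
        if not line:
            current = None  # a blank line ends the current entry
            continue
        if ": " not in line:
            continue  # skip warnings and other non-attribute lines
        key, value = line.split(": ", 1)
        if key.lower() == "dn":
            current = entries.setdefault(value.lower(), defaultdict(list))
        elif current is not None:
            current[key].append(value)
    return entries


def diff_entries(plugin, published):
    """Yield (dn, attribute, plugin_values, published_values) on mismatch.

    Only attributes present on both sides are compared, because the dynamic
    plugin prints just a subset of what ends up in LDAP.
    """
    for dn, attrs in plugin.items():
        if dn not in published:
            continue
        for attr, values in attrs.items():
            other = published[dn].get(attr)
            if other is not None and sorted(other) != sorted(values):
                yield dn, attr, values, other


# Hypothetical sample data modelled on the outputs shown above.
PLUGIN_SAMPLE = (
    "dn: GlueCEUniqueID=cream-ge-3-kit:8443/cream-sge-example,mds-vo-name=resource,o=grid\n"
    "GlueCEStateStatus: Draining\n"
)
LDAP_SAMPLE = (
    "dn: GlueCEUniqueID=cream-ge-3-kit:8443/cream-sge-example,Mds-Vo-name=resource,o=g\n"
    " rid\n"
    "GlueCEStateStatus: Production\n"
)

if __name__ == "__main__":
    mismatches = diff_entries(parse_ldif(PLUGIN_SAMPLE), parse_ldif(LDAP_SAMPLE))
    for dn, attr, ours, theirs in mismatches:
        print(f"{dn}: {attr} plugin={ours} ldap={theirs}")
```

In practice the two strings would be replaced by the captured output of the
plugin and of ldapsearch; the samples are there only so the sketch runs on its
own.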
Regards
On 11/07/2012 04:53 PM, Jeff Templon wrote:
> Hi
>
> for the PBS batch system, the output of the two plugins is synchronized ... glite-info-dynamic-ce does not print any information that overlaps with lcg-info-dynamic-scheduler. This may not be the case for SGE ... I know at some point the SGE folk had modified the lcg-info-dynamic-scheduler program -- which should never be done, the batch system specific stuff should all be handled by the lrms plugin script, not in the (generic) lcg-info-dynamic-scheduler script.
>
> It may have been fixed now (using the standard dynamic scheduler) but you should check.
>
> If both the dynamic-ce and dynamic-scheduler are printing values for the same attribute, I think it's not predictable which value you will get.
>
> JT
>
> On Nov 7, 2012, at 16:25 , Andras Hazi wrote:
>
>> Hi Massimo,
>>
>> The total number of CPUs and the other values seem to be OK now, except
>> the 'free' column.
>> When I run the plugins in /var/lib/bdii/gip/plugin by hand
>>
>> (for example: /sbin/runuser -s /bin/sh ldap -c "/var/lib/bdii/gip/plugin/glite-info-dynamic-ce")
>>
>> I have the right data in the output, but in the ldapsearch output there
>> are wrong numbers for free slots (GlueCEStateFreeJobSlots):
>>
>> manual run:
>> dn: GlueCEUniqueID=grid106.kfki.hu:8443/cream-sge-cms,mds-vo-name=resource,o=grid
>> GlueCEInfoLRMSVersion: unknown
>> GlueCEPolicyAssignedJobSlots: 380
>> GlueCEPolicyMaxRunningJobs: 380
>> GlueCEInfoTotalCPUs: 380
>> GlueCEStateFreeJobSlots: 253
>> GlueCEStateFreeCPUs: 253
>> GlueCEPolicyMaxCPUTime: 4320
>> GlueCEPolicyMaxWallClockTime: 4320
>> GlueCEStateStatus: Production
>>
>> ldapsearch:
>> dn: GlueCEUniqueID=grid106.kfki.hu:8443/cream-sge-cms,Mds-Vo-name=resource,o=grid
>> GlueCEStateStatus: Production
>> GlueCEStateFreeCPUs: 0
>> GlueCEPolicyPriority: 1
>> GlueCEInfoJobManager: sge
>> GlueCEInfoHostName: grid106.kfki.hu
>> GlueCEUniqueID: grid106.kfki.hu:8443/cream-sge-cms
>> GlueCEStateFreeJobSlots: 0
>> GlueForeignKey: GlueClusterUniqueID=grid106.kfki.hu
>>
>> In the BDII log there is no error or warning.
>> But in /var/lib/bdii/modify.err I found this:
>>
>> request done: ld 0xa387620 msgid 1
>> request done: ld 0xa387620 msgid 2
>> request done: ld 0xa387620 msgid 3
>> request done: ld 0xa387620 msgid 4
>> request done: ld 0xa387620 msgid 5
>> request done: ld 0xa387620 msgid 6
>> request done: ld 0xa387620 msgid 7
>>
>>
>> What could be the problem?
>>
>> Best Regards,
>> Andras
>>
>> On 2012-10-12 at 10:14, Massimo Sgaravatto wrote:
>>> On 10/12/2012 11:05 AM, Andras Hazi wrote:
>>>>
>>>>
>>>>>> On 2012-10-12 07:49 Massimo Sgaravatto wrote:
>>>>>>
>>>>
>>>>> Which one(s) ?
>>>>> Please specify the name of the attribute(s) in glite-ce-glue2.conf
>>>>> and in the *glue2* ldif file(s) that are different
>>>>
>>>> in glite-ce-glue2.conf:
>>>> ExecutionEnvironment_grid106.kfki.hu_PhysicalCPUs = 93
>>>> ExecutionEnvironment_grid106.kfki.hu_LogicalCPUs = 4
>>>>
>>>> in static-file-ce.ldif:
>>>> GlueCEInfoTotalCPUs: 0
>>>
>>>
>>> These are different things. The relevant attributes for glue1 are:
>>>
>>> GlueSubClusterLogicalCPUs
>>> GlueSubClusterPhysicalCPUs
>>>
>>>
>>>
>>>>
>>>> But I've found the right *glue2* related data in ExecutionEnvironment.ldif:
>>>>
>>>> GLUE2ExecutionEnvironmentPhysicalCPUs: 93
>>>> GLUE2ExecutionEnvironmentLogicalCPUs: 4
>>>>
>>>> And the output of the lcg-infosites(...) command shows 0 CPUs, which
>>>> should be the TotalCPUs count.
>>>>
>>>>
>>>>
>>>>> You can try to run the plugins in /var/lib/bdii/gip/plugin by hand:
>>>>>
>>>>> /sbin/runuser -s /bin/sh ldap -c "/var/lib/bdii/gip/plugin/glite-info-dynamic-ce"
>>>>
>>>> The output is (part of it):
>>>>
>>>> dn: GlueCEUniqueID=grid107.kfki.hu:8443/cream-sge-alice,mds-vo-name=resource,o=grid
>>>> GlueCEInfoLRMSVersion: unknown
>>>> GlueCEPolicyAssignedJobSlots: 0
>>>> GlueCEPolicyMaxRunningJobs: 0
>>>> GlueCEInfoTotalCPUs: 0
>>>> GlueCEStateFreeJobSlots: 0
>>>> GlueCEStateFreeCPUs: 0
>>>> GlueCEPolicyMaxCPUTime: 9072
>>>> GlueCEPolicyMaxWallClockTime: 10080
>>>> GlueCEStateStatus: Production
>>>>
>>>> dn: GlueCEUniqueID=grid107.kfki.hu:8443/cream-sge-cms,mds-vo-name=resource,o=grid
>>>> GlueCEInfoLRMSVersion: unknown
>>>> GlueCEPolicyAssignedJobSlots: 380
>>>> GlueCEPolicyMaxRunningJobs: 380
>>>> GlueCEInfoTotalCPUs: 380
>>>> GlueCEStateFreeJobSlots: 0
>>>> GlueCEStateFreeCPUs: 0
>>>> GlueCEPolicyMaxCPUTime: 4320
>>>> GlueCEPolicyMaxWallClockTime: 4320
>>>> GlueCEStateStatus: Production
>>>>
>>>> And it seems to be the right dynamic data: GlueCEInfoTotalCPUs: 380
>>>>
>>>>
>>>>
>>>> Now the question is, why can't I see it with the lcg-infosites(..) command?
>>>>
>>>>
>>>
>>>
>>> Did you check the bdii logs ?
>>> You might need to increase the verbosity to DEBUG
>>>
>>>
>>>>
>>>> Regards
>>>> Andras
--
Dimitri Nilsen, Dipl.-Ing(FH)
Karlsruhe Institute of Technology (KIT)
Steinbuch Centre for Computing
Kaiserstr. 12
76131 Karlsruhe, Germany
Tel.: +49 721 608 28607