Hi Jeremy,
It's the same here as well - jobs get wrongly assigned. Although we have
8 GB per system (4-core machines), we publish only 1 GB of MainMemory per
core:
[root@serv03 root]# ldapsearch -x -H ldap://serv03.hep.phy.cam.ac.uk:2170 -b mds-vo-name=UKI-SOUTHGRID-CAM-HEP,o=grid | grep Memory
objectClass: GlueHostMainMemory
GlueHostMainMemoryRAMSize: 1024
GlueHostMainMemoryVirtualSize: 2048
but jobs are coming here with the requirement:
**********************************************************************************
The Requirements expression for your job is:
( target.Arch == "INTEL" ) && ( target.OpSys == "LINUX" ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize ) &&
( target.HasFileTransfer )
Condition Machines Matched Suggestion
--------- ---------------- ----------
1 ( ( 1024 * target.Memory ) >= 2240000 ) 0 REMOVE
2 ( target.Arch == "INTEL" ) 142
3 ( target.OpSys == "LINUX" ) 142
4 ( target.Disk >= 10000 ) 142
5 ( target.HasFileTransfer ) 142
WARNING: Be advised:
No resources matched request's constraints
**********************************************************************************
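The failing clause is plain arithmetic: the advertised MainMemory (in MB) is multiplied by 1024 and compared against the job's ImageSize (in KB). A minimal sketch of that comparison using the figures above (this is only an illustration, not the Resource Broker's actual code):

```python
# Illustrative check of the failing Requirements clause:
#   ( target.Memory * 1024 ) >= ImageSize
memory_mb = 1024          # GlueHostMainMemoryRAMSize we publish per core
image_size_kb = 2240000   # ImageSize requested by the incoming job

matches = (memory_mb * 1024) >= image_size_kb
print(memory_mb * 1024, image_size_kb, matches)  # 1048576 2240000 False
```

Since 1048576 KB (1 GB) is well below the requested 2240000 KB (about 2.1 GB), no machine here can ever match that clause.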
So such jobs do nothing except occupy a core.
Cheers,
Santanu
Coles, J (Jeremy) wrote:
> Hi Duncan
>
> A fair point that I was also thinking about. This is the comment I have
> received indirectly from Heinz:
>
> "> That 's mainly a site configuration problem that I have already
> sorted
>
>> out with a site in Croatia. The main problem is that a 4-core machine
>> publishes 4GB RAM, however, every core then only has about 1 GB. The
>> Resource Broker assumes that the RAM amount published is per _core_
>> (which is reasonable). Therefore, jobs get wrongly assigned. That
>>
> seems
>
>> to be a side effect of new processor technology. I already have a JDL
>> limit of 2 GB RAM which works for most of the sites that follow the
>> "traditional" way of publishing CPU power."
>>
>
> Can anyone define the "traditional way"?
>
> Jeremy
>
>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:TB-
>> [log in to unmask]] On Behalf Of Duncan Rand
>> Sent: 01 November 2007 08:58
>> To: [log in to unmask]
>> Subject: jobs using up too much memory
>>
>> Isn't there still some confusion surrounding the term
>> GlueHostMainMemoryRAMSize - is it the RAM per node or the RAM per
>> core? I suspect that jobs often request RAM per job and sites
>> advertise RAM per node, as is recommended in the yaim documentation:
>>
>> CE_MINPHYSMEM RAM size (Mbytes) (per WN and not per CPU) (WN
>> specification).
>>
>> http://www.ogf.org/pipermail/glue-wg/2007-August/000151.html
>>
>> The real problem is that the job's requirements are not passed to the
>> scheduler. If they were, it would be able to operate as intended and
>> manage node memory properly.
>>
>> Duncan
>>