Anecdotally, we have also seen a slight increase in jobs being killed for
high memory use, but we have more significant causes of ATLAS job
death than this, so the signal is quite weak.
Sam
On 27 October 2013 14:37, Matt Doidge <[log in to unmask]> wrote:
> Hello,
> We've noticed an increase in the number of jobs being killed for using up
> their memory quota at Lancaster (on our SL6, SGE cluster, 3G hard memory
> limit, xrootd set as transfer protocol of choice). I don't have solid stats
> though; investigating this is on the to-do list, so it might not be the same
> issue.
> Cheers,
> Matt
>
>
> On 10/26/2013 08:10 AM, Wahid Bhimji wrote:
>>
>> You may remember that I mentioned that there was a memory leak with xrootd
>> direct access. It only affects SL6, and only when the user opens a lot of
>> files.
>> But I (or actually Doug - see below) am wondering if any of the sites
>> using xrootd direct access on SL6 have noticed anything.
>>
>> Liv, Glasgow and Oxford (and ECDF) are the only ones who would have
>> seen it.
>> There is a workaround with a simple env variable - but for some reason
>> they want to gather info before applying it.
>>
>> Cheers
>>
>> Wahid
>>
>>
>> Begin forwarded message:
>>
>>> *From: *Doug Benjamin <[log in to unmask]
>>> <mailto:[log in to unmask]>>
>>> *Subject: **Question about possible increased memory consumption - due
>>> to glibc memory allocator change. in SL 6**
>>> *Date: *25 October 2013 21:15:16 BST
>>> *To: *<[log in to unmask]
>>> <mailto:[log in to unmask]>>, <[log in to unmask]
>>> <mailto:[log in to unmask]>>
>>> *Cc: *"atlas-adc-expert (Mailing list for the Atlas Distribution
>>> Computing Expert ...)" <[log in to unmask]
>>> <mailto:[log in to unmask]>>
>>>
>>>
>>> Dear US and UK cloud squads,
>>>
>>> Since the migration to SL6, have the DPM direct-read sites, or any
>>> sites using an xrootd dataserver (or xrootd door), the xrootd client
>>> from ROOT, or the standalone xrootd client (xrdcp), seen a sharp
>>> increase in virtual memory usage? I have been asked by Ale DiG to
>>> quantify the effect of this issue. User prun jobs using ROOT are
>>> susceptible to it. Because Athena uses a different memory allocator,
>>> Athena jobs do not exhibit the problem.
>>>
>>> For some further details of the problem, see this link -
>>>
>>>
>>> https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
>>>
>>> Please contact me if this message is not clear enough and please let
>>> me know if you
>>> had problems so we can quantify the problem before suggesting the
>>> solution listed in
>>> the above link.
>>>
>>> Thank you in advance for your attention to this matter.
>>>
>>> Cheers,
>>>
>>> Doug Benjamin
>>>
>>>
>>
>>
>>
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>
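
The thread does not name the "simple env variable" workaround. The title of the linked IBM article points at the glibc malloc arena behaviour on RHEL 6, and the setting usually associated with that is MALLOC_ARENA_MAX; treating that as the variable meant here is an assumption, not something stated above. As a rough sketch for sites wanting to gather numbers first, the Python below parses VmSize/VmRSS/VmPeak out of /proc/<pid>/status and builds an environment with the arena cap set. The function names and the default value of 4 are illustrative choices, not taken from the thread.

```python
import os
import subprocess


def parse_vm_fields(status_text):
    """Extract VmSize/VmRSS/VmPeak (in kB) from the text of /proc/<pid>/status."""
    wanted = {"VmSize", "VmRSS", "VmPeak"}
    fields = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            parts = rest.split()
            # Lines look like "VmSize:   123456 kB".
            if parts and parts[0].isdigit():
                fields[key] = int(parts[0])
    return fields


def read_vm_fields(pid="self"):
    """Read the memory fields for a live process (Linux only)."""
    with open("/proc/%s/status" % pid) as handle:
        return parse_vm_fields(handle.read())


def env_with_arena_cap(base_env, max_arenas=4):
    """Copy base_env and set MALLOC_ARENA_MAX (the assumed workaround variable)."""
    env = dict(base_env)
    env["MALLOC_ARENA_MAX"] = str(max_arenas)
    return env


def run_capped(command, max_arenas=4):
    """Run a command (list of arguments) with the arena cap in its environment."""
    return subprocess.call(command, env=env_with_arena_cap(os.environ, max_arenas))
```

Comparing VmSize against VmRSS for the same job with and without the cap gives a crude measure of how much of the growth is virtual address space rather than resident memory, which matters for sites that enforce limits on virtual memory.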