Kostas Georgiou wrote:
> I am not running torque but pvmem can be an issue in some instances
> * a serial job starts two processes so they end up using more than pvmem
> and you can run out of memory :(
If someone does that, it's their own fault, I think. People running
multi-threaded/multi-process jobs should be requesting the number of slots
they need. Of course, this requires many other things to be sorted out in
the middleware, such as the ability to specify and allocate a specific
number of cores on a specific number of nodes.
> * virtual memory != amount of memory that the process is using (for
> example you mmap a 3GB file) so you might end up killing innocent
> jobs. Unfortunately there is nothing else that we can use to limit
> memory so limiting virtual memory is the only option that we have.
Indeed.
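
To make the mmap case concrete, here is a minimal sketch, assuming a Linux
node with /proc mounted. It maps a sparse 256 MiB temporary file (standing in
for the 3GB file) and compares how much the virtual size (VmSize) and the
resident size (VmRSS) grow. The helper names read_status_kb and
measure_mmap_growth are made up for this illustration, and this is not how
any batch system measures usage.

```python
import mmap
import tempfile

# Assumption: 256 MiB stands in for the 3GB file mentioned above.
MAP_SIZE = 256 * 1024 * 1024


def read_status_kb(field):
    """Return a field such as VmSize or VmRSS from /proc/self/status, in kB."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


def measure_mmap_growth(size=MAP_SIZE):
    """Map a sparse file read-only; return (VmSize growth, VmRSS growth) in kB."""
    with tempfile.TemporaryFile() as f:
        f.truncate(size)  # extend the file without writing any data
        vm_before = read_status_kb("VmSize")
        rss_before = read_status_kb("VmRSS")
        mapping = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)
        try:
            # No page of the mapping has been touched at this point.
            vm_after = read_status_kb("VmSize")
            rss_after = read_status_kb("VmRSS")
        finally:
            mapping.close()
    return vm_after - vm_before, rss_after - rss_before


if __name__ == "__main__":
    vm_growth, rss_growth = measure_mmap_growth()
    print("VmSize grew by %d kB, VmRSS grew by %d kB" % (vm_growth, rss_growth))
```

The virtual size should grow by roughly the size of the mapping while the
resident size barely moves, which is why a limit on virtual memory can kill a
job that is using very little physical memory.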
> It would definitely help once the CE starts passing down job
> requirements but does anyone expect the users to bother specifying
> the right amount? What will happen with the pilot jobs that obviously
> can not have the right requirements since you can't know at submission
> time what the real job will need?
At least if we move to queues based on memory requirements (and set the
default to have a low limit) then over time users will realise they need
to _ask_ for the amount of memory they need.
Stephen
--
Dr. Stephen Childs,
Research Fellow, EGEE Project, phone: +353-1-8961797
Computer Architecture Group, email: Stephen.Childs @ cs.tcd.ie
Trinity College Dublin, Ireland web: http://www.cs.tcd.ie/Stephen.Childs