Steve Traylen wrote:

>On Fri, Jan 28, 2005 at 05:06:48PM -0000 or thereabouts, pierre girard wrote:
>
>>Hi Min,
>>
>>Unfortunately, this CPU count strategy does not work with the CCIN2P3-LCG2
>>site.
>
>I don't think it really matters to the system that the CPU count is wrong.
>It might matter for publicity, of course, but so long as the
>ETT time goes up when you start queuing jobs, things will
>be reasonable.

I completely agree with that. So, unfortunately for Min, there is currently no easy way to estimate the real CPU count of all the sites.

Pierre

> Steve
>
>>Indeed, the real number of physical CPUs supplied to the grid at CC is
>>100 CPUs at the moment, and will be more than 700 CPUs very soon (hopefully
>>next week).
>>
>>With our batch system, we generally define 2 queues per physical CPU,
>>but each "queue" (called a WorkPoint) can be configured to accept
>>several classes of jobs (something like short (A), medium (G) and long
>>(T)).
>>
>>At the moment, the Glue schema does not allow us to express the subtlety
>>of our batch system queue mechanism. As a consequence, it is impossible
>>for you to infer the total number of real CPUs from the published data. What we
>>currently publish are virtual CPUs instead of real CPUs.
>>
>>Adding CPU counts at the Subcluster level, as
>>proposed by Stephen, should solve our problem.
>>
>>Anyway, in our specific and current case, the solution is to take the
>>max of the queues.
>>
>>But a general solution could be to systematically sum the CPUs of every queue at
>>each site. Indeed, this value has the same meaning for all the sites.
>>In my view, it reflects the number of jobs a site claims to be
>>able to run simultaneously, what I call the virtual CPUs. It is the
>>role of the site administrator to set this value correctly, in
>>accordance with the real capacity of his/her site.
>>
>>Once we are able to get the real CPU count, it will be
>>very interesting to compute the ratio between virtual CPUs and real ones.
>>
>>Hope this helps ;).
>>
>>Pierre
>>
>>Min Tsai wrote:
>>
>>>Hi All,
>>>
>>>The fix is in for the CPU count. Three other sites had their CPU stats
>>>change: CCIN2P3-LCG2, INFN-LNL-LCG, INFN-PADOVA. Let me know if these
>>>numbers are inaccurate for some reason.
>>>
>>>Best Regards,
>>>Min
>>>
>>>-----Original Message-----
>>>From: LHC Computer Grid - Rollout On Behalf Of Min Tsai
>>>Sent: Wednesday, January 26, 2005 12:41 PM
>>>Subject: Re: [LCG-ROLLOUT] TotalCPU count on the GOC Mon
>>>
>>>Dear Anar,
>>>
>>>Typically, the CPU stats for the queues on a single CE all refer to the same
>>>set of CPUs. So, to avoid recounting CPUs, Gstat adds up the CPU stats for
>>>the first queue it encounters for each unique CE. So in your case:
>>>
>>>It adds up:
>>>GlueCEUniqueID=lcg03.gsi.de:2119\/jobmanager-torque-alice 2 CPU
>>>GlueCEUniqueID=lcg06.gsi.de:2119\/jobmanager-lcglsf-alice 16 CPU
>>>
>>>I have not noticed a configuration like yours before, so I will make a
>>>modification to also add the CPUs from queues that have different total CPU
>>>statistics even though they reside on the same CE. The only problem we will
>>>have is when 2 queues on a single CE have the same total CPU count even
>>>though they refer to 2 completely different clusters.
>>>
>>>I hope this will correct the CPU problem for your site. Thank you for
>>>providing this feedback! I will let you know once I have tested and
>>>completed this change.
>>>
>>>Cheers,
>>>Min
>>>
>>>-----Original Message-----
>>>From: LHC Computer Grid - Rollout On Behalf Of Anar Manafov
>>>Sent: Wednesday, January 26, 2005 11:59 AM
>>>Subject: [LCG-ROLLOUT] TotalCPU count on the GOC Mon
>>>
>>>Good day to ALL!
>>>
>>>I have noticed that on the monitoring page
>>>(http://goc.grid.sinica.edu.tw/gstat/lcg03.gsi.de/) we (GSI) are publishing
>>>only 18 CPUs (Total CPU). So I wonder how this number is calculated and why
>>>not all of the queues are counted.
>>>We have 2 different CEs:
>>>Torque CE (with 2 CPUs).
>>>LSF CE (more than 300 CPUs),
>>>in LSF we have “dteam” and “alice” queues.
>>>For “alice” ~ 16 CPU
>>>For “dteam” ~ 344 CPU or something (later on, when we finish
>>>testing our new pool-accounts algorithm, we will publish more CPUs on
>>>“alice”).
>>>
>>>So my question is: which algorithm does the monitoring use to calculate the
>>>Total CPU amount?
>>>
>>>I would appreciate any comment on this.
>>>
>>>Thank you very much in advance.
>>>
>>>Best of luck,
>>>
>>>Anar
>>
>>--
>>______________________
>>Pierre GIRARD
>>Grid Computing Team Member
>>IN2P3/CNRS Computing Centre - Lyon (FRANCE)
>>http://cc.in2p3.fr
>>Tel. +33 4.78.93.08.80 | Fax. +33 4.72.69.41.70
>
>--
>Steve Traylen
>http://www.gridpp.ac.uk/

--
______________________
Pierre GIRARD
Grid Computing Team Member
IN2P3/CNRS Computing Centre - Lyon (FRANCE)
http://cc.in2p3.fr
Tel. +33 4.78.93.08.80 | Fax. +33 4.72.69.41.70
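
The counting strategies discussed in this thread can be compared side by side with a short Python sketch. This is not Gstat's actual code: the function names are hypothetical, the queue list reuses the GSI figures quoted above (taking "~344" as exactly 344), and the dteam queue identifier is assumed by analogy with the alice one.

```python
from collections import defaultdict

# (CE host, queue, published total CPUs). Figures come from the thread;
# "jobmanager-lcglsf-dteam" and the exact value 344 are assumptions.
QUEUES = [
    ("lcg03.gsi.de", "jobmanager-torque-alice", 2),
    ("lcg06.gsi.de", "jobmanager-lcglsf-alice", 16),
    ("lcg06.gsi.de", "jobmanager-lcglsf-dteam", 344),
]


def group_by_ce(queues):
    """Collect the published CPU counts of every queue under its CE."""
    per_ce = defaultdict(list)
    for ce, _queue, cpus in queues:
        per_ce[ce].append(cpus)
    return per_ce


def first_queue_per_ce(queues):
    """Original strategy: count only the first queue seen on each CE."""
    return sum(counts[0] for counts in group_by_ce(queues).values())


def distinct_totals_per_ce(queues):
    """Modified strategy: on each CE, add every distinct CPU total once.

    Two queues with the same total are assumed to share the same CPUs,
    which undercounts when they really sit on different clusters.
    """
    return sum(sum(set(counts)) for counts in group_by_ce(queues).values())


def max_queue_per_ce(queues):
    """Pierre's suggestion for CCIN2P3: take the largest queue on each CE."""
    return sum(max(counts) for counts in group_by_ce(queues).values())


def sum_all_queues(queues):
    """Pierre's "virtual CPUs": sum every queue at the site."""
    return sum(cpus for _ce, _queue, cpus in queues)


if __name__ == "__main__":
    for strategy in (first_queue_per_ce, distinct_totals_per_ce,
                     max_queue_per_ce, sum_all_queues):
        print(strategy.__name__, strategy(QUEUES))
```

With these inputs the first strategy gives the 18 CPUs that Anar saw on the monitoring page, while the others give 362, 346 and 362 respectively. None of them recovers the real number of physical CPUs, which is the limitation Pierre describes.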