Hi all,
I'm at CERN for a few days this week so won't make it tomorrow. Here are
some rough notes from the meeting today:
Middleware releases
-------------------
No DPM 1.6.7-[12] on SL4/64 in production. This is only available in
certification at the moment.
SL4 CE recently certified.
WN gLite 3.1
- SL4/32 in production
- some issues with the DM clients
GFAL/lcg_util
- 1.10.7-2 (-1 in production)
- 1.6.6-1
But lcg_util 1.6.7 is required to fix the bug with specifying the
number of streams; this was causing LHCb jobs to crash.
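As a rough illustration of the bug above: the fix is about the stream-count option being passed through correctly on copies. The helper name, the `-n` flag, and the SURLs below are assumptions for illustration only, not taken from the notes.

```python
# Illustrative sketch only: building an lcg-cp invocation with an
# explicit number of parallel transfer streams (the option affected
# by the bug). The "-n" flag and the example SURLs are assumptions.

def build_lcg_cp(src, dst, nb_streams):
    """Return an lcg-cp command line with an explicit stream count."""
    return ["lcg-cp", "-n", str(nb_streams), src, dst]

cmd = build_lcg_cp(
    "srm://se.example.org/dpm/example.org/home/lhcb/somefile",
    "file:///tmp/somefile",
    4,
)
```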
Sites need to upgrade to the gLite 3.1 WN on SL4.
CASTOR
------
Next SRM release by the end of March; 1.3.1-12 is required for srmCopy.
purgeFromSpace to be implemented by the end of May.
DPM
---
1.6.10 at the end of February.
IPv6 support in 1.6.9
1.7.0
- srmCopy (!)
- common rfio
- space tokens in direct rfio and gridftp (no SRM)
- beginning of April
Handling of space tokens
- prepare to put
- ignored in prepare to get/bring online for files already online
- disk pools dedicated to list of groups
- spaces can be dedicated to a user DN or a single group
- files can be stored without specifying a space token.
ACLs on spaces?
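The space-token rules above can be sketched as a toy model. The class and function names are illustrative assumptions, not the DPM implementation.

```python
# Toy model of the space-token handling described in the notes:
# tokens are honoured on prepareToPut, ignored on prepareToGet for
# files already online, and spaces can be dedicated to a DN or group.
# All names and structures here are illustrative assumptions.

ONLINE = "online"

class Space:
    def __init__(self, token, allowed_groups=None, allowed_dn=None):
        self.token = token
        self.allowed_groups = set(allowed_groups or [])
        self.allowed_dn = allowed_dn

    def permits(self, dn, groups):
        # a space can be dedicated to a user DN or to a list of groups
        if self.allowed_dn is not None:
            return dn == self.allowed_dn
        if self.allowed_groups:
            return bool(self.allowed_groups & set(groups))
        return True

def prepare_to_put(space, dn, groups):
    """Token is honoured on prepareToPut; no token is also allowed."""
    return space is None or space.permits(dn, groups)

def prepare_to_get(file_state, token):
    """Token is ignored on prepareToGet for files already online."""
    if file_state == ONLINE:
        return "token ignored, file served from disk"
    return "recall requested"
```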
GIP will go into the 1.6.10 release.
StoRM
-----
Are implementing T1D1 - the tape copy (T1) is therefore a backup of
what is on disk. People are wondering whether they will be able to
implement T1D0.
Strongly coupled to GPFS+TSM.
Going to be used by LHCb - who will actually do this during Feb CCRC.
Files will be restored to disk and pinned. The open question about the
restore is that the file will no longer be part of the space it was
originally put in via.
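The restore-and-pin behaviour noted above can be sketched as follows; the catalogue structure and function are assumptions for illustration, not StoRM internals.

```python
import time

# Illustrative sketch of restore-and-pin: the recalled copy is pinned
# for a while, but (per the open question in the notes) is no longer
# accounted to the space it was originally put into.

def restore_from_tape(catalogue, surl, pin_seconds, now=None):
    """Recall a file to disk and pin it; the restored copy leaves
    its original space (space_token becomes None)."""
    now = time.time() if now is None else now
    entry = catalogue[surl]
    entry["on_disk"] = True
    entry["pin_expires"] = now + pin_seconds
    entry["space_token"] = None  # no longer part of the original space
    return entry
```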
ATLAS
-----
Part of CCRC is data deletion.
If we have to specify which data goes to which tapes then we have failed
with SRM.
ATLASDATADISK and ATLASDATATAPE are important for now. No MC at the
moment. ATLAS testing these. Working at some sites and not at others.
They have been defined everywhere.
LHCb
----
lcg_util 1.6.7 must be on WN for LHCb.
- also includes the lcg-getturls command, which does bulk
lookups of the BDII for information.
Grave concerns over the limitations of the SRM implementation in CASTOR
and dCache.
- WAN/LAN pools
CCRC using DIRAC3
- processingDB
- gLite WMS
- T1-T1 transfers: job finalisation is late
- S/W installation problems. Fallback available.
- is this something the sites can help with?
- conditionsDB will not be used.
- Looks like there will be no MC production => no T2 involvement
CMS
---
Functional blocks. Want to do some pre-staging, then potentially some
reprocessing. Transfers will be as independent as possible. Let the CMS
T1s do some T1->T2 and T1->T1 transfers using PhEDEx.
How much space is required at the Tier-2s for this?
CPU use at the T2 - fast sim production, dependent on the software. Not
asking anything special about the fairshare.
CMS_DEFAULT - all of CMS should be able to access this.
- still somewhat confusing that it is described as 'optional'
- seems that the information about space tokens will be
passed with the PhEDEx request, but can be ignored if sites do
not want to use the space tokens.
60 (or 30) day lifetime for the data.
- experiments expected to clean up the disk data
- need bulk methods in SRM2
- sites deal with the tapes
- but experiments want to be able to do it the same way.
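The lifetime-based cleanup above, with the need for bulk methods, can be sketched roughly as below. The helper names and batch size are illustrative assumptions, not an SRM client.

```python
# Sketch of lifetime-based cleanup with batched removals, as the notes
# suggest (60- or 30-day lifetime; bulk methods needed in SRM2).
# Names, structures, and the batching helper are assumptions.

LIFETIME_DAYS = 60

def expired(files, now_day):
    """Select files older than the lifetime (ages given in days)."""
    return [f for f, created in files.items()
            if now_day - created > LIFETIME_DAYS]

def batches(surls, size):
    """Group SURLs so each bulk-removal request carries many files."""
    return [surls[i:i + size] for i in range(0, len(surls), size)]
```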
Management and Operations
-------------------------
Monday meetings at 4pm. Every other day at 3pm.
ATLAS shifters will use GGUS to track the problems with transfers. This
seems to take a long time.
ServiceMaps look interesting for augmenting the information that is in
the dashboards. Different sizes indicate importance.
Colour coding can be time dependent.
This could eventually move into the experiment frameworks.
ServiceMap stuff could pull information from nagios at sites.
On 05/02/08 15:24, Jensen, J (Jens) wrote:
> Now uploaded.
> http://indico.cern.ch/conferenceDisplay.py?confId=28666
> For benefit of new or recent members, it's tomorrow 10:00-10:30 in EVO,
> with the usual cunning secret password ("dteam").
> Cheers
> --jens