=========================================================================
Date:         Mon, 6 May 2013 10:54:50 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Rodney Walker <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b33d9047d077c04dc08dadb
Message-ID:  <[log in to unmask]>


Hi,
I know the LRZ fix, but it is not nice. I will add something useful to the
ticket.

Cheers,
Rod.



On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]> wrote:

>  Hi everybody,
>
> A GGUS ticket (https://ggus.eu/ws/ticket_info.php?ticket=93242) was opened
> against our site (SARA-MATRIX) regarding a known issue with ATLAS analysis
> jobs landing on SL6 WNs, described in the wiki (
> https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
> When upgrading our WNs to *CentOS6*, we read the whole wiki and performed
> the necessary actions to avoid issues, but there is code that we cannot
> change (meaning that it is not a site responsibility):
>
> 	# for building slc5 binaries on slc6 host
> 	macro CppSpecificFlags "" \
> 	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>
>
> So it looks like the macro doesn't get defined for *CentOS6* sites, but
> only for slc6 sites (at least it doesn't at our site). Now, the user who
> opened the ticket sent us a link to another GGUS ticket (
> https://ggus.eu/ws/ticket_info.php?ticket=93271) from a site (LRZ-LMU) that
> had the same problem and found a workaround; that site has installed
> SLES11 SP2, which is compatible with SL6. If anybody from LRZ-LMU gets this
> email, could you let us know about that workaround? Is there any other site
> supporting ATLAS on SL6 (or a compatible OS) that also has this issue?
> Please let us know.
>
> --
>
> Met vriendelijke groeten / Best regards,
>
> *Paco Bernabe*
>
> | Systemsprogrammer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31
> 610 961 785 | [log in to unmask] | www.surfsara.nl |
>
>
>
>


-- 
Tel. +49 89 289 14152
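[Editorial note: the CMT snippet quoted above only applies the flag when the `host-slc6` tag is active, and CMT derives that tag from the host OS identification string. The sketch below is a hypothetical illustration (the matcher and tag names are invented, not CMT's actual detection code) of why a CentOS 6 node can miss a tag keyed to the SLC release string:]

```python
import re

def host_tag(release_string):
    """Map an /etc/redhat-release style string to a CMT-like host tag.

    A naive matcher that only recognises 'Scientific Linux CERN' releases;
    it returns None for anything else, so a conditional macro keyed on
    'host-slc6' would never fire on a CentOS host.
    """
    m = re.match(r"Scientific Linux CERN SLC release (\d+)", release_string)
    if m:
        return "host-slc" + m.group(1)
    return None

def cpp_specific_flags(release_string):
    # Mirrors the shape of:
    #   macro CppSpecificFlags "" ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
    if host_tag(release_string) == "host-slc6":
        return "-D__USE_XOPEN2K8"
    return ""

print(cpp_specific_flags("Scientific Linux CERN SLC release 6.4 (Carbon)"))  # -D__USE_XOPEN2K8
print(repr(cpp_specific_flags("CentOS release 6.4 (Final)")))                # '' (tag not recognised)
```

[Any detection along these lines would need an explicit alias for CentOS (and other EL6 rebuilds) before a `host-slc6` conditional fires, which matches the symptom reported in the ticket.]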

=========================================================================
Date:         Tue, 7 May 2013 10:25:10 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Daniela Bauer <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e011767510438ce04dc1d6640
Message-ID:  <[log in to unmask]>


Hi,

Is there any input from Atlas on this?

Cheers,
Daniela


On 6 May 2013 09:54, Rodney Walker <[log in to unmask]> wrote:

> Hi,
> I know the LRZ fix, but it is not nice. I will add something useful to the
> ticket.
>
> Cheers,
> Rod.
>
>
>
> On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]> wrote:
>
>>  Hi everybody,
>>
>> A GGUS ticket (https://ggus.eu/ws/ticket_info.php?ticket=93242) was opened
>> against our site (SARA-MATRIX) regarding a known issue with ATLAS analysis
>> jobs landing on SL6 WNs, described in the wiki (
>> https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>> When upgrading our WNs to *CentOS6*, we read the whole wiki and performed
>> the necessary actions to avoid issues, but there is code that we cannot
>> change (meaning that it is not a site responsibility):
>>
>> 	# for building slc5 binaries on slc6 host
>> 	macro CppSpecificFlags "" \
>> 	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>>
>>
>> So it looks like the macro doesn't get defined for *CentOS6* sites, but
>> only for slc6 sites (at least it doesn't at our site). Now, the user who
>> opened the ticket sent us a link to another GGUS ticket (
>> https://ggus.eu/ws/ticket_info.php?ticket=93271) from a site (LRZ-LMU) that
>> had the same problem and found a workaround; that site has installed
>> SLES11 SP2, which is compatible with SL6. If anybody from LRZ-LMU gets this
>> email, could you let us know about that workaround? Is there any other site
>> supporting ATLAS on SL6 (or a compatible OS) that also has this issue?
>> Please let us know.
>>
>> --
>>
>> Met vriendelijke groeten / Best regards,
>>
>> *Paco Bernabe*
>>
>> | Systemsprogrammer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31
>> 610 961 785 | [log in to unmask] | www.surfsara.nl |
>>
>>
>>
>>
>
>
> --
> Tel. +49 89 289 14152
>



-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

=========================================================================
Date:         Tue, 7 May 2013 10:29:46 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandra Forti <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
Comments: cc: Daniela Bauer <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------040207060200090401090209"
Message-ID:  <[log in to unmask]>


Hi Daniela,

Rod updated the ticket with a question yesterday. You might want to
subscribe to the ticket.

cheers
alessandra

On 07/05/2013 10:25, Daniela Bauer wrote:
> Hi,
>
> Is there any input from Atlas on this?
>
> Cheers,
> Daniela
>
>
> On 6 May 2013 09:54, Rodney Walker 
> <[log in to unmask] 
> <mailto:[log in to unmask]>> wrote:
>
>     Hi,
>     I know the LRZ fix, but it is not nice. I will add something
>     useful to the ticket.
>
>     Cheers,
>     Rod.
>
>
>
>     On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]
>     <mailto:[log in to unmask]>> wrote:
>
>         Hi everybody,
>
>         A GGUS ticket
>         (https://ggus.eu/ws/ticket_info.php?ticket=93242) was opened
>         against our site (SARA-MATRIX) regarding a known issue with
>         ATLAS analysis jobs landing on SL6 WNs, described in the wiki
>         (https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>         When upgrading our WNs to *CentOS6*, we read the whole wiki
>         and performed the necessary actions to avoid issues, but there
>         is code that we cannot change (meaning that it is not a site
>         responsibility):
>
>         	# for building slc5 binaries on slc6 host
>         	macro CppSpecificFlags "" \
>         	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>
>
>         So it looks like the macro doesn't get defined for *CentOS6*
>         sites, but only for slc6 sites (at least it doesn't at our
>         site). Now, the user who opened the ticket sent us a link to
>         another GGUS ticket
>         (https://ggus.eu/ws/ticket_info.php?ticket=93271) from a site
>         (LRZ-LMU) that had the same problem and found a workaround;
>         that site has installed SLES11 SP2, which is compatible with
>         SL6. If anybody from LRZ-LMU gets this email, could you let us
>         know about that workaround? Is there any other site supporting
>         ATLAS on SL6 (or a compatible OS) that also has this issue?
>         Please let us know.
>
>         -- 
>
>         Met vriendelijke groeten / Best regards,
>
>         *Paco Bernabe*
>
>         | Systemsprogrammer | SURFsara | Science Park 140 | 1098XG
>         Amsterdam | T +31 610 961 785 |
>         [log in to unmask] <mailto:[log in to unmask]> | www.surfsara.nl
>         <http://www.surfsara.nl> |
>
>
>
>
>
>
>     -- 
>     Tel. +49 89 289 14152
>
>
>
>
> -- 
> Sent from the pit of despair
>
> -----------------------------------------------------------
> [log in to unmask] <mailto:[log in to unmask]>
> HEP Group/Physics Dep
> Imperial College
> London, SW7 2BW
> Tel: +44-(0)20-75947810
> http://www.hep.ph.ic.ac.uk/~dbauer/ 
> <http://www.hep.ph.ic.ac.uk/%7Edbauer/>


-- 
Facts aren't facts if they come from the wrong people. (Paul Krugman)


=========================================================================
Date:         Tue, 7 May 2013 10:36:31 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Daniela Bauer <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
Comments: To: Alessandra Forti <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e0160c53aa86d0904dc1d8eb9
Message-ID:  <[log in to unmask]>


Hi Alessandra,

I was hoping more for an 'Atlas is aware of the issue, a fix is
forthcoming', since it seems to suggest an Atlas-specific problem.

The ticket is almost incomprehensible (can someone fix those HTML junk
characters?)

Cheers,
Daniela





On 7 May 2013 10:29, Alessandra Forti <[log in to unmask]> wrote:

>  Hi Daniela,
>
> Rod updated the ticket with a question yesterday. You might want to
> subscribe to the ticket.
>
> cheers
> alessandra
>
>
> On 07/05/2013 10:25, Daniela Bauer wrote:
>
>   Hi,
>
>  Is there any input from Atlas on this?
>
>  Cheers,
>  Daniela
>
>
> On 6 May 2013 09:54, Rodney Walker <[log in to unmask]> wrote:
>
>>  Hi,
>>  I know the LRZ fix, but it is not nice. I will add something useful to
>> the ticket.
>>
>> Cheers,
>>  Rod.
>>
>>
>>
>> On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]> wrote:
>>
>>>  Hi everybody,
>>>
>>> A GGUS ticket (https://ggus.eu/ws/ticket_info.php?ticket=93242) was opened
>>> against our site (SARA-MATRIX) regarding a known issue with ATLAS analysis
>>> jobs landing on SL6 WNs, described in the wiki (
>>> https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>>> When upgrading our WNs to *CentOS6*, we read the whole wiki and performed
>>> the necessary actions to avoid issues, but there is code that we cannot
>>> change (meaning that it is not a site responsibility):
>>>
>>> 	# for building slc5 binaries on slc6 host
>>> 	macro CppSpecificFlags "" \
>>> 	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>>>
>>>
>>> So it looks like the macro doesn't get defined for *CentOS6* sites, but
>>> only for slc6 sites (at least it doesn't at our site). The user that opened
>>> the ticket sent us a link to another GGUS ticket (
>>> https://ggus.eu/ws/ticket_info.php?ticket=93271) from a site (LRZ-LMU)
>>> that had the same problem and found a workaround; that site runs
>>> SLES11 SP2, which is compatible with SL6. If anybody from LRZ-LMU reads
>>> this email, could you let us know about that workaround? Is there any other
>>> site supporting ATLAS on SL6 (or compatible) that also has this
>>> issue? Please let us know.
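What the quoted cmt fragment amounts to can be illustrated with a plain shell sketch. The tag names (`host-slc6`, `target-slc5`) and the define come from the fragment above; everything else here is a hypothetical illustration, not cmt's real macro expansion:

```shell
#!/bin/sh
# Hypothetical sketch: the macro appends -D__USE_XOPEN2K8 to the C++
# flags only when the build host is tagged slc6 and the target is slc5.
# On a host whose tag was never derived (e.g. an unrecognized CentOS6
# box), the condition never fires and the define is silently missing.
host_tag="slc6"      # assumed host platform tag
target_tag="slc5"    # assumed build target tag

CppSpecificFlags=""
if [ "$host_tag" = "slc6" ] && [ "$target_tag" = "slc5" ]; then
    CppSpecificFlags=" -D__USE_XOPEN2K8 "
fi

echo "g++${CppSpecificFlags}-c example.cxx"
```

The point of the sketch is that the flag is conditional on the host tag: if the platform probe does not classify the host as slc6, the extra define is never added, which matches the symptom the ticket describes.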
>>>
>>> --
>>>
>>> Met vriendelijke groeten / Best regards,
>>>
>>> *Paco Bernabe*
>>>
>>> | Systemsprogrammer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31
>>> 610 961 785 <%2B31%20610%20961%20785> | [log in to unmask] |
>>> www.surfsara.nl |
>>>
>>>
>>>
>>>
>>
>>
>>  --
>> Tel. +49 89 289 14152 <%2B49%2089%20289%2014152>
>>
>
>
>
> --
> Sent from the pit of despair
>
> -----------------------------------------------------------
> [log in to unmask]
> HEP Group/Physics Dep
> Imperial College
> London, SW7 2BW
> Tel: +44-(0)20-75947810
> http://www.hep.ph.ic.ac.uk/~dbauer/
>
>
>
> --
> Facts aren't facts if they come from the wrong people. (Paul Krugman)
>
>


-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

--089e0160c53aa86d0904dc1d8eb9--
=========================================================================
Date:         Tue, 7 May 2013 10:48:13 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandra Forti <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
Comments: To: Daniela Bauer <[log in to unmask]>
Comments: cc: Emil Obreshkov <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------050103050102090904090801"
Message-ID:  <[log in to unmask]>

--------------050103050102090904090801
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

Well, Atlas is aware of the issue; I don't know when the fix will be
available.

From the Rod entry it seems cmt reads the architecture from some OS
file and the CentOS architecture is not foreseen. Either the CentOS
sites will have to hack it and replace the architecture, or Atlas has to
add a CentOS arch. The sw librarian is CC'd on the ticket and now also
here.
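The failure mode described above can be sketched as a small shell function; the release strings and the tag mapping are illustrative assumptions, not cmt's actual probe:

```shell
#!/bin/sh
# Hypothetical sketch: map a /etc/redhat-release-style string to a
# cmt-style platform tag. An unrecognized vendor string (e.g. CentOS)
# falls through to "unknown", which is the behaviour being reported.
release_to_tag() {
    case "$1" in
        *"Scientific Linux"*" 6."*) echo slc6 ;;     # SLC6 / SL6
        *"release 5."*)             echo slc5 ;;
        *)                          echo unknown ;;  # CentOS 6 lands here
    esac
}

release_to_tag "Scientific Linux CERN SLC release 6.4 (Carbon)"  # slc6
release_to_tag "CentOS release 6.4 (Final)"                      # unknown
```

A site-level hack of the kind mentioned would amount to adding a CentOS branch to the mapping (or editing the release file on the WNs), whereas the proper fix is for the upstream probe to recognize the CentOS string.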

cheers
alessandra


On 07/05/2013 10:36, Daniela Bauer wrote:
> Hi Alessandra,
>
> I was more hoping for a 'Atlas is aware of the issue, a fix is 
> forthcoming' as it seems to suggest an Atlas specific problem.
>
> The ticket is almost incomprehensible (can someone fix those html junk 
> characters ?)
>
> Cheers,
> Daniela
>
>
>
>
>
> On 7 May 2013 10:29, Alessandra Forti <[log in to unmask] 
> <mailto:[log in to unmask]>> wrote:
>
>     Hi Daniela,
>
>     Rod updated the ticket with a question yesterday. You might want
>     to subscribe the ticket.
>
>     cheers
>     alessandra
>
>
>     On 07/05/2013 10:25, Daniela Bauer wrote:
>>     Hi,
>>
>>     Is there any input from Atlas on this ?
>>
>>     Cheers,
>>     Daniela
>>
>>
>>     On 6 May 2013 09:54, Rodney Walker
>>     <[log in to unmask]
>>     <mailto:[log in to unmask]>> wrote:
>>
>>         Hi,
>>         I know the LRZ fix, but it is not nice. I will add something
>>         useful to the ticket.
>>
>>         Cheers,
>>         Rod.
>>
>>
>>
>>         On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]
>>         <mailto:[log in to unmask]>> wrote:
>>
>>             Hi everybody,
>>
>>             A GGUS ticket
>>             (https://ggus.eu/ws/ticket_info.php?ticket=93242) to our
>>             site (SARA-MATRIX) regarding a known issue about atlas
>>             analysis jobs landing on SL6 WNs described in the wiki
>>             (https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>>             When upgrading our WNs to *CentOS6*, we read the whole
>>             wiki and performed the necessary actions to avoid issues,
>>             but this code we cannot change (Meaning that it is not a
>>             site responsibility):
>>
>>             	# for building slc5 binaries on slc6 host
>>             	macro CppSpecificFlags "" \
>>             	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>>
>>
>>             So it looks that the macro doesn't get defined for
>>             *CentOS6* sites, but only slc6 sites (At least at our
>>             site doesn't). Now, the user that opened the ticket sent
>>             us a link to another GGUS ticket
>>             (https://ggus.eu/ws/ticket_info.php?ticket=93271) of a
>>             site (LRZ-LMU) that had the same problem and a workaround
>>             was found; this site has installed SLES11 SP2 which is
>>             compatible to SL6. If anybody of LRZ-LMU gets this email,
>>             could you let us know about that workaround? Is there any
>>             other site that support ATLAS and with SL6 (Or
>>             compatible) that has also this issue? Please, let us know.
>>
>>             -- 
>>
>>             Met vriendelijke groeten / Best regards,
>>
>>             *Paco Bernabe*
>>
>>             | Systemsprogrammer | SURFsara | Science Park 140 |
>>             1098XG Amsterdam | T +31 610 961 785
>>             <tel:%2B31%20610%20961%20785> | [log in to unmask]
>>             <mailto:[log in to unmask]> | www.surfsara.nl
>>             <http://www.surfsara.nl> |
>>
>>
>>
>>
>>
>>
>>         -- 
>>         Tel. +49 89 289 14152 <tel:%2B49%2089%20289%2014152>
>>
>>
>>
>>
>>     -- 
>>     Sent from the pit of despair
>>
>>     -----------------------------------------------------------
>>     [log in to unmask] <mailto:[log in to unmask]>
>>     HEP Group/Physics Dep
>>     Imperial College
>>     London, SW7 2BW
>>     Tel: +44-(0)20-75947810 <tel:%2B44-%280%2920-75947810>
>>     http://www.hep.ph.ic.ac.uk/~dbauer/
>>     <http://www.hep.ph.ic.ac.uk/%7Edbauer/>
>
>
>     -- 
>     Facts aren't facts if they come from the wrong people. (Paul Krugman)
>
>
>
>
> -- 
> Sent from the pit of despair
>
> -----------------------------------------------------------
> [log in to unmask] <mailto:[log in to unmask]>
> HEP Group/Physics Dep
> Imperial College
> London, SW7 2BW
> Tel: +44-(0)20-75947810
> http://www.hep.ph.ic.ac.uk/~dbauer/ 
> <http://www.hep.ph.ic.ac.uk/%7Edbauer/>


-- 
Facts aren't facts if they come from the wrong people. (Paul Krugman)


--------------050103050102090904090801--
=========================================================================
Date:         Tue, 7 May 2013 10:49:43 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandra Forti <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------040203090800010600000300"
Message-ID:  <[log in to unmask]>

--------------040203090800010600000300
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit

PS: if only slc5 and slc6 are foreseen, SL(-C) sites might have a similar
problem.

cheers
alessandra

On 07/05/2013 10:48, Alessandra Forti wrote:
> Well Atlas is aware of the issue I don't know when the fix will be 
> available.
>
> From the Rod entry it seems cmt reads the architecture from some OS 
> file and the CentOS architecture is not foreseen. Either the CentOS
> sites will have to hack it and replace the architecture or atlas has 
> to add a CentOS arch. The sw librarian is CC'd in the ticket and now 
> also here.
>
> cheers
> alessandra
>
>
> On 07/05/2013 10:36, Daniela Bauer wrote:
>> Hi Alessandra,
>>
>> I was more hoping for a 'Atlas is aware of the issue, a fix is 
>> forthcoming' as it seems to suggest an Atlas specific problem.
>>
>> The ticket is almost incomprehensible (can someone fix those html 
>> junk characters ?)
>>
>> Cheers,
>> Daniela
>>
>>
>>
>>
>>
>> On 7 May 2013 10:29, Alessandra Forti <[log in to unmask] 
>> <mailto:[log in to unmask]>> wrote:
>>
>>     Hi Daniela,
>>
>>     Rod updated the ticket with a question yesterday. You might want
>>     to subscribe the ticket.
>>
>>     cheers
>>     alessandra
>>
>>
>>     On 07/05/2013 10:25, Daniela Bauer wrote:
>>>     Hi,
>>>
>>>     Is there any input from Atlas on this ?
>>>
>>>     Cheers,
>>>     Daniela
>>>
>>>
>>>     On 6 May 2013 09:54, Rodney Walker
>>>     <[log in to unmask]
>>>     <mailto:[log in to unmask]>> wrote:
>>>
>>>         Hi,
>>>         I know the LRZ fix, but it is not nice. I will add something
>>>         useful to the ticket.
>>>
>>>         Cheers,
>>>         Rod.
>>>
>>>
>>>
>>>         On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]
>>>         <mailto:[log in to unmask]>> wrote:
>>>
>>>             Hi everybody,
>>>
>>>             A GGUS ticket
>>>             (https://ggus.eu/ws/ticket_info.php?ticket=93242) to our
>>>             site (SARA-MATRIX) regarding a known issue about atlas
>>>             analysis jobs landing on SL6 WNs described in the wiki
>>>             (https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>>>             When upgrading our WNs to *CentOS6*, we read the whole
>>>             wiki and performed the necessary actions to avoid
>>>             issues, but this code we cannot change (Meaning that it
>>>             is not a site responsibility):
>>>
>>>             	# for building slc5 binaries on slc6 host
>>>             	macro CppSpecificFlags "" \
>>>             	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>>>
>>>
>>>             So it looks like the macro doesn't get defined for
>>>             *CentOS6* sites, only for slc6 sites (at least not at
>>>             our site). Now, the user that opened the ticket sent
>>>             us a link to another GGUS ticket
>>>             (https://ggus.eu/ws/ticket_info.php?ticket=93271) of a
>>>             site (LRZ-LMU) that had the same problem and found a
>>>             workaround; this site has installed SLES11 SP2, which
>>>             is compatible with SL6. If anybody from LRZ-LMU gets
>>>             this email, could you let us know about that
>>>             workaround? Is there any other site that supports
>>>             ATLAS on SL6 (or compatible) and also has this issue?
>>>             Please, let us know.
>>>
>>>             -- 
>>>
>>>             Met vriendelijke groeten / Best regards,
>>>
>>>             *Paco Bernabe*
>>>
>>>             | Systemsprogrammer | SURFsara | Science Park 140 |
>>>             1098XG Amsterdam | T +31 610 961 785 |
>>>             [log in to unmask] | www.surfsara.nl |
>>>
>>>
>>>
>>>
>>>
>>>
>>>         -- 
>>>         Tel. +49 89 289 14152
>>>
>>>
>>>
>>>
>>>     -- 
>>>     Sent from the pit of despair
>>>
>>>     -----------------------------------------------------------
>>>     [log in to unmask]
>>>     HEP Group/Physics Dep
>>>     Imperial College
>>>     London, SW7 2BW
>>>     Tel: +44-(0)20-75947810
>>>     http://www.hep.ph.ic.ac.uk/~dbauer/
>>
>>
>>     -- 
>>     Facts aren't facts if they come from the wrong people. (Paul Krugman)
>>
>>
>>
>>
>> -- 
>> Sent from the pit of despair
>>
>> -----------------------------------------------------------
>> [log in to unmask]
>> HEP Group/Physics Dep
>> Imperial College
>> London, SW7 2BW
>> Tel: +44-(0)20-75947810
>> http://www.hep.ph.ic.ac.uk/~dbauer/
>
>
> -- 
> Facts aren't facts if they come from the wrong people. (Paul Krugman)


-- 
Facts aren't facts if they come from the wrong people. (Paul Krugman)


--------------040203090800010600000300--
=========================================================================
Date:         Tue, 7 May 2013 10:50:30 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Daniela Bauer <[log in to unmask]>
Subject:      Re: Atlas and SL6 WNs issue
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=bcaec50164a19a63a604dc1dc0ce
Message-ID:  <[log in to unmask]>

--bcaec50164a19a63a604dc1dc0ce
Content-Type: text/plain; charset=ISO-8859-1

But anything outside CERN doesn't count, right? So we are all fine, fine,
just fine :-D



On 7 May 2013 10:49, Alessandra Forti <[log in to unmask]> wrote:

>  PS: if only slc5 and slc6 are foreseen, SL(-C) sites might have a similar
> problem.
>
> cheers
> alessandra
>
>
>  On 07/05/2013 10:48, Alessandra Forti wrote:
>
> Well, Atlas is aware of the issue; I don't know when the fix will be
> available.
>
> From the Rod entry it seems cmt reads the architecture from some OS file
> and the CentOS architecture is not foreseen. Either the CentOS sites will
> have to hack it and replace the architecture, or atlas has to add a CentOS
> arch. The sw librarian is CC'd in the ticket and now also here.
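The second option Alessandra describes (adding a CentOS arch) could look roughly like the following CMT requirements fragment, extending the existing pattern; this is a sketch only, and the `host-centos6` tag name is an assumption (the tag CMT actually derives on a CentOS host may be named differently):

```
# sketch: apply the same flag when a CentOS6 host targets slc5
macro CppSpecificFlags "" \
    ATLAS&host-slc6&target-slc5    " -D__USE_XOPEN2K8 " \
    ATLAS&host-centos6&target-slc5 " -D__USE_XOPEN2K8 "
```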
>
> cheers
> alessandra
>
>
> On 07/05/2013 10:36, Daniela Bauer wrote:
>
>   Hi Alessandra,
>
> I was more hoping for an 'Atlas is aware of the issue, a fix is
> forthcoming', as it seems to suggest an Atlas-specific problem.
>
> The ticket is almost incomprehensible (can someone fix those html junk
> characters?)
>
>  Cheers,
>  Daniela
>
>
>
>
>
> On 7 May 2013 10:29, Alessandra Forti <[log in to unmask]> wrote:
>
>>  Hi Daniela,
>>
>> Rod updated the ticket with a question yesterday. You might want to
>> subscribe to the ticket.
>>
>> cheers
>> alessandra
>>
>>
>> On 07/05/2013 10:25, Daniela Bauer wrote:
>>
>>   Hi,
>>
>>  Is there any input from Atlas on this?
>>
>>  Cheers,
>>  Daniela
>>
>>
>> On 6 May 2013 09:54, Rodney Walker <[log in to unmask]> wrote:
>>
>>>  Hi,
>>>  I know the LRZ fix, but it is not nice. I will add something useful to
>>> the ticket.
>>>
>>> Cheers,
>>>  Rod.
>>>
>>>
>>>
>>> On 6 May 2013 10:23, Paco Bernabe <[log in to unmask]> wrote:
>>>
>>>>  Hi everybody,
>>>>
>>>> A GGUS ticket (https://ggus.eu/ws/ticket_info.php?ticket=93242) was
>>>> opened against our site (SARA-MATRIX) regarding a known issue with atlas
>>>> analysis jobs landing on SL6 WNs, described in the wiki (
>>>> https://twiki.cern.ch/twiki/bin/view/Atlas/RPMCompatSLC6#compiling_packages_against_SLC5).
>>>> When upgrading our WNs to *CentOS6*, we read the whole wiki and
>>>> performed the necessary actions to avoid issues, but this code we cannot
>>>> change (meaning that it is not a site responsibility):
>>>>
>>>> 	# for building slc5 binaries on slc6 host
>>>> 	macro CppSpecificFlags "" \
>>>> 	    ATLAS&host-slc6&target-slc5 " -D__USE_XOPEN2K8 "
>>>>
>>>>
>>>> So it looks like the macro doesn't get defined for *CentOS6* sites,
>>>> only for slc6 sites (at least not at our site). Now, the user that
>>>> opened the ticket sent us a link to another GGUS ticket (
>>>> https://ggus.eu/ws/ticket_info.php?ticket=93271) of a site (LRZ-LMU)
>>>> that had the same problem and found a workaround; this site has
>>>> installed SLES11 SP2, which is compatible with SL6. If anybody from
>>>> LRZ-LMU gets this email, could you let us know about that workaround?
>>>> Is there any other site that supports ATLAS on SL6 (or compatible) and
>>>> also has this issue? Please, let us know.
>>>>
>>>> --
>>>>
>>>> Met vriendelijke groeten / Best regards,
>>>>
>>>> *Paco Bernabe*
>>>>
>>>> | Systemsprogrammer | SURFsara | Science Park 140 | 1098XG Amsterdam |
>>>> T +31 610 961 785 | [log in to unmask] |
>>>> www.surfsara.nl |
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>  --
>>> Tel. +49 89 289 14152
>>>
>>
>>
>>
>> --
>> Sent from the pit of despair
>>
>> -----------------------------------------------------------
>> [log in to unmask]
>> HEP Group/Physics Dep
>> Imperial College
>> London, SW7 2BW
>> Tel: +44-(0)20-75947810
>> http://www.hep.ph.ic.ac.uk/~dbauer/
>>
>>
>>
>>   --
>> Facts aren't facts if they come from the wrong people. (Paul Krugman)
>>
>>
>
>
> --
> Sent from the pit of despair
>
> -----------------------------------------------------------
> [log in to unmask]
> HEP Group/Physics Dep
> Imperial College
> London, SW7 2BW
> Tel: +44-(0)20-75947810
> http://www.hep.ph.ic.ac.uk/~dbauer/
>
>
>
> --
> Facts aren't facts if they come from the wrong people. (Paul Krugman)
>
>
>
> --
> Facts aren't facts if they come from the wrong people. (Paul Krugman)
>
>


-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

                </div>
                <span><font color=3D"#888888">
                    <pre cols=3D"72">--=20
Facts aren&#39;t facts if they come from the wrong people. (Paul Krugman)
</pre>
                  </font></span></div>
            </blockquote>
          </div>
          <br>
          <br clear=3D"all">
          <br>
          -- <br>
          <div dir=3D"ltr">Sent from the pit of despair<br>
            <br>
            -----------------------------------------------------------<br>
            <a href=3D"mailto:[log in to unmask]" target=3D"_blan=
k">[log in to unmask]</a><br>
            HEP Group/Physics Dep<br>
            Imperial College<br>
            London, SW7 2BW<br>
            Tel: <a href=3D"tel:%2B44-%280%2920-75947810" value=3D"+4420759=
47810" target=3D"_blank">+44-(0)20-75947810</a><br>
            <a href=3D"http://www.hep.ph.ic.ac.uk/%7Edbauer/" target=3D"_bl=
ank">http://www.hep.ph.ic.ac.uk/~dbauer/</a></div>
        </div>
      </blockquote>
      <br>
      <br>
      <pre cols=3D"72">--=20
Facts aren&#39;t facts if they come from the wrong people. (Paul Krugman)
</pre>
    </blockquote>
    <br>
    <br>
    <pre cols=3D"72">--=20
Facts aren&#39;t facts if they come from the wrong people. (Paul Krugman)
</pre>
  </div></div></div>

</blockquote></div><br><br clear=3D"all"><br>-- <br><div dir=3D"ltr">Sent f=
rom the pit of despair<br><br>---------------------------------------------=
--------------<br><a href=3D"mailto:[log in to unmask]" target=3D=
"_blank">[log in to unmask]</a><br>

HEP Group/Physics Dep<br>Imperial College<br>London, SW7 2BW<br>Tel: +44-(0=
)20-75947810<br><a href=3D"http://www.hep.ph.ic.ac.uk/~dbauer/" target=3D"_=
blank">http://www.hep.ph.ic.ac.uk/~dbauer/</a></div>
</div>

--bcaec50164a19a63a604dc1dc0ce--
=========================================================================
Date:         Wed, 8 May 2013 13:22:58 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Pavel Weber <[log in to unmask]>
Subject:      11th International GridKa School: Big Data,
              Clouds and Grids  - Registration open
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------090209060906090703010400"
Message-ID:  <[log in to unmask]>

--------------090209060906090703010400
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 8bit

Dear all,

The Karlsruhe Institute of Technology (KIT) would like to announce the 
11th international summer school "GridKa School 2013: Big Data, Clouds 
and Grids", one of the leading summer schools for advanced computing 
techniques in Europe. The school provides a forum for scientists and 
technology leaders, experts and novices to facilitate knowledge sharing 
and information exchange. The target audience includes grid and cloud 
newcomers, advanced users, site administrators, and graduate and PhD 
students in computing and science disciplines.

GridKa School is hosted by Steinbuch Centre for Computing (SCC) of 
Karlsruhe Institute of Technology (KIT). It is organized by KIT and the 
HGF Alliance "Physics at the Terascale". This year the 11th school will 
be held in Karlsruhe, Germany from August 26 to 30, 2013.

The technical program of the school comprises plenary talks and hands-on 
tutorials on the following topics:

  * Large scale data management
  * Multi-core computing
  * Cloud services and applications
  * Cloud installation and administration
  * Effective programming
  * GPU computing
  * Storage technologies
  * Grid middleware administration
  * Security and incident management


The full agenda of the school is available at: 
https://indico.scc.kit.edu/indico/conferenceTimeTable.py?confId=26#all.detailed

Half of the school consists of expert talks, which cover the fundamental 
and theoretical aspects of the topics, while the other half consists of 
hands-on sessions and workshops, which give participants an excellent 
chance to gain practical experience with the techniques and tools.

Security is a key aspect of any modern computer technology. The security 
track remains one of the main topics of the school again this year. The 
team of security experts will present a broad program on security, 
starting with an introductory talk on Monday covering the most common 
threats to computer security in grid and cloud computing environments, 
followed by a 3-day workshop. During the workshop, participants will use 
a virtual environment simulating grid sites to learn how to respond to 
security incidents and perform forensic analysis of simulated incidents 
in real time.

The program is completed by two social events (a 'tarte flambee' evening 
at the German WLCG Tier-1 center GridKa and the school dinner in the 
city center). Detailed information and the registration form can be found at 
http://gridka-school.scc.kit.edu. Please distribute this announcement 
among your colleagues and friends.

Looking forward to welcoming you in Karlsruhe.

Pavel Weber
(on behalf of the local organizing team)

P.S. Apologies for multiple postings.

-- 
Karlsruhe Institute of Technology (KIT)
Steinbuch Centre for Computing (SCC)

Dr. Pavel Weber

Hermann-von-Helmholtz-Platz 1,
D-76344 Eggenstein-Leopoldshafen

Phone: +49 721 608-28621
email: [log in to unmask]
Web: www.kit.edu

KIT -- University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association



--------------090209060906090703010400
Content-Type: text/html; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit

<html>
  <head>
    <meta http-equiv="content-type" content="text/html;
      charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Dear
      all, <br>
      <br>
      The Karlsruhe Institute of Technology (KIT) would like to announce
      the 11th international summer school "GridKa School 2013: Big
      Data, Clouds and Grids", one of the leading summer schools for
      advanced computing techniques in Europe. The school provides a
      forum for scientists and technology leaders, experts and novices
      to facilitate knowledge sharing and information exchange. The
      target audience includes grid and cloud newcomers, advanced
      users, site administrators, and graduate and PhD students in
      computing and science disciplines. <br>
    </p>
    <p class="MsoNormal"
      style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">GridKa
      School is hosted by Steinbuch Centre for Computing (SCC) of
      Karlsruhe Institute of Technology (KIT). It is organized by KIT
      and the HGF Alliance "Physics at the Terascale". This year the
      11th school will be held in Karlsruhe, Germany from August 26 to
      30, 2013. <br>
      <br>
      The technical program of the school comprises plenary talks and
      hands-on tutorials on the following topics: <o:p></o:p></p>
    <ul type="disc">
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3"> Large scale data management <o:p></o:p></li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3"> Multi-core computing <br>
      </li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3">Cloud services and applications</li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3">Cloud installation and administration<br>
      </li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3"> Effective programming</li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3">GPU computing</li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3"> Storage technologies</li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3">Grid middleware administration</li>
      <li class="MsoNormal"
        style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0
        level1 lfo3"> <o:p></o:p>Security and incident management</li>
      <br>
      <br>
    </ul>
    <p>The full agenda of the school is available at:
      <a class="moz-txt-link-freetext"
href="https://indico.scc.kit.edu/indico/conferenceTimeTable.py?confId=26#all.detailed">https://indico.scc.kit.edu/indico/conferenceTimeTable.py?confId=26#all.detailed</a></p>
    <p>Half of the school consists of expert talks, which cover the
      fundamental and theoretical aspects of the topics, while the
      other half consists of hands-on sessions and workshops, which
      give participants an excellent chance to gain practical
      experience with the techniques and tools. <o:p></o:p></p>
    <p>Security is a key aspect of any modern computer technology. The
      security track remains one of the main topics of the school again
      this year. The team of security experts will present a broad
      program on security, starting with an introductory talk on Monday
      covering the most common threats to computer security in grid and
      cloud computing environments, followed by a 3-day workshop.
      During the workshop, participants will use a virtual environment
      simulating grid sites to learn how to respond to security
      incidents and perform forensic analysis of simulated incidents in
      real time. <o:p></o:p></p>
    <p>The program is completed by two social events ('tarte flambee'
      evening at the German WLCG Tier-1 center GridKa and the school
      dinner in the city center). Detailed information and registration
      form can be found at <a moz-do-not-send="true"
        href="http://gridka-school.scc.kit.edu">http://gridka-school.scc.kit.edu</a>.
      Please distribute this announcement among your colleagues and
      friends. <o:p> </o:p></p>
    Looking forward to welcoming you in Karlsruhe. <br>
    <br>
    Pavel Weber <br>
    (on behalf of the local organizing team) <br>
    <br>
    P.S. Apologies for multiple postings.<br>
    <br>
    <pre class="moz-signature" cols="72">-- 
Karlsruhe Institute of Technology (KIT)
Steinbuch Centre for Computing (SCC)

Dr. Pavel Weber

Hermann-von-Helmholtz-Platz 1,
D-76344 Eggenstein-Leopoldshafen

Phone: +49 721 608-28621
email: <a class="moz-txt-link-abbreviated" href="mailto:[log in to unmask]">[log in to unmask]</a>
Web: <a class="moz-txt-link-abbreviated" href="http://www.kit.edu">www.kit.edu</a>

KIT &#8211; University of the State of Baden-W&uuml;rttemberg and National Research Center of the Helmholtz Association</pre>
    <br>
  </body>
</html>

--------------090209060906090703010400--
=========================================================================
Date:         Wed, 8 May 2013 18:09:01 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Antonio Delgado Peris <[log in to unmask]>
Organization: CERN
Subject:      Re: Multiple Argus endpoints for glexec
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-15"; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Hi again,

> Never tried linux-HA so far, but it sounds like a nice occasion to do=20
> it  :-)=20

I tried this (master/slave) and found it somewhat more complicated than
'trivial' :-)

Anyway, the setup is now working (apparently fine).

Now, since this took a bit of effort and I had already produced some
documentation for internal consumption, I decided to expand it a bit and
created a small guide on how to configure Argus in HA master/slave mode
on SL6, in case it is interesting to other people.

Here it is:
    http://wwwae.ciemat.es/~delgadop/argus/

Hope it helps somebody :-)

Cheers,
     Antonio



On 04/12/2013 11:19 AM, Antonio Delgado Peris wrote:
> Hi,
>
>>> Regarding the DNS round-robin approach, that should be fine for load=20
>>> balancing
>>> (if required) but it's not enough for fail-over, unless we assume=20
>>> that all
>>> clients using glexec retry on failure. I don't know if that's the=20
>>> case. If it
>>> was not, we would need an automatic DNS reconfiguration based on=20
>>> some test
>>> result. That's not always so easy to set...
>> If you cannot dynamically tweak DNS (common), then the obvious solution
>> would be an active-passive fail-over using heartbeat from linux-HA for
>> the Argus PEP service. That's available by default in EPEL for EL5=20
>> and EL6
>> and relatively easy to set up. I think for Argus it would likely be=20
>> best to
>> use active-passive fail-over, but maybe you can try=20
>> active-active+RRDNS as
>> well ;-) Active-active+RRDNS will only work for stateless services,=20
>> obviously.
>> And heartbeat is quite simple to set up.
>
> Never tried linux-HA so far, but it sounds like a nice occasion to do=20
> it  :-)
>
> Active-passive may be good enough for us, I think. As for=20
> active+active+RRDNS, we would be sharing gridmap-dir between the=20
> servers and for the rest I think the servers are really stateless.=20
> However, I guess there might be a race condition if two pilots ask for=20
> a new mapping at barely the same time and the two Argus servers end up=20
> adding different entries for the same FQAN... Unless we see the load=20
> on Argus becomes an evident problem, I think we'll stay on the safe=20
> side and try the active-passive setup.
>
> Thanks!
>
> Cheers,
>     Antonio
>
>
>>
>>     Cheers,
>>     DavidG.
>>
>>> Well, at least now we know what's the situation on the glexec client=20
>>> side.
>>> We'll think how to proceed from here.
>>>
>>> Thanks again.
>>>
>>> Cheers,
>>>     Antonio
>>>
>>>
>>>> Hi Antonio,
>>>>
>>>> the short answer is, it doesn't work and is a documentation problem.
>>>> I have just updated the two pages you are referring to. Note that you
>>>> referred to an outdated Nikhef wiki page, see the first line...
>>>>
>>>> We started implementing it, but this has never been completed as it
>>>> turned out that all existing use cases could be solved in a different
>>>> way (server side, e.g. round-robin DNS etc.) and that the argus pep-c
>>>> client library also does not have the support.
>>>>
>>>> Btw, we are all in different conferences at the moment and=20
>>>> therefore not
>>>> very communicative.
>>>>
>>>>       Cheers,
>>>>       Mischa
>>>>
>>>> On Wed, Apr 10, 2013 at 07:35:08PM +0200, Antonio Delgado Peris wrote:
>>>>> Hi all,
>>>>>
>>>>> Several documentation pieces state that glexec can be configured
>>>>> with several Argus endpoints. E.g. in
>>>>> https://twiki.cern.ch/twiki/bin/view/LCG/Site-info_configuration_vari=
ables#ARGUS,=20
>>>>>
>>>>>
>>>>> it says:
>>>>>
>>>>>       ARGUS_PEPD_ENDPOINTS     If glexec is a PEP client, define the
>>>>> PEPD endpoints with this variable. It is a whitespace separated list
>>>>> of URLs, e.g. https://argus1.example.com:8154/authz
>>>>>
>>>>>
>>>>> Also, in http://wiki.nikhef.nl/grid/Set_up_gLExec_for_Argus, one=20
>>>>> can read:
>>>>>
>>>>> ARGUS_PEPD_ENDPOINTS=3D"https://argus1.example.com:8154/authz
>>>>> https://argus2.example.com:8145/authz"
>>>>>
>>>>>       In this example the site has two service endpoints; the quotes
>>>>> are necessary as this is interpreted shell code. Multiple endpoints
>>>>> may be defined for scale; the pep-c plug-in will randomly choose one
>>>>> endpoint to talk to, and automatically fail-over to the others. (??
>>>>> Verify please)
>>>>>
>>>>>
>>>>> The "(?? Verify please)" part makes me quite suspicious. And, in
>>>>> fact, I have tried it and this round-robin/failover does not seem to
>>>>> work.
>>>>>
>>>>> Instead of using YAIM, I have taken over its job and manually modified
>>>>> /etc/lcmaps/lcmaps-glexec.db and, in the "pepc" part, added a couple
>>>>> of lines like the following:
>>>>>
>>>>>             "--pep-daemon-endpoint-url=20
>>>>> https://gaergus.ciemat.es:8154/authz"
>>>>>             "--pep-daemon-endpoint-url
>>>>> https://gaergus02.ciemat.es:8154/authz"
>>>>>
>>>>>
>>>>> But it does not work. If both argus servers work fine, glexec will
>>>>> ask only one of them. If this one fails, glexec will not fail-over,
>>>>> it will just return an error. If I enable debug info, I can see that
>>>>> the second time "--pep-daemon-endpoint-url" is parsed, the client
>>>>> complains with:
>>>>>
>>>>>       2013-04-10 19:08:55 DEBUG: pep_setoption: PEP#0
>>>>> option_endpoint_url already set to
>>>>> 'https://gaergus02.ciemat.es:8154/authz', freeing...
>>>>>
>>>>> So... is this known? Has anyone out there achieved that glexec works
>>>>> with multiple Argus endpoints? If so, how?
>>>>>
>>>>>
>>>>> BTW, I am using:
>>>>>
>>>>> Client:
>>>>>       lcmaps-1.5.5-1.el6.x86_64
>>>>>       glexec-0.9.6-1.el6.x86_64
>>>>>       argus-pep-api-c-2.1.0-3.sl6.x86_64
>>>>>       lcmaps-plugins-c-pep-1.2.2-1.el6.x86_64
>>>>>
>>>>> Server:
>>>>>       argus-pap-1.5.1-1.el6.noarch
>>>>>       argus-pepcli-2.1.0-2.sl6.x86_64
>>>>>       emi-argus-1.5.0-1.sl6.x86_64
>>>>>       argus-pdp-pep-common-1.3.1-1.sl6.noarch
>>>>>       argus-pep-common-2.2.0-1.sl6.noarch
>>>>>       argus-pdp-1.5.1-2.sl6.noarch
>>>>>       argus-pep-api-c-2.1.0-3.sl6.x86_64
>>>>>       argus-pep-server-1.5.1-2.sl6.noarch
>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Cheers,
>>>>>      Antonio
>>
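The behaviour discussed above — trying each configured endpoint in turn and failing over to the next one, which the thread concludes the pep-c client does not implement — can be sketched as follows. This is a hypothetical illustration only: the endpoint URLs are placeholders, and `probe` stands in for a real authorization call (here it simply pretends only the second endpoint is alive).

```shell
#!/bin/sh
# Hypothetical sketch of client-side fail-over across several Argus
# endpoints. Not the actual pep-c implementation.

probe() {
    # Assumption for illustration: only the second endpoint answers.
    [ "$1" = "https://argus2.example.com:8154/authz" ]
}

authorize() {
    # Try each endpoint in order; stop at the first that answers.
    for url in "$@"; do
        if probe "$url"; then
            echo "authorized via $url"
            return 0
        fi
    done
    echo "all endpoints failed" >&2
    return 1
}

authorize "https://argus1.example.com:8154/authz" \
          "https://argus2.example.com:8154/authz"
```

With a real transport in place of `probe`, the same loop gives both random-order load spreading (by shuffling the list) and fail-over, which is why the thread steers toward solving this server-side instead.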
=========================================================================
Date:         Mon, 13 May 2013 07:52:19 +0300
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Felix Farcas <[log in to unmask]>
Subject:      unable to run jobs on emi3
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms010401030003030101030805"
Message-ID:  <[log in to unmask]>

This is a cryptographically signed message in MIME format.

--------------ms010401030003030101030805
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

Hello

I have installed EMI 3 on the CREAM CE and the worker node.

When a job comes to CREAM it stays in the waiting queue. It tries to run,
but on the worker node I get the following message.

May 13 07:54:50 cn-wn03 pbs_mom: LOG_ERROR::req_cpyfile, Unable to copy file [log in to unmask]:/var/cream_sandbox/ifops/CN_Serban_Constantinescu_CN_383516_CN_serban_OU_Users_OU_Organic_Units_DC_cern_DC_ch_ifops_Role_NULL_Capability_NULL_ifops06/53/CREAM530146720/CREAM530146720_jobWrapper.sh to CREAM530146720_jobWrapper.sh.19505.4776.1368181657
May 13 07:54:50 cn-wn03 pbs_mom: LOG_ERROR::req_cpyfile, #012#012Unable to copy file [log in to unmask]:/var/cream_sandbox/ifops/CN_Serban_Constantinescu_CN_383516_CN_serban_OU_Users_OU_Organic_Units_DC_cern_DC_ch_ifops_Role_NULL_Capability_NULL_ifops06/53/CREAM530146720/CREAM530146720_jobWrapper.sh to CREAM530146720_jobWrapper.sh.19505.4776.1368181657#012*** error from copy#012Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password,hostbased).#015#012*** end error output

Thank you

Felix

--=20
Dr. Ing. Farcas Felix
National Institute of Research and Development
of Isotopic and Molecular Technology,
IT - Department - Cluj-Napoca, Romania
yahoo id: felixfarcas
skype id: felix.farcas
mobile: +40-742-195323



--------------ms010401030003030101030805
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: S/MIME Cryptographic Signature

[base64-encoded S/MIME signature omitted]

--------------ms010401030003030101030805--
=========================================================================
Date:         Mon, 13 May 2013 08:37:49 +0300
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Felix Farcas <[log in to unmask]>
Subject:      Re: unable to run jobs on emi3
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms030703060208060304020608"
Message-ID:  <[log in to unmask]>

This is a cryptographically signed message in MIME format.

--------------ms030703060208060304020608
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

Now the jobs are running, but they are staying in run mode forever.

Is there a way to see where the job is hanging?

Thank you

Felix

On 5/13/2013 7:52 AM, Felix Farcas wrote:
> Hello
>
> I have installed EMI 3 on the CREAM CE and the worker node.
>
> When a job comes to CREAM it stays in the waiting queue. It tries to
> run, but on the worker node I get the following message.
>
> May 13 07:54:50 cn-wn03 pbs_mom: LOG_ERROR::req_cpyfile, Unable to=20
> copy file=20
> [log in to unmask]:/var/cream_sandbox/ifops/CN_Serban_Constantin=
escu_CN_383516_CN_serban_OU_Users_OU_Organic_Units_DC_cern_DC_ch_ifops_Ro=
le_NULL_Capability_NULL_ifops06/53/CREAM530146720/CREAM530146720_jobWrapp=
er.sh=20
> to CREAM530146720_jobWrapper.sh.19505.4776.1368181657
> May 13 07:54:50 cn-wn03 pbs_mom: LOG_ERROR::req_cpyfile,=20
> #012#012Unable to copy file=20
> [log in to unmask]:/var/cream_sandbox/ifops/CN_Serban_Constantin=
escu_CN_383516_CN_serban_OU_Users_OU_Organic_Units_DC_cern_DC_ch_ifops_Ro=
le_NULL_Capability_NULL_ifops06/53/CREAM530146720/CREAM530146720_jobWrapp=
er.sh=20
> to CREAM530146720_jobWrapper.sh.19505.4776.1368181657#012*** error=20
> from copy#012Permission denied=20
> (publickey,gssapi-keyex,gssapi-with-mic,password,hostbased).#015#012***=
 end=20
> error output
>
> Thank you
>
> Felix
>


--=20
Dr. Ing. Farcas Felix
National Institute of Research and Development
of Isotopic and Molecular Technology,
IT - Department - Cluj-Napoca, Romania
yahoo id: felixfarcas
skype id: felix.farcas
mobile: +40-742-195323



--------------ms030703060208060304020608
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: S/MIME Cryptographic Signature

[base64-encoded S/MIME signature omitted]

--------------ms030703060208060304020608--
=========================================================================
Date:         Mon, 13 May 2013 08:01:38 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jean-Michel Barbet <[log in to unmask]>
Subject:      Re: unable to run jobs on emi3
Comments: cc: Felix Farcas <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

On 05/13/2013 07:37 AM, Felix Farcas wrote:
> Now the jobs are running, but they are staying in run mode forever.
>
> Is there a way to see where the job is hanging?

Hello Felix,

The problem is with ssh host-based authentication when the job tries
to get its input files from the CREAM CE.

=> Check that from the ifops06 account (and others) on a worker node
    you are able to access the CREAM CE without a password, and fix it
    if not.

See the scripts
/etc/cron.d/edg-pbs-knownhosts and /etc/cron.d/edg-pbs-shostsequiv
on the CREAM CE, as they help set up host-based authentication.
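The suggested check can be sketched as a small dry-run script. The CE hostname is a placeholder (assumption — substitute your own CREAM CE), and `BatchMode=yes` makes ssh exit with an error instead of prompting for a password, so a non-zero exit code immediately reveals a broken host-based setup.

```shell
#!/bin/sh
# Dry-run sketch of the check above: build (and print) the ssh command
# to run as a pool account on a worker node. Hostname and account are
# placeholders, not real site values.
CE_HOST="cream.example.org"
POOL_ACCOUNT="ifops06"

# BatchMode=yes: fail instead of prompting; hostbased auth is what the
# edg-pbs-knownhosts / edg-pbs-shostsequiv cron scripts maintain.
CHECK="ssh -o BatchMode=yes -o PreferredAuthentications=hostbased ${CE_HOST} hostname"
echo "run as ${POOL_ACCOUNT} on the worker node: ${CHECK}"
```

If the command prompts for a password or fails, the known-hosts or shosts.equiv files on the CE are the first place to look.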

JM


-- 
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: [log in to unmask]
------------------------------------------------------------------------
=========================================================================
Date:         Thu, 16 May 2013 16:33:24 +0500
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Fawad Saeed <[log in to unmask]>
Subject:      errors in apel publisher-- "java.lang.OutOfMemoryError: GC
              overhead limit exceeded"
In-Reply-To:  <[log in to unmask]>
Content-Type: multipart/alternative;
              boundary="_000_F92619F27B4E4645B83CC1BD68C0B8BA01E7448D71C0mailncpedup_"
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

--_000_F92619F27B4E4645B83CC1BD68C0B8BA01E7448D71C0mailncpedup_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Dear All,

The apel publisher shows the following exception in the apel log. As a
result, our CEs have not been able to publish accounting results for 76
days.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Thu May 16 09:43:18 UTC 2013: apel-publisher - Server Record Count: Record/s found site NCP-LCG2: 0

Thu May 16 09:43:18 UTC 2013: apel-publisher - Detected missing records, republishing data starting from: 2013-02-28 21:52:02

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at com.mysql.jdbc.SingleByteCharsetConverter.toString(SingleByteCharsetConverter.java:330)
        at com.mysql.jdbc.ResultSetRow.getString(ResultSetRow.java:797)
        at com.mysql.jdbc.ByteArrayRow.getString(ByteArrayRow.java:72)
        at com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5700)
        at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5577)
        at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5617)
        at org.glite.apel.core.db.MySQLImpl.convertToAccounting(Unknown Source)
        at org.glite.apel.core.db.MySQLImpl.getAccountingRecords(Unknown Source)
        at org.glite.apel.publisher.AccountManager.publishRecords(Unknown Source)
        at org.glite.apel.publisher.AccountManager.chkArchivedTuples(Unknown Source)
        at org.glite.apel.publisher.AccountManager.run(Unknown Source)
        at org.glite.apel.publisher.ApelPublisher.runJoinProcessor(Unknown Source)
        at org.glite.apel.publisher.ApelPublisher.run(Unknown Source)
        at org.glite.apel.publisher.ApelPublisher.main(Unknown Source)

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Any idea what is the problem behind these exceptions?

Thanks in advance.

Regards

Fawad Saeed

NCP-LCG2

________________________________
Disclaimer: This email and any attachments may contain confidential
material and are solely for the use of the intended recipient(s). If you
have received this email in error, please notify the sender immediately
and delete this email. If you are not the intended recipient(s), you must
not use, retain or disclose any information contained in this email. Any
views or opinions are solely those of the sender and do not necessarily
represent those of the National Centre for Physics (NCP). NCP does not
accept responsibility for any errors or omissions in this message, or any
attachment, that have arisen as a result of email transmission.

--_000_F92619F27B4E4645B83CC1BD68C0B8BA01E7448D71C0mailncpedup_--
=========================================================================
Date:         Thu, 16 May 2013 12:43:31 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Daniela Bauer <[log in to unmask]>
Subject:      Re: errors in apel publisher-- "java.lang.OutOfMemoryError: GC
              overhead limit exceeded"
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d043c7e305c69fb04dcd4619b
Message-ID:  <[log in to unmask]>

--f46d043c7e305c69fb04dcd4619b
Content-Type: text/plain; charset=ISO-8859-1

Hi,

try looking in
/etc/glite-apel-publisher/publisher-config-yaim.xml for something like:

   <!-- Number of records selected each time
         Modify this value if OutOfMemory error
         appears in the Publisher
         approx 150000 records per 512Mb memory -->
    <Limit>100000</Limit>

and modify accordingly.
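The comment's rule of thumb (roughly 150000 records per 512 MB of heap) can be turned into a quick back-of-the-envelope calculation; the helper below is only an illustration, not part of APEL:

```python
# Suggest a <Limit> value from the JVM heap size, using the rule of thumb
# in the config comment (~150000 records per 512 MB), with some headroom
# so the publisher does not run right at the edge of the heap.
def suggested_limit(heap_mb, records_per_512mb=150000, headroom=0.8):
    return int(heap_mb / 512 * records_per_512mb * headroom)

print(suggested_limit(512))   # for a 512 MB heap
print(suggested_limit(1024))  # for a 1 GB heap
```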

Given that you are missing several months' worth of data, you might also
want to try the gap publisher.

Cheers,
Daniela



On 16 May 2013 12:33, Fawad Saeed <[log in to unmask]> wrote:

>  Dear All,
>
> The apel publisher shows the following exception under apel log. Thus, our
> CEs were not able to publish the accounting results since 76 days.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Thu May 16 09:43:18 UTC 2013: apel-publisher - Server Record Count:
> Record/s found site NCP-LCG2: 0
>
> Thu May 16 09:43:18 UTC 2013: apel-publisher - Detected missing records,
> republishing data starting from: 2013-02-28 21:52:02
>
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
> exceeded
>
>         at com.mysql.jdbc.SingleByteCharsetConverter.toString(SingleByteCharsetConverter.java:330)
>         at com.mysql.jdbc.ResultSetRow.getString(ResultSetRow.java:797)
>         at com.mysql.jdbc.ByteArrayRow.getString(ByteArrayRow.java:72)
>         at com.mysql.jdbc.ResultSetImpl.getStringInternal(ResultSetImpl.java:5700)
>         at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5577)
>         at com.mysql.jdbc.ResultSetImpl.getString(ResultSetImpl.java:5617)
>         at org.glite.apel.core.db.MySQLImpl.convertToAccounting(Unknown Source)
>         at org.glite.apel.core.db.MySQLImpl.getAccountingRecords(Unknown Source)
>         at org.glite.apel.publisher.AccountManager.publishRecords(Unknown Source)
>         at org.glite.apel.publisher.AccountManager.chkArchivedTuples(Unknown Source)
>         at org.glite.apel.publisher.AccountManager.run(Unknown Source)
>         at org.glite.apel.publisher.ApelPublisher.runJoinProcessor(Unknown Source)
>         at org.glite.apel.publisher.ApelPublisher.run(Unknown Source)
>         at org.glite.apel.publisher.ApelPublisher.main(Unknown Source)
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Any idea what is the problem behind these exceptions?
>
> Thanks in advance.
>
> Regards
>
> Fawad Saeed
>
> NCP-LCG2
> ------------------------------
> Disclaimer: This email and any attachments may contain confidential
> material and is solely for the use of the intended recipient(s). If you
> have received this email in error, please notify the sender immediately and
> delete this email. If you are not the intended recipient(s), you must not
> use, retain or disclose any information contained in this email. Any views
> or opinions are solely those of the sender and do not necessarily represent
> those of National Centre for Physics (NCP). NCP does accept responsibility
> for any errors or omissions that are present in the message, or any
> attachment, that have arisen as a result of email transmission.
>



-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/

--f46d043c7e305c69fb04dcd4619b--
=========================================================================
Date:         Fri, 17 May 2013 13:59:32 +0300
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Felix Farcas <[log in to unmask]>
Subject:      Authorization error: glexec error: [gLExec]: LCAS failed
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms040601080501010305030106"
Message-ID:  <[log in to unmask]>

This is a cryptographically signed message in MIME format.

--------------ms040601080501010305030106
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

Hello,

I have SL 6.4 and EMI-3 installed on the machine creamce.itim-cj.ro at
the site RO-14-ITIM.

Only beginning today I get the following error; yesterday everything
was fine.

The software installed is:
glexec-0.9.8-1.el6.x86_64
lcas-1.3.19-2.el6.x86_64
lcas-lcmaps-gt4-interface-0.2.6-1.el6.x86_64
lcas-plugins-check-executable-1.2.4-2.el6.x86_64
lcas-plugins-voms-1.3.11-1.el6.x86_64
lcas-plugins-basic-1.3.6-2.el6.x86_64

On my worker node and on the CREAM CE the glexec and lcas logs are
empty, and anywhere else I look I see nothing about failing jobs.

Testing from: nagios.grid.ici.ro
DN: /DC=RO/DC=RomanianGRID/O=ICI/CN=Alexandru Stanciu/CN=proxy/CN=proxy/CN=proxy/CN=proxy
VOMS FQANs: /ops/Role=NULL/Capability=NULL, /ops/NGI/Role=NULL/Capability=NULL, /ops/NGI/Romania/Role=NULL/Capability=NULL
Discovered endpoints:
ecream.itim-cj.ro:8443/cream-pbs-ops
Endpoint to be used: ecream.itim-cj.ro:8443/cream-pbs-ops
Job submission failed.
2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization error: Failed to get the local user id via glexec: glexec error: [gLExec]: LCAS failed. The reason can be found in the syslog.

Where may I look?
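Since the error message says the reason lands in syslog, a first step is to grep /var/log/messages on the CE and worker node for LCAS/gLExec entries, e.g. `grep -iE 'lcas|glexec' /var/log/messages`. A tiny sketch of that filter (the sample lines below are invented placeholders, not real gLExec output):

```python
# Filter syslog lines that mention LCAS or gLExec (case-insensitive).
# NOTE: the sample text is a made-up placeholder, not output from a real node.
sample = """May 17 12:57:20 creamce sshd[100]: placeholder unrelated line
May 17 12:57:21 creamce placeholder[101]: LCAS failed (placeholder)
"""

def lcas_lines(text):
    # Keep only lines mentioning either keyword, ignoring case.
    return [line for line in text.splitlines()
            if "lcas" in line.lower() or "glexec" in line.lower()]

print(lcas_lines(sample))
```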

Thank you
Felix

-- 
Dr. Ing. Farcas Felix
National Institute of Research and Development
of Isotopic and Molecular Technology,
IT - Department - Cluj-Napoca, Romania
yahoo id: felixfarcas
skype id: felix.farcas
mobile: +40-742-195323



--------------ms040601080501010305030106
Content-Type: application/pkcs7-signature; name="smime.p7s"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="smime.p7s"
Content-Description: S/MIME Cryptographic Signature


--------------ms040601080501010305030106--
=========================================================================
Date:         Fri, 17 May 2013 04:01:05 -0700
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Muhammad Waqar <[log in to unmask]>
Subject:      BrokerHelper: no compatible resources
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="716127369-1937839991-1368788465=:6472"
Message-ID:  <[log in to unmask]>

--716127369-1937839991-1368788465=:6472
Content-Type: text/plain; charset=us-ascii

Dear All,

Nagios core issued a ticket with the following information:

""
CRITICAL: [Waiting->Cancelled [timeout/dropped]] 'BrokerHelper: no 
compatible resources'. 
https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw
CRITICAL:
 [Waiting->Cancelled [timeout/dropped]] 'BrokerHelper: no compatible 
resources'. 
https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw

Testing from: rocnagios
DN: /C=TW/O=AS/OU=GRID/CN=Liaw SyueYi 182693/CN=proxy/CN=proxy/CN=proxy/CN=proxy
VOMS
 FQANs: /ops/Role=lcgadmin/Capability=NULL, 
/ops/ROC/Role=NULL/Capability=NULL, 
/ops/ROC/AsiaPacific/Role=NULL/Capability=NULL, 
/ops/Role=NULL/Capability=NULL
glite-wms-job-status https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw
Current Status:     Waiting 
Status Reason:      BrokerHelper: no compatible resources
Submitted:          Fri May 17 08:52:19 2013 UTC""

Any idea what is the reason and how to fix this problem?

Regards 
M.Waqar
Pk-CIIT

--716127369-1937839991-1368788465=:6472--
=========================================================================
Date:         Fri, 17 May 2013 11:26:03 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Burke <[log in to unmask]>
Subject:      Re: BrokerHelper: no compatible resources
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Muhammad Waqar said:
> Any idea what is the reason and how to fix this problem?

Your site BDII is not publishing any information related to a CE (or SE for that matter), all it has is a VOBOX and the site BDII itself.

ldapsearch -x -h cern-56-24-243.comsats.edu.pk -p 2170 -b mds-vo-name=PK-CIIT,o=grid objectclass=GlueCE
# extended LDIF
#
# LDAPv3
# base <mds-vo-name=PK-CIIT,o=grid> with scope subtree
# filter: objectclass=GlueCE
# requesting: ALL
#

# search result
search: 2
result: 0 Success

# numResponses: 1

Stephen
-- 
Scanned by iCritical.
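For convenience, Stephen's check can be wrapped in a small script. This is a hedged sketch in shell: the `count_ce_entries` helper is illustrative (not part of the thread), and the commented ldapsearch invocation simply repeats the command shown above.

```shell
#!/bin/sh
# Count the entries a BDII returns for a GlueCE query.
# Each published CE shows up as a "dn:" line in the LDIF output;
# zero entries means the BDII advertises no CE, as in this thread.
count_ce_entries() {
    grep -c '^dn:' || true   # grep exits 1 on zero matches; keep going
}

# Real use would pipe ldapsearch into the helper, e.g. (from the thread):
#   ldapsearch -x -h cern-56-24-243.comsats.edu.pk -p 2170 \
#       -b mds-vo-name=PK-CIIT,o=grid objectclass=GlueCE | count_ce_entries

# Demo with the empty result quoted above:
n=$(count_ce_entries <<'EOF'
# extended LDIF
#
# LDAPv3
# base <mds-vo-name=PK-CIIT,o=grid> with scope subtree
# filter: objectclass=GlueCE
# requesting: ALL
#

# search result
search: 2
result: 0 Success

# numResponses: 1
EOF
)
echo "CE entries published: $n"
```

A healthy site BDII would return at least one `dn:` line per published CE endpoint.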
=========================================================================
Date:         Fri, 17 May 2013 14:03:13 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jean-Michel Barbet <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

On 05/17/2013 12:59 PM, Felix Farcas wrote:

> 2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization
> error: Failed to get the local user id via glexec: glexec error:
> [gLExec]: LCAS failed. The reason can be found in the syslog.

Hi Felix,

If nothing changed, I would check that all daemons on the Argus server
are still alive; at our site it sometimes happens that one of them
(I can't remember which one) hangs.


JM


-- 
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: [log in to unmask]
------------------------------------------------------------------------
=========================================================================
Date:         Fri, 17 May 2013 14:05:17 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
Comments: To: Felix Farcas <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Message-ID:  <[log in to unmask]>

Hi Felix,

> 2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization error:
> Failed to get the local user id via glexec:  glexec error: [gLExec]: LCAS
> failed. The reason can be found in the syslog.
> 
> Where may I look?

As the message says: did you check the syslog, usually /var/log/messages?
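As a concrete illustration of "check the syslog", here is a hedged sketch in shell; the sample log lines and the grep patterns are illustrative assumptions, and the real log destination depends on glexec.conf's log_destination setting.

```shell
#!/bin/sh
# Filter gLExec/LCAS/LCMAPS-related lines from a syslog file.
syslog_glexec_errors() {
    grep -iE 'glexec|lcas|lcmaps' "$1" | tail -n 50
}

# Demo against a hypothetical log fragment; in real use the argument
# would typically be /var/log/messages.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
May 17 12:57:21 ce01 glexec[1234]: [gLExec]: LCAS failed
May 17 12:57:21 ce01 kernel: unrelated message
EOF
out=$(syslog_glexec_errors "$tmp")
echo "$out"
rm -f "$tmp"
```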
=========================================================================
Date:         Fri, 17 May 2013 14:21:54 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Massimo Sgaravatto <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
Comments: cc: Maarten Litmaath <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms070009010401020405050405"
Message-ID:  <[log in to unmask]>

On 05/17/2013 02:05 PM, Maarten Litmaath wrote:
> Hi Felix,
>
>> 2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization error:
>> Failed to get the local user id via glexec:  glexec error: [gLExec]: LCAS
>> failed. The reason can be found in the syslog.
>>
>> Where may I look?
>
> As the message says: did you check the syslog, usually /var/log/messages?
>

You might need to increase the verbosity levels in /etc/glexec.conf

Cheers, Massimo


=========================================================================
Date:         Fri, 17 May 2013 15:51:12 +0300
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Felix Farcas <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms080703060607080501090705"
Message-ID:  <[log in to unmask]>

On 5/17/2013 3:21 PM, Massimo Sgaravatto wrote:
> On 05/17/2013 02:05 PM, Maarten Litmaath wrote:
>> Hi Felix,
>>
>>> 2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization error:
>>> Failed to get the local user id via glexec:  glexec error: [gLExec]: LCAS
>>> failed. The reason can be found in the syslog.
>>>
>>> Where may I look?
>>
>> As the message says: did you check the syslog, usually
>> /var/log/messages?
>>
>
> You might need to increase the verbosity levels in /etc/glexec.conf
>
> Cheers, Massimo
>

In BDII log I have the following:

argus-pap: unrecognized service
ERROR:lcg-info-dynamic-scheduler:Execution error: Missing LRMS backend
command in configuration
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service
argus-pap: unrecognized service

when I try to restart argus-pap it says nothing; the status report
gives me the following message:

/etc/init.d/argus-pap status
PAP running!

Regarding the verbosity level, which line of the config file should I modify?

[glexec]
linger = no

lcmaps_db_file = /etc/lcmaps/lcmaps-glexec.db
lcmaps_log_file = /var/log/glexec/lcas_lcmaps.log
lcmaps_debug_level = 0
lcmaps_log_level = 1

lcas_db_file = /etc/lcas/lcas-glexec.db
lcas_log_file = /var/log/glexec/lcas_lcmaps.log
lcas_debug_level = 0
lcas_log_level = 1

log_level = 1
user_identity_switch_by = lcmaps
user_white_list = tomcat
omission_private_key_white_list  = tomcat
preserve_env_variables =
create_target_proxy = no
silent_logging = no
log_destination = syslog
log_file = /var/log/glexec/glexec.log

Thank you
Felix

--
Dr. Ing. Farcas Felix
National Institute of Research and Development
of Isotopic and Molecular Technology,
IT - Department - Cluj-Napoca, Romania
yahoo id: felixfarcas
skype id: felix.farcas
mobile: +40-742-195323



=========================================================================
Date:         Fri, 17 May 2013 15:06:24 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Mischa Salle <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/x-pkcs7-signature";
              micalg=sha1; boundary="7AUc2qLy4jB3hD7Z"
Content-Disposition: inline
Message-ID:  <[log in to unmask]>


Hi Felix,

For the failing LCAS, you would need to increase the lcas_debug_level
setting in glexec.conf to, say, 5.
LCAS does not do anything with Argus; only LCMAPS does that.

    Cheers,
    Mischa
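In terms of the config Felix posted (quoted below), Mischa's suggestion amounts to a one-line change; a minimal sketch, with the value 5 taken from his message and the rest of the file left as posted:

```
[glexec]
lcas_debug_level = 5
```

Restoring the level to 0 afterwards keeps the syslog from filling up once the problem is diagnosed.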

On Fri, May 17, 2013 at 03:51:12PM +0300, Felix Farcas wrote:
> On 5/17/2013 3:21 PM, Massimo Sgaravatto wrote:
> >On 05/17/2013 02:05 PM, Maarten Litmaath wrote:
> >>Hi Felix,
> >>
> >>>2013-05-17 12:57:21,939 FATAL - Authorization failure:
> >>>Authorization error:
> >>>Failed to get the local user id via glexec:  glexec error:
> >>>[gLExec]: LCAS
> >>>failed. The reason can be found in the syslog.
> >>>
> >>>Where may I look?
> >>
> >>As the message says: did you check the syslog, usually
> >>/var/log/messages?
> >>
> >
> >You might need to increase the verbosity levels in /etc/glexec.conf
> >
> >Cheers, Massimo
> >
> 
> In BDII log I have the following:
> 
> argus-pap: unrecognized service
> ERROR:lcg-info-dynamic-scheduler:Execution error: Missing LRMS
> backend command in configuration
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> 
> when try to restart argus-pap it say nothing, at status report if
> gives me the following message
> 
> /etc/init.d/argus-pap status
> PAP running!
> 
> Regarding the verbosity level may I modify which line from the config file?
> 
> [glexec]
> linger = no
> 
> lcmaps_db_file = /etc/lcmaps/lcmaps-glexec.db
> lcmaps_log_file = /var/log/glexec/lcas_lcmaps.log
> lcmaps_debug_level = 0
> lcmaps_log_level = 1
> 
> lcas_db_file = /etc/lcas/lcas-glexec.db
> lcas_log_file = /var/log/glexec/lcas_lcmaps.log
> lcas_debug_level = 0
> lcas_log_level = 1
> 
> log_level = 1
> user_identity_switch_by = lcmaps
> user_white_list = tomcat
> omission_private_key_white_list  = tomcat
> preserve_env_variables =
> create_target_proxy = no
> silent_logging = no
> log_destination = syslog
> log_file = /var/log/glexec/glexec.log
> 
> Thank you
> Felix
> 
> --
> Dr. Ing. Farcas Felix
> National Institute of Research and Development
> of Isotopic and Molecular Technology,
> IT - Department - Cluj-Napoca, Romania
> yahoo id: felixfarcas
> skype id: felix.farcas
> mobile: +40-742-195323
> 
> 



--
Nikhef                      Room  H155
Science Park 105            Tel.  +31-20-592 5102
1098 XG Amsterdam           Fax   +31-20-592 5155
The Netherlands             Email [log in to unmask]
  __ .. ... _._. .... ._  ... ._ ._.. ._.. .._..

=========================================================================
Date:         Sun, 19 May 2013 17:40:37 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         "Nilsen, Dimitri (SCC)" <[log in to unmask]>
Subject:      Re: BrokerHelper: no compatible resources
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Hi,

the target host (CREAM CE) doesn't publish info about itself to the WMS. Try to check your WMS BDII, site BDII, or the resource BDII on the CREAM CE.

Regards
Dimitri
________________________________________
From: LHC Computer Grid - Rollout [[log in to unmask]] On Behalf Of Muhammad Waqar [[log in to unmask]]
Sent: Friday, May 17, 2013 1:01 PM
To: [log in to unmask]
Subject: [LCG-ROLLOUT] BrokerHelper: no compatible resources

Dear All,

Nagios Core issued a ticket with the following information:

""
CRITICAL: [Waiting->Cancelled [timeout/dropped]] 'BrokerHelper: no compatible resources'. https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw
CRITICAL: [Waiting->Cancelled [timeout/dropped]] 'BrokerHelper: no compatible resources'. https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw

Testing from: rocnagios
DN: /C=TW/O=AS/OU=GRID/CN=Liaw SyueYi 182693/CN=proxy/CN=proxy/CN=proxy/CN=proxy
VOMS FQANs: /ops/Role=lcgadmin/Capability=NULL, /ops/ROC/Role=NULL/Capability=NULL, /ops/ROC/AsiaPacific/Role=NULL/Capability=NULL, /ops/Role=NULL/Capability=NULL
glite-wms-job-status https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw


======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://rocwms01.grid.sinica.edu.tw:9000/m5M_aBHJtu4xrYXJ0FfUSw
Current Status: Waiting
Status Reason: BrokerHelper: no compatible resources
Submitted: Fri May 17 08:52:19 2013 UTC""

Any idea what is the reason and how to fix this problem?

Regards
M.Waqar
Pk-CIIT
=========================================================================
Date:         Mon, 20 May 2013 14:33:22 +0500
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Ali Zahir <[log in to unmask]>
Subject:      Job submission failed.
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Dear All

I'm having the following error on the CREAM CE:

Job submission failed.
2013-05-20 08:49:16,406 FATAL - Problems with proxyfile  
[/etc/nagios/globus/userproxy.pem--ops-Role_lcgadmin]: The proxy has  
EXPIRED!

https://rocnagios.grid.sinica.edu.tw/nagios/cgi-bin/extinfo.cgi?type=2&host=cern-56-24-244.comsats.edu.pk&service=org.sam.CREAMCE-DirectJobState-%2Fops%2FRole%3Dlcgadmin


Can anyone please tell me which proxy this is? The VOBOX proxy is fine
and up to date. And how can this problem be resolved?

cheers,
Ali
PK-CIIT
=========================================================================
Date:         Mon, 20 May 2013 14:23:03 +0300
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Anatoliy Khmelevskiy <[log in to unmask]>
Subject:      EMI-UI on Ubuntu
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7bdc0e1e8dbd3704dd248f8d
Message-ID:  <[log in to unmask]>


Hi!

Is it possible to install the emi-ui on ubuntu? If possible, how to do it?

Regards
Anatol.
-- 
*Khmialeuski Anatol*,
NGI_BY *Operations Deputy Manager*,
BY-JIPNR-Sosny *Site Administrator*,
*State Scientific Institution *
"Joint Institute for Power and Nuclear Research - Sosny"
*99 Academician A.K.Krasin Str., Minsk BY-220109, Belarus*

=========================================================================
Date:         Mon, 20 May 2013 15:53:28 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: Job submission failed.
Comments: To: Ali Zahir <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Message-ID:  <[log in to unmask]>

Hi Ali,

> Em having following error in CREAM CE:
> 
> Job submission failed.
> 2013-05-20 08:49:16,406 FATAL - Problems with proxyfile
> [/etc/nagios/globus/userproxy.pem--ops-Role_lcgadmin]: The proxy has EXPIRED!
> 
> https://rocnagios.grid.sinica.edu.tw/nagios/cgi-bin/extinfo.cgi?type=2&host=cern-56-24-244.comsats.edu.pk&service=org.sam.CREAMCE-DirectJobState-%2Fops%2FRole%3Dlcgadmin
> 
> 
> can any one pl tell me which proxy it is?? as the vobox proxy is fine and up
> to date and how to resolve this problem.?

It is a problem on the Nagios machine.  The only thing you could do
is to open a GGUS ticket against ROC-Asia/Pacific, but they should
soon find out themselves...
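Whether a given proxy file has actually expired can be checked directly with openssl; a hedged sketch in shell, where the `check_proxy` helper is illustrative and the path in the comment is the one from the error message above:

```shell
#!/bin/sh
# Report whether an X.509 (proxy) certificate file is still valid.
# "openssl x509 -checkend 0" exits 0 only if the certificate has not
# yet expired.
check_proxy() {
    if openssl x509 -in "$1" -noout -checkend 0 >/dev/null 2>&1; then
        echo "proxy still valid"
    else
        echo "proxy has EXPIRED (or file unreadable)"
    fi
}

# In real use, with the path from the error message above:
#   check_proxy /etc/nagios/globus/userproxy.pem--ops-Role_lcgadmin
```

For VOMS proxies, `voms-proxy-info -file <path> -timeleft` gives the same answer with the VOMS attribute lifetimes included.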
=========================================================================
Date:         Mon, 20 May 2013 19:27:31 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: EMI-UI on Ubuntu
Comments: To: Anatoliy Khmelevskiy <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Message-ID:  <[log in to unmask]>

Hi Anatoliy,

> Is it possible to install the emi-ui on ubuntu?

Ubuntu (i.e. Debian) is currently not supported for the UI:

http://www.eu-emi.eu/releases/emi-3-montebianco/products/-/asset_publisher/5dKm/content/emi-ui-2
=========================================================================
Date:         Wed, 22 May 2013 10:50:44 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         =?ISO-8859-1?Q?Gon=E7alo_Borges?= <[log in to unmask]>
Organization: LIP
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
Comments: To: Felix Farcas <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms080308050800030805080900"
Message-ID:  <[log in to unmask]>

This is a cryptographically signed message in MIME format.

--------------ms080308050800030805080900
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

Hi Felix...

I think you have two problems...

1) First, let us address the Argus / authentication problem.

The "argus-pap: unrecognized service" message was detected in SR, and
GGUS #93508 was opened for it. However, this seems to be a harmless message,
since it does not prevent the service from working properly.

The message

    2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization
    error:
    Failed to get the local user id via glexec:  glexec error: [gLExec]: LCAS
    failed. The reason can be found in the syslog.

often appears after a reboot of the Argus host or similar, and it normally
goes away after a proper restart of the Argus daemons. Please be aware
that some of the daemons may be stuck in an incoherent state, so a
simple stop may not work.

/etc/rc5.d/S97argus-pepd stop
/etc/rc5.d/S97argus-pdp stop
/etc/rc5.d/S97argus-pap stop
/etc/rc5.d/S97argus-pepd status
/etc/rc5.d/S97argus-pdp status
/etc/rc5.d/S97argus-pap status

/etc/rc5.d/S97argus-pap start
/etc/rc5.d/S97argus-pap status
/etc/rc5.d/S97argus-pdp start
/etc/rc5.d/S97argus-pap status
/etc/rc5.d/S97argus-pepd start
/etc/rc5.d/S97argus-pepd status
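
In loop form, the same recipe looks like this (a sketch only: it assumes the
usual Red Hat "service" wrapper around the same init scripts, and it merely
warns where a stuck daemon would need a manual kill):

```shell
# Stop all three Argus daemons, check via "status" that each one is really
# gone, then start them again in PAP -> PDP -> PEPD order, as above.
restart_argus() {
    local svc
    for svc in argus-pepd argus-pdp argus-pap; do
        service "$svc" stop
        # a daemon stuck in an incoherent state may still answer "status"
        if service "$svc" status >/dev/null 2>&1; then
            echo "WARNING: $svc still running after stop -- kill it by hand" >&2
        fi
    done
    for svc in argus-pap argus-pdp argus-pepd; do
        service "$svc" start
        service "$svc" status
    done
}
```
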

2) The second error has nothing to do with ARGUS.

     ERROR:lcg-info-dynamic-scheduler:Execution error: Missing LRMS=20
backend command in configuration

We can tackle this after 1) is solved.

Cheers
Goncalo

On 05/17/2013 01:51 PM, Felix Farcas wrote:
> On 5/17/2013 3:21 PM, Massimo Sgaravatto wrote:
>> On 05/17/2013 02:05 PM, Maarten Litmaath wrote:
>>> Hi Felix,
>>>
>>>> 2013-05-17 12:57:21,939 FATAL - Authorization failure:=20
>>>> Authorization error:
>>>> Failed to get the local user id via glexec:  glexec error:=20
>>>> [gLExec]: LCAS
>>>> failed. The reason can be found in the syslog.
>>>>
>>>> Where may I look?
>>>
>>> As the message says: did you check the syslog, usually=20
>>> /var/log/messages?
>>>
>>
>> You might need to increase the verbosity levels in /etc/glexec.conf
>>
>> Cheers, Massimo
>>
>
> In BDII log I have the following:
>
> argus-pap: unrecognized service
> ERROR:lcg-info-dynamic-scheduler:Execution error: Missing LRMS backend
> command in configuration
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
> argus-pap: unrecognized service
>
> When I try to restart argus-pap it says nothing; the status check
> gives me the following:
>
> /etc/init.d/argus-pap status
> PAP running!
>
> Regarding the verbosity level: which line of the config file should I
> modify?
>
> [glexec]
> linger = no
>
> lcmaps_db_file = /etc/lcmaps/lcmaps-glexec.db
> lcmaps_log_file = /var/log/glexec/lcas_lcmaps.log
> lcmaps_debug_level = 0
> lcmaps_log_level = 1
>
> lcas_db_file = /etc/lcas/lcas-glexec.db
> lcas_log_file = /var/log/glexec/lcas_lcmaps.log
> lcas_debug_level = 0
> lcas_log_level = 1
>
> log_level = 1
> user_identity_switch_by = lcmaps
> user_white_list = tomcat
> omission_private_key_white_list = tomcat
> preserve_env_variables =
> create_target_proxy = no
> silent_logging = no
> log_destination = syslog
> log_file = /var/log/glexec/glexec.log
>
> Thank you
> Felix
>



=========================================================================
Date:         Wed, 22 May 2013 13:27:52 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Mischa Salle <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/x-pkcs7-signature";
              micalg=sha1; boundary="ReaqsoxgOBHFXBhH"
Content-Disposition: inline
Message-ID:  <[log in to unmask]>

--ReaqsoxgOBHFXBhH
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Felix, Gonçalo,

as I mentioned before (last week), LCAS has no interaction with Argus, so
when gLExec says that LCAS has failed, this also has NOTHING to do with
Argus. LCMAPS is the component that can do a callout to Argus.
To debug the problem, increase the LCAS logging in
glexec.conf by setting lcas_debug_level to the highest level, 5.
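
That is a one-line edit; a sketch (the sed pattern assumes the
"lcas_debug_level = 0" line shown in Felix's config, and keeps a .bak backup):

```shell
# Raise lcas_debug_level to 5 in a glexec.conf-style file.
# Pass the config path (normally /etc/glexec.conf); run as root on the node.
raise_lcas_debug() {
    local conf=$1
    sed -i.bak 's/^lcas_debug_level[[:space:]]*=.*/lcas_debug_level = 5/' "$conf"
}

# e.g. raise_lcas_debug /etc/glexec.conf
```
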

    Cheers,
    Mischa

On Wed, May 22, 2013 at 10:50:44AM +0100, Gonçalo Borges wrote:
> Hi Felix...
>
> I think you have two problems...
>
> [...]



-- 
Nikhef                      Room  H155
Science Park 105            Tel.  +31-20-592 5102
1098 XG Amsterdam           Fax   +31-20-592 5155
The Netherlands             Email [log in to unmask]
  __ .. ... _._. .... ._  ... ._ ._.. ._.. .._..

=========================================================================
Date:         Wed, 22 May 2013 12:47:43 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         =?ISO-8859-1?Q?Gon=E7alo_Borges?= <[log in to unmask]>
Organization: LIP
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
Comments: To: Mischa Salle <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms040800090003090609030102"
Message-ID:  <[log in to unmask]>

This is a cryptographically signed message in MIME format.

--------------ms040800090003090609030102
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable

I'm sorry, Mischa, you are probably right...

I mixed them up because the gLExec messages are very similar in the
LCAS and LCMAPS cases.

However, my recipe does apply to the LCMAPS errors.

Sorry
Cheers
Goncalo

On 05/22/2013 12:27 PM, Mischa Salle wrote:
> Hi Felix, Gonçalo,
>
> as I mentioned before (last week) LCAS has no interaction with Argus, so
> when gLExec says that LCAS has failed, this also has NOTHING to do with
> Argus. LCMAPS is the component that can do a callout to Argus.
> To debug the problem, increase the LCAS logging in the
> glexec.conf by setting lcas_debug_level to the highest level 5.
>
>      Cheers,
>      Mischa
>
> On Wed, May 22, 2013 at 10:50:44AM +0100, Gonçalo Borges wrote:
>> [...]
>



=========================================================================
Date:         Wed, 22 May 2013 12:56:20 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         "Christopher J. Walker" <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Message-ID:  <[log in to unmask]>

On 22/05/13 10:50, Gonçalo Borges wrote:

> The message
> 
>    2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization
> error:
>    Failed to get the local user id via glexec:  glexec error: [gLExec]:
> LCAS
>    failed. The reason can be found in the syslog.
> 
> often happens after a reboot of the Argus or similar, and normally it
> goes away with a proper restart of Argus daemons. Please take attention
> that some of the daemon may be stuck in some incoherent state, and a
> simple stop may not work.

Can you tell me which GGUS ticket this is tracked in, please? I had
intended to move to Argus for authentication/authorisation, but bugs
such as this make me reluctant.

Chris
=========================================================================
Date:         Wed, 22 May 2013 14:52:24 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Peter Tylka <[log in to unmask]>
Subject:      upgrade LFC UMD1->UMD2
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms050104050103070008020308"
Message-ID:  <[log in to unmask]>

This is an electronically signed message in MIME format.

--------------ms050104050103070008020308
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hello,

we recently upgraded our LFC ( lfc1.egee.cesnet.cz ) for VOCE and AUGER
from UMD1 to UMD2.

The downtime was almost 1 day. The longest part was the upgrade of two
MySQL tables: upgrading the 20 GB table Cns_file_replica took 2 hours, and
upgrading the 13 GB table Cns_file_metadata took 18 hours. The bottleneck
was the disk.

Did you also experience quite a long downtime during the LFC upgrade from
UMD1 to UMD2?
Do you have any tips and tricks for shortening the LFC downtime during
the upgrade of the MySQL tables?

Regards
    Peter Tylka

-- 
Institute of Physics, Prague
email: [log in to unmask]
tel.: +420 266 052 968




=========================================================================
Date:         Wed, 22 May 2013 14:26:24 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Sam Skipsey <[log in to unmask]>
Subject:      Re: upgrade LFC UMD1->UMD2
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11c225742ebb9204dd4e8356
Message-ID:  <[log in to unmask]>


So, in general, MySQL table operations are improved by properly tuned
settings for your MySQL server.
I assume that your /etc/my.cnf includes some tuning for the buffer pool
size and other things with small default settings?
If not:

innodb_buffer_pool_size=2G


(or set the buffer pool size to any other number that's a reasonable chunk
of the available system memory on the machine)

There's a few other settings, but that's a good start.
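To make the suggestion concrete, here is a minimal sketch of the kind of
/etc/my.cnf fragment being described; the values and the extra settings
are illustrative assumptions, not tuned recommendations for any
particular host:

```ini
[mysqld]
# Biggest single win for InnoDB-backed catalogs like the LFC: keep the
# working set in RAM. A common rule of thumb on a dedicated DB host is
# 50-70% of system memory.
innodb_buffer_pool_size = 2G

# Larger redo logs so checkpointing does not throttle bulk ALTER TABLE
# work during upgrades.
innodb_log_file_size = 256M

# Optionally trade a little durability for speed during bulk operations
# (1 is the safest setting and the default).
innodb_flush_log_at_trx_commit = 2
```

The exact numbers depend on the machine; the point is simply that the
compiled-in defaults are far too small for multi-GB tables.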

Sam



On 22 May 2013 13:52, Peter Tylka <[log in to unmask]> wrote:

> Hello,
>
> recently we upgraded our LFC (lfc1.egee.cesnet.cz) for VOCE and AUGER
> from UMD1 to UMD2.
>
> Downtime was almost 1 day. The longest part was the upgrade of two mysql
> tables. Upgrading the 20GB table Cns_file_replica took 2 hours and
> upgrading the 13GB table Cns_file_metadata took 18 hours. The bottleneck
> was the disk.
>
> Did you also experience quite a long downtime during the LFC upgrade
> from UMD1 to UMD2?
> Do you have any tips and tricks on how to shorten LFC downtime during
> the upgrade of mysql tables?
>
> Regards
>     Peter Tylka
>
> --
> Institute of Physics, Prague
> email: [log in to unmask]
> tel.: +420 266 052 968
>
>
>
>

=========================================================================
Date:         Wed, 22 May 2013 19:28:57 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         =?ISO-8859-1?Q?Gon=E7alo_Borges?= <[log in to unmask]>
Organization: LIP
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
Comments: cc: "Christopher J. Walker" <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms010509000105010308000107"
Message-ID:  <[log in to unmask]>


Hi Chris

You misinterpreted my email. Argus is working fine with GLEXEC on the
WNs. There are some sporadic situations where you may need to restart
the daemons (as with any other service) or assign more memory to Java,
but generally these situations are not very frequent.

The GGUS ticket I was pointing to (#93508) was regarding my latest Argus
SR test, which found the same message "argus-pap: unrecognized service"
as reported at the beginning of this thread but did not break
functionality.

Cheers
Goncalo



On 05/22/2013 12:56 PM, Christopher J. Walker wrote:
> On 22/05/13 10:50, Gonçalo Borges wrote:
>
>> The message
>>
>>     2013-05-17 12:57:21,939 FATAL - Authorization failure: Authorization error:
>>     Failed to get the local user id via glexec: glexec error: [gLExec]: LCAS
>>     failed. The reason can be found in the syslog.
>>
>> often happens after a reboot of the Argus or similar, and normally it
>> goes away with a proper restart of the Argus daemons. Please note that
>> some of the daemons may be stuck in an incoherent state, and a simple
>> stop may not work.
> Can you tell me the GGUS ticket this is tracked in please. I had
> intended to move to ARGUS for authentication/authorisation, but bugs
> such as that make me reluctant.
>
> Chris



=========================================================================
Date:         Mon, 27 May 2013 08:55:55 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Miguel Gila <[log in to unmask]>
Subject:      Re: Authorization errorglexec error: [gLExec]: LCAS failed
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Hi Felix,

As a side note, regarding the message on the BDII logs:

argus-pap: unrecognized service


We recently saw it in our system and it seems the cause is the BDII user
(ldap) not having the right permissions. We solved it by modifying
/var/lib/bdii/gip/provider/glite-info-glue2-provider-service-argus to run
/usr/bin/glite-info-glue2-multi via sudo.

This is the line we added in the sudoers file:

ldap  ALL=(ALL)       NOPASSWD: /var/lib/bdii/gip/provider/glite-info-glue2-provider-service-argus, /usr/bin/glite-info-glue2-multi

In the end this makes your BDII publish correct GLUE 2 information about
the ARGUS service status.

My 2 cents.
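A quick way to sanity-check a permissions fix like the one above
(hypothetical commands, assuming the paths from the message) is to run
the provider as the BDII user and confirm it emits GLUE 2 LDIF rather
than a sudo or permission error:

```shell
# Run the Argus info provider the same way the BDII (user "ldap") would.
# If the sudoers entry is right, this should print GLUE2Service LDIF.
su -s /bin/sh ldap -c \
  '/var/lib/bdii/gip/provider/glite-info-glue2-provider-service-argus' | head

# Always check sudoers syntax after editing it.
visudo -c
```

If the provider still fails, the BDII update log (typically
/var/log/bdii/bdii-update.log) is usually where the "unrecognized
service" message shows up first.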

Miguel


On 5/17/13 2:51 PM, "Felix Farcas" <[log in to unmask]> wrote:

>argus-pap: unrecognized service
=========================================================================
Date:         Mon, 27 May 2013 11:06:20 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jan Svec <[log in to unmask]>
Subject:      top level BDIIs - how many?
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Message-ID:  <[log in to unmask]>

Hi all,

at NGI_CZ we have two sites with approx. 4200 jobslots total, each site
running one top-level BDII (one on a KVM virtual host with 8 cores and
8GB RAM, the other on a physical machine with 8 cores (E5420) and 16GB
RAM). Both BDIIs are accessed via the DNS alias bdii1rr (round-robin).
The problem is that both BDIIs are quite heavily loaded (load 15-16), so
sometimes their response is very slow. I tried to add another BDII to
the DNS pool and the load dropped to 10-11. Both BDIIs are running UMD2
middleware.
My question is whether this high load is normal and whether there are
any "best practices" regarding the number/performance of BDII servers
relative to the number of jobslots.
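For anyone wanting to compare numbers, a rough way to gauge top-BDII
responsiveness is to time a query against it; the alias below is a
placeholder, and the port and base DNs are the conventional top-BDII
endpoints:

```shell
# Time a GLUE 1 query against the top BDII. Port 2170 is the standard
# top-BDII LDAP port; "mds-vo-name=local,o=grid" is the GLUE 1 base,
# and "o=glue" holds the GLUE 2 tree.
time ldapsearch -x -LLL -H ldap://bdii1rr.example.org:2170 \
     -b 'mds-vo-name=local,o=grid' 'objectClass=GlueService' dn | wc -l
```

Running this from a few client hosts under load gives a feel for whether
the slowness is in the BDII itself or in the network path to it.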

Thank you for any help.
Best regards
Jan

-- 
Jan Svec
Institute of Physics AS CR
Prague
=========================================================================
Date:         Mon, 27 May 2013 11:22:38 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Tiziana Ferrari <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
Comments: To: Jan Svec <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Jan

performance problems with the UMD-2 versions of the top-BDII have
already been reported on this list recently.

HW requirements for the top-BDII were recently updated (quoting Maria Alandes):
"I have updated the BDII sys admin guide with new HW requirements. The 
old ones are obsolete for top BDIIs after GLUE 2 is widely published by 
all EGI sites."
https://tomtools.cern.ch/confluence/download/attachments/983044/EMI_BDII_sysadmin.pdf

You could consider a migration to the EMI-3 version, which has already 
passed staged rollout and is available in UMD-3:
https://wiki.egi.eu/wiki/UMD-3:UMD-3.0.0#emi.bdii-top.sl6.x86_64

Tiziana

On 27/05/2013 11:06, Jan Svec wrote:
> Hi all,
>
> at NGI_CZ we have two sites with approx. 4200 jobslots total, each site running one top level BDII (one KVM virtual host (8 core, 8GB RAM), one physical machine (8 core (E5420), 16GB RAM). Both BDIIs are accessed via DNS alias bdii1rr (round-robin). The problem is that both BDIIs are quite heavily loaded (15-16) so sometimes their response is very slow. I tried to add another BDII to the DNS pool and the load dropped to 10-11. Both BDIIs are running UMD2 middleware.
> My question is if this high load is normal and if there are any "best practices" regarding number/performance of BDII servers in dependence on number of jobslots.
>
> Thank you for any help.
> Best regards
> Jan
>

-- 
Tiziana Ferrari
EGI.eu Operations
0031 (0)6 3037.2691
=========================================================================
Date:         Mon, 27 May 2013 10:18:57 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear Jan,

I am not sure I have understood the deployment of your top-level BDII.
Please let me know if this is correct:

There are two sites. The two of them run a top BDII with 1 VM + 1
physical machine.

In the end you have one alias, bdii1rr, with the 2 VMs + 2 physical
machines from the two sites behind it.

As far as HW requirements are concerned, everything is fine. However, if
you are running a UMD 2 version you are not benefitting from the very
latest performance improvements we have released in EMI 3 to better cope
with the GLUE 2 load. The amount of GLUE 2 information is now higher
than GLUE 1.3, and we have tuned the DB backend and the LDAP
configuration in EMI 3. We have observed from the sites who reported
performance problems that when they upgrade to EMI 3, everything works
fine again.

So in this particular case I suggest upgrading to EMI 3 to improve the
performance. Let me know if you have further questions.

Regards,
Maria

> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On
> Behalf Of Jan Svec
> Sent: 27 May 2013 11:06
> To: [log in to unmask]
> Subject: [LCG-ROLLOUT] top level BDIIs - how many?
>
> Hi all,
>
> at NGI_CZ we have two sites with approx. 4200 jobslots total, each site
> running one top level BDII (one KVM virtual host (8 core, 8GB RAM), one
> physical machine (8 core (E5420), 16GB RAM). Both BDIIs are accessed via
> DNS alias bdii1rr (round-robin). The problem is that both BDIIs are
> quite heavily loaded (15-16) so sometimes their response is very slow. I
> tried to add another BDII to the DNS pool and the load dropped to 10-11.
> Both BDIIs are running UMD2 middleware.
> My question is if this high load is normal and if there are any "best
> practices" regarding number/performance of BDII servers in dependence on
> number of jobslots.
>
> Thank you for any help.
> Best regards
> Jan
>
> --
> Jan Svec
> Institute of Physics AS CR
> Prague
=========================================================================
Date:         Mon, 27 May 2013 12:26:19 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jeff Templon <[log in to unmask]>
Subject:      Maui, Scheduling Performance,
              unfillable clusters and other silliness
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Message-ID:  <[log in to unmask]>

Hi All,

  We recently installed nodes here with 32 cores.  After that
installation, maui was having a very tough time … timeouts on commands,
trouble keeping the cluster full, slow scheduling ramp-up, and lots of
messages saying that MJobReserve could not create a reservation …
thousands of such messages per scheduling cycle.  As far as I can tell,
each job in the queue generates such a message.  When you have 7000
waiting jobs, it can take up to a minute to "fail" all those jobs like
this.

  We dug into the source code, Google, and the maui users mailing list,
and saw several people (notably our own Mario Kadastik) who had
witnessed the same phenomenon, but found no real answers to the question
"what is going on and how can I solve it".  In desperation I ran the
command "schedconfig" (I wonder whether that particular sentence has
ever been said by anyone before in the history of mankind?) and there
was this mysterious parameter

	RESDEPTH

which we did not set in maui.cfg, but which had a value of 24 (must be
the default).  The manual says:

    specifies the maximum number of reservations which can be on any
single node.

Previously we had no more than 12 cores on a node.  Now 32.  I set this
parameter to 48, and now those messages are gone and Maui keeps the
cluster full and is much more responsive.
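Concretely, the fix described above amounts to a single line in maui.cfg
(48 is simply a value with headroom above 32 slots per node; the right
number depends on your hardware):

```
# maui.cfg: raise the per-node reservation limit (default 24),
# which is too low once worker nodes have 32 job slots.
RESDEPTH        48
```

As with most maui.cfg changes, the maui daemon has to be restarted for
the new value to take effect.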

		J "whaddya mean by 'dynamic allocation'?" T
=========================================================================
Date:         Mon, 27 May 2013 12:36:00 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jan Svec <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Dear Maria,

On 27. 5. 2013, at 12:18, Maria Alandes Pradillo <[log in to unmask]> wrote:

> Dear Jan,
>
> I am not sure I have understood the deployment of your top level BDII.
> Please, let me know if this is correct:
>
> There are two sites. The two of them run a top BDII with 1 VM + 1
> physical machine.
>
> In the end you have one alias, bdii1rr, with the 2 VMs + 2 physical
> machines from the two sites behind.

Yes, that is correct.

>
> As far as HW requirements is concerned, everything is fine. However,
> if you are running an UMD 2 version you are not benefitting from the
> very latest performance improvements we have released in EMI 3 to
> better cope with the GLUE 2 load. The amount of GLUE 2 information is
> now higher than GLUE 1.3 and we have tuned the DB backend and the LDAP
> configuration in EMI 3. We have observed from the sites who reported
> performance problems that when they upgrade to EMI 3, everything works
> fine again.
>
> So in this particular case I suggest upgrading to EMI 3 to improve the
> performance. Let me know if you have further questions.

OK, I will upgrade to EMI 3 and let you know if it helped. BTW, is this
release working fine with SL 5? Alessandro Paolini mentioned some
problems with SL5 in ticket 92959, if I remember correctly…

Thank you,
Jan
=========================================================================
Date:         Mon, 27 May 2013 12:38:18 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jan Svec <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset=iso-8859-1
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Hi Tiziana,

thank you, I will upgrade to EMI3 then.

Cheers
Jan

On 27. 5. 2013, at 11:22, Tiziana Ferrari <[log in to unmask]> wrote:

> Hi Jan
>
> performance problems with UMD-2 versions of top-BDII have been already
> reported on this list recently.
>
> Hw requirements for top-BDII were recently updated (quoting Maria
> Alandes):
> "I have updated the BDII sys admin guide with new HW requirements. The
> old ones are obsolete for top BDIIs after GLUE 2 is widely published by
> all EGI sites."
> https://tomtools.cern.ch/confluence/download/attachments/983044/EMI_BDII_sysadmin.pdf
>
> You could consider a migration to the EMI-3 version, that has already
> passed staged rollout and is available in UMD-3:
> https://wiki.egi.eu/wiki/UMD-3:UMD-3.0.0#emi.bdii-top.sl6.x86_64
>
> Tiziana
>
> On 27/05/2013 11:06, Jan Svec wrote:
>> Hi all,
>>
>> at NGI_CZ we have two sites with approx. 4200 jobslots total, each
>> site running one top level BDII (one KVM virtual host (8 core, 8GB
>> RAM), one physical machine (8 core (E5420), 16GB RAM). Both BDIIs are
>> accessed via DNS alias bdii1rr (round-robin). The problem is that both
>> BDIIs are quite heavily loaded (15-16) so sometimes their response is
>> very slow. I tried to add another BDII to the DNS pool and the load
>> dropped to 10-11. Both BDIIs are running UMD2 middleware.
>> My question is if this high load is normal and if there are any "best
>> practices" regarding number/performance of BDII servers in dependence
>> on number of jobslots.
>>
>> Thank you for any help.
>> Best regards
>> Jan
>>
>
> --
> Tiziana Ferrari
> EGI.eu Operations
> 0031 (0)6 3037.2691
=========================================================================
Date:         Mon, 27 May 2013 12:13:00 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear Jan,

I can't recall any particular problem with SL5. Both SL6 and SL5 should
work fine.

Regards,
Maria

> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On
> Behalf Of Jan Svec
> Sent: 27 May 2013 12:36
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] top level BDIIs - how many?
>
> Dear Maria,
>
> On 27. 5. 2013, at 12:18, Maria Alandes Pradillo
> <[log in to unmask]> wrote:
>
> > Dear Jan,
> >
> > I am not sure I have understood the deployment of your top level
> > BDII. Please, let me know if this is correct:
> >
> > There are two sites. The two of them run a top BDII with 1 VM + 1
> > physical machine.
> >
> > In the end you have one alias, bdii1rr, with the 2 VMs + 2 physical
> > machines from the two sites behind.
>
> Yes, that is correct.
>
> >
> > As far as HW requirements is concerned, everything is fine. However,
> > if you are running an UMD 2 version you are not benefitting from the
> > very latest performance improvements we have released in EMI 3 to
> > better cope with the GLUE 2 load. The amount of GLUE 2 information is
> > now higher than GLUE 1.3 and we have tuned the DB backend and the
> > LDAP configuration in EMI 3. We have observed from the sites who
> > reported performance problems that when they upgrade to EMI 3,
> > everything works fine again.
> >
> > So in this particular case I suggest upgrading to EMI 3 to improve
> > the performance. Let me know if you have further questions.
>
> OK, I will upgrade to EMI 3 and let you know, if it helped. BTW is this
> release working fine with SL 5? Alessandro Paolini mentioned some
> problems with SL5 in ticket 92959, if I remember correctly.
>
> Thank you,
> Jan
=========================================================================
Date:         Mon, 27 May 2013 14:17:10 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
Comments: cc: Jan Svec <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

On 27/05/2013 12:36, Jan Svec wrote:
>
>> As far as HW requirements is concerned, everything is fine. However,
>> if you are running an UMD 2 version you are not benefitting from the
>> very latest performance improvements we have released in EMI 3 to
>> better cope with the GLUE 2 load. The amount of GLUE 2 information is
>> now higher than GLUE 1.3 and we have tuned the DB backend and the LDAP
>> configuration in EMI 3. We have observed from the sites who reported
>> performance problems that when they upgrade to EMI 3, everything works
>> fine again.
>>
>> So in this particular case I suggest upgrading to EMI 3 to improve the
>> performance. Let me know if you have further questions.
> OK, I will upgrade to EMI 3 and let you know, if it helped. BTW is this
> release working fine with SL 5? Alessandro Paolini mentioned some
> problems with SL5 in ticket 92959, if I remember correctly…
>
> Thank you,
> Jan
hi Jan,
the ticket you mentioned,
https://ggus.eu/tech/ticket_show.php?ticket=92959, was opened for load
issues similar to yours and was solved with the tuning described by
Maria; there is no mention of problems on SL5 (our top-BDIIs are on SL6).

Perhaps you read this other ticket,
https://ggus.eu/tech/ticket_show.php?ticket=93878, which I opened because
when upgrading a VOMS server I didn't notice an important operation to
perform for configuring the BDII, and that wasn't mentioned in the VOMS
guide.

Cheers,
Alessandro
=========================================================================
Date:         Mon, 27 May 2013 14:31:59 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jan Svec <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

On 27. 5. 2013, at 14:17, Alessandro Paolini <[log in to unmask]> wrote:

> On 27/05/2013 12:36, Jan Svec wrote:
>>
>>> As far as HW requirements is concerned, everything is fine. However,
>>> if you are running an UMD 2 version you are not benefitting from the
>>> very latest performance improvements we have released in EMI 3 to
>>> better cope with the GLUE 2 load. The amount of GLUE 2 information is
>>> now higher than GLUE 1.3 and we have tuned the DB backend and the
>>> LDAP configuration in EMI 3. We have observed from the sites who
>>> reported performance problems that when they upgrade to EMI 3,
>>> everything works fine again.
>>>
>>> So in this particular case I suggest upgrading to EMI 3 to improve
>>> the performance. Let me know if you have further questions.
>> OK, I will upgrade to EMI 3 and let you know, if it helped. BTW is
>> this release working fine with SL 5? Alessandro Paolini mentioned some
>> problems with SL5 in ticket 92959, if I remember correctly…
>>
>> Thank you,
>> Jan
> hi Jan,
> the ticket you mentioned
> https://ggus.eu/tech/ticket_show.php?ticket=92959 it was opened for
> load issues similar to the yours, solved with the tuning as said by
> Maria, there is no mention to problems on sl5 (our top-BDIIs are on
> sl6)
>
> Perhaps you read this other ticket
> https://ggus.eu/tech/ticket_show.php?ticket=93878 that I opened because
> when upgrading a VOMS server I didn't notice an important operation to
> perform for configuring the bdii and that wasn't mentioned in the VOMS
> guide.
>
> Cheers,
> Alessandro

Hi Alessandro,

I meant ticket 92959, update #26, but maybe I am missing the context and
it has nothing to do with changes made in EMI3 :)

Cheers
Jan
=========================================================================
Date:         Mon, 27 May 2013 14:45:41 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
Comments: cc: Jan Svec <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

On 27/05/2013 14:31, Jan Svec wrote:
> On 27. 5. 2013, at 14:17, Alessandro Paolini <[log in to unmask]> wrote:
>
>> On 27/05/2013 12:36, Jan Svec wrote:
>>>> As far as HW requirements is concerned, everything is fine. However,
>>>> if you are running an UMD 2 version you are not benefitting from the
>>>> very latest performance improvements we have released in EMI 3 to
>>>> better cope with the GLUE 2 load. The amount of GLUE 2 information
>>>> is now higher than GLUE 1.3 and we have tuned the DB backend and the
>>>> LDAP configuration in EMI 3. We have observed from the sites who
>>>> reported performance problems that when they upgrade to EMI 3,
>>>> everything works fine again.
>>>>
>>>> So in this particular case I suggest upgrading to EMI 3 to improve
>>>> the performance. Let me know if you have further questions.
>>> OK, I will upgrade to EMI 3 and let you know, if it helped. BTW is
>>> this release working fine with SL 5? Alessandro Paolini mentioned
>>> some problems with SL5 in ticket 92959, if I remember correctly…
>>>
>>> Thank you,
>>> Jan
>> hi Jan,
>> the ticket you mentioned
>> https://ggus.eu/tech/ticket_show.php?ticket=92959 it was opened for
>> load issues similar to the yours, solved with the tuning as said by
>> Maria, there is no mention to problems on sl5 (our top-BDIIs are on
>> sl6)
>>
>> Perhaps you read this other ticket
>> https://ggus.eu/tech/ticket_show.php?ticket=93878 that I opened
>> because when upgrading a VOMS server I didn't notice an important
>> operation to perform for configuring the bdii and that wasn't
>> mentioned in the VOMS guide.
>>
>> Cheers,
>> Alessandro
> Hi Alessandro,
>
> I meant the ticket 92959 update#26, but maybe I am missing the context
> and it has nothing to do with changes made in EMI3 :)
>
> Cheers
> Jan
Hi Jan,
OK: Maria asked me to enable the LDAP monitoring interface to understand
how many LDAP connections there were, so before doing it on the
production server I tried enabling it on a test BDII, which is on SL5,
but I ran into some problems because the things to modify are slightly
different on SL5.

In general the latest EMI-3 update works fine on both SL5 and SL6.

cheers,
Alessandro
=========================================================================
Date:         Mon, 27 May 2013 14:47:21 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

On 27/05/2013 14:45, Alessandro Paolini wrote:
> On 27/05/2013 14:31, Jan Svec wrote:
>> On 27. 5. 2013, at 14:17, Alessandro Paolini
>> <[log in to unmask]> wrote:
>>
>>> On 27/05/2013 12:36, Jan Svec wrote:
>>>>> As far as HW requirements are concerned, everything is fine.
>>>>> However, if you are running a UMD 2 version you are not
>>>>> benefitting from the very latest performance improvements we have
>>>>> released in EMI 3 to better cope with the GLUE 2 load. The amount
>>>>> of GLUE 2 information is now higher than GLUE 1.3 and we have
>>>>> tuned the DB backend and the LDAP configuration in EMI 3. We have
>>>>> observed from the sites who reported performance problems that
>>>>> when they upgrade to EMI 3, everything works fine again.
>>>>>
>>>>> So in this particular case I suggest upgrading to EMI 3 to improve
>>>>> the performance. Let me know if you have further questions.
>>>> OK, I will upgrade to EMI 3 and let you know if it helped. BTW is
>>>> this release working fine with SL 5? Alessandro Paolini mentioned
>>>> some problems with SL5 in ticket 92959, if I remember correctly…
>>>>
>>>> Thank you,
>>>> Jan
>>> hi Jan,
>>> the ticket you mentioned,
>>> https://ggus.eu/tech/ticket_show.php?ticket=92959, was opened for
>>> load issues similar to yours and was solved with the tuning Maria
>>> described; there is no mention of problems on SL5 (our top-BDIIs
>>> are on SL6).
>>>
>>> Perhaps you read this other ticket,
>>> https://ggus.eu/tech/ticket_show.php?ticket=93878, which I opened
>>> because when upgrading a VOMS server I didn't notice an important
>>> operation needed to configure the BDII, one that wasn't mentioned
>>> in the VOMS guide.
>>>
>>> Cheers,
>>> Alessandro
>> Hi Alessandro,
>>
>> I meant the ticket 92959 update#26, but maybe I am missing the
>> context and it has nothing to do with changes made in EMI3 :)
>>
>> Cheers
>> Jan
> Hi Jan,
> ok: Maria asked me to enable the LDAP monitoring interface to
> understand how many LDAP connections there were, so before doing it
> on the production server I tried enabling it on a test BDII which
> runs SL5, but I ran into some problems because the things to modify
> are slightly different on SL5.
>
> In general the latest EMI-3 update works fine on both SL5 and SL6.
>
> cheers,
> Alessandro
>
...and the important modification to make is reported in update#10:
https://ggus.eu/tech/ticket_show.php?ticket=92959#update#10

cheers,
Alessandro
=========================================================================
Date:         Mon, 27 May 2013 14:50:29 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jan Svec <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

On 27. 5. 2013, at 14:45, Alessandro Paolini <[log in to unmask]> wrote:
> Hi Jan,
> ok: Maria asked me to enable the LDAP monitoring interface to understand how many LDAP connections there were, so before doing it on the production server I tried enabling it on a test BDII which runs SL5, but I ran into some problems because the things to modify are slightly different on SL5.
>
> In general the latest EMI-3 update works fine on both SL5 and SL6.
>

Hi Alessandro,

thank you for the clarification, I will update to EMI-3 then.

Cheers
Jan
=========================================================================
Date:         Mon, 27 May 2013 13:29:08 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Re: top level BDIIs - how many?
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear Jan,

> > In general the latest EMI-3 update works fine on both SL5 and SL6.
> >
>
> Hi Alessandro,
>
> thank you for the clarification, I will update to EMI-3 then.

As mentioned by Alessandro, there is an important workaround that needs to be applied manually:

At line 150 of /etc/init.d/bdii, add the following line:

$RUNUSER -s /bin/sh ${BDII_USER} -c "ln -sf ${DB_CONFIG}_top ${SLAPD_DB_DIR}/glue/DB_CONFIG"

This is a known issue, already tracked in Savannah; it was detected when debugging the performance problems at CNAF:
https://savannah.cern.ch/bugs/?101090

This will be released in a future version of the BDII, but for the time being it has to be done manually.
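For admins who want to script this fix, Maria's edit can be done with a sed append. This is only a sketch: the line number (150) and the workaround line are taken verbatim from her mail, but init scripts differ between BDII versions, so verify the insertion point in your local /etc/init.d/bdii first. The demo below works on a scratch file so nothing real is touched:

```shell
# Sketch of the manual workaround: append the DB_CONFIG symlink line
# after line 150 of the BDII init script. BDII_INIT is a scratch
# stand-in here; point it at /etc/init.d/bdii (after taking a backup)
# to apply it for real.
WORKAROUND='$RUNUSER -s /bin/sh ${BDII_USER} -c "ln -sf ${DB_CONFIG}_top ${SLAPD_DB_DIR}/glue/DB_CONFIG"'

BDII_INIT=$(mktemp)                  # stand-in for /etc/init.d/bdii
seq 1 200 | sed 's/^/# line /' > "$BDII_INIT"

# GNU sed: '150a' appends the new text after line 150
sed -i "150a $WORKAROUND" "$BDII_INIT"

sed -n '149,152p' "$BDII_INIT"       # show the insertion in context
```

On the real script, take a backup (`cp /etc/init.d/bdii /etc/init.d/bdii.bak`) before running the sed.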

Now that EMI is over I'm planning to document the existing known issues in the Information System web pages. I'll send you the link when this is ready.

Regards,
Maria
=========================================================================
Date:         Tue, 28 May 2013 15:08:44 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Publication of MaxCPUTime and MaxWallTime glue attributes
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear all,

I'm trying to understand why there are so many WLCG sites publishing the default values of GLUE2ComputingShareMaxCPUTime and (in fewer cases) GLUE2ComputingShareMaxWallTime.

In the case of CREAM, these two variables are published by default as:

GLUE2ComputingShareMaxCPUTime: 999999999
GLUE2ComputingShareMaxWallTime: 999999999

Then these variables are modified by the local batch system information provider, which uses batch system configuration variables to populate the BDII.

In the case of Torque:

GLUE2ComputingShareMaxWallTime: resources_default.walltime if defined, else resources_max.walltime
GLUE2ComputingShareMaxCPUTime: resources_default.cput if defined, else resources_max.cput

In the case of LSF:

GLUE2ComputingShareMaxWallTime: RUNLIMIT
GLUE2ComputingShareMaxCPUTime: CPULIMIT
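The Torque rule above (publish the resources_default value if defined, else fall back to resources_max) can be sketched with shell parameter-default expansion. This is an illustration of the selection logic only, not the actual info-dynamic-scheduler code, and the sample values are invented for the demo; real values come from the batch system configuration:

```shell
# Invented qmgr-style queue settings; in reality these come from
# 'qmgr -c "print server"' output for the queue in question.
resources_default_cput=86400
resources_max_cput=172800
resources_max_walltime=172800      # no default set for walltime

# The fallback rule quoted above: default if defined, else max
cput=${resources_default_cput:-$resources_max_cput}
walltime=${resources_default_walltime:-$resources_max_walltime}

echo "GLUE2ComputingShareMaxCPUTime: $cput"       # default wins
echo "GLUE2ComputingShareMaxWallTime: $walltime"  # falls back to max
```

If neither variable is set, the CREAM default of 999999999 would survive, which is one way the "all nines" values below can arise.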

I don't know how this is done for other batch systems; I'm happy to learn from the relevant people writing other information providers.

So the question is: are sites defining the relevant variables in their local batch system configuration? If yes, could it be that they are running old versions of the information providers, and that is the reason why the default values are published? AFAIK, the EMI 3 versions of info-dynamic-scheduler-pbs and info-dynamic-scheduler-lsf use the mentioned variables (not sure if this is also the case for earlier versions). Could it also be that the 999999999 values are defined on purpose, meaning there is no limit for the CPU time in the corresponding queue?

The MaxCPUTime variable is used by LHCb to calculate the queue length. It is therefore very important that a correct value is defined there. Could some sysadmins comment on how they are doing this and whether they publish 999999999 on purpose?

The list of sites that are publishing default values is included below. Thanks very much in advance for the feedback!

Thanks!
Maria

ldapsearch -LLL -x -h localhost -p 2170 -b GLUE2GroupID=grid,o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq
BelGrid-UCL
CA-ALBERTA-WESTGRID-T2
CA-SCINET-T2
CA-VICTORIA-WESTGRID-T2
CERN-PROD
CYFRONET-LCG2
DESY-HH
FZK-LCG2
GoeGrid
GR-07-UOI-HEPLAB
ICM
IFCA-LCG2
INFN-LNL-2
INFN-NAPOLI-ATLAS
INFN-PISA
INFN-T1
LIP-Coimbra
LIP-Lisbon
LRZ-LMU
NCG-INGRID-PT
praguelcg2
PSNC
ru-PNPI
RU-SPbSU
SFU-LCG2
SiGNET
TR-03-METU
TR-10-ULAKBIM
TRIUMF-LCG2
UA-BITP
UA-KNU
UKI-NORTHGRID-LIV-HEP
UKI-NORTHGRID-SHEF-HEP
UKI-SOUTHGRID-RALPP
UNI-FREIBURG
wuppertalprod
=========================================================================
Date:         Tue, 28 May 2013 17:06:47 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Jones <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Maria,

On 05/28/2013 04:08 PM, Maria Alandes Pradillo wrote:
> UKI-NORTHGRID-LIV-HEP

Our site BDII emits nothing whatsoever related to hepgrid97, e.g.:

$ ldapsearch -LLL -x -h hepgrid4.ph.liv.ac.uk -p 2170 -b GLUE2GroupID=grid,o=glue | grep hepgrid97

Yet a query to some top level BDII says plenty:

$ ldapsearch -LLL -x -h lcg-bdii.cern.ch -p 2170 -b GLUE2GroupID=grid,o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' | perl -p00e 's/\r?\n //g' | grep hepgrid97 | wc
     350     700   30065

Hm... that's mysterious. How can a top level BDII get at this data, when the machine itself is not hooked into our BDII? I'll have a good look at this, and let you know.

Steve

-- 
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/
=========================================================================
Date:         Wed, 29 May 2013 09:41:44 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: cc: Stephen Jones <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------090905090307090007030906"
Message-ID:  <[log in to unmask]>


On 28/05/2013 18:06, Stephen Jones wrote:
> Hi Maria,
>
> On 05/28/2013 04:08 PM, Maria Alandes Pradillo wrote:
>> UKI-NORTHGRID-LIV-HEP
>
> Our site BDII emits nothing whatsoever related to hepgrid97, e.g.:
>
> $ ldapsearch -LLL -x -h hepgrid4.ph.liv.ac.uk -p 2170 -b
> GLUE2GroupID=grid,o=glue | grep hepgrid97
>
> Yet a query to some top level BDII says plenty:
>
> $ ldapsearch -LLL -x -h lcg-bdii.cern.ch -p 2170 -b
> GLUE2GroupID=grid,o=glue
> '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))'
> | perl -p00e 's/\r?\n //g' | grep hepgrid97 | wc
>     350     700   30065
>
> Hm... that's mysterious. How can a top level BDII get at this data,
> when the machine itself is not hooked into our BDII? I'll have a good
> look at this, and let you know.
>
> Steve
>
hi Steve,
in the case of a site BDII, you should use the branch "-b GLUE2DomainID=UKI-NORTHGRID-LIV-HEP":

$ ldapsearch -LLL -x -h hepgrid4.ph.liv.ac.uk -p 2170 -b GLUE2DomainID=UKI-NORTHGRID-LIV-HEP,o=glue | grep hepgrid97
[...]
GLUE2EntityOtherInfo: InfoProviderHost=hepgrid97.ph.liv.ac.uk
  -opt_hepgrid97.ph.liv.ac.uk,GLUE2ResourceID=hepgrid97.ph.liv.ac.uk,GLUE2Servi
  ceID=hepgrid97.ph.liv.ac.uk_ComputingElement,GLUE2GroupID=resource,GLUE2Domai
  t_hepgrid97.ph.liv.ac.uk
GLUE2ApplicationEnvironmentComputingManagerForeignKey: hepgrid97.ph.liv.ac.uk_
GLUE2EntityOtherInfo: InfoProviderHost=hepgrid97.ph.liv.ac.uk
  -opt_hepgrid97.ph.liv.ac.uk,GLUE2ResourceID=hepgrid97.ph.liv.ac.uk,GLUE2Servi
  ceID=hepgrid97.ph.liv.ac.uk_ComputingElement,GLUE2GroupID=resource,GLUE2Domai
  t_hepgrid97.ph.liv.ac.uk
[...]

Cheers,
Alessandro

-- 
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"I believe in the power of laughter and tears"
    "as an antidote to hatred and terror"
         "a day without a smile"
              "is a day wasted" >>> Charlie Chaplin


=========================================================================
Date:         Wed, 29 May 2013 11:03:13 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Jones <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: cc: Maria Alandes Pradillo <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Maria,

I don't know the answer yet, but I'm still looking.

I've been digging around a bit here, and here is just one interesting fact.
When I run the plugin on an EMI2 CE, I get this:

[root@hepgrid10 ~]# /usr/libexec/info-dynamic-pbs /etc/lrms/pbs.conf | grep MaxCPU | tail -1

GLUE2ComputingShareMaxCPUTime: 2880

Yet when I run the same on an EMI3 CE, I get this:

[root@hepgrid97 gip]# /usr/libexec/info-dynamic-pbs /etc/lrms/pbs.conf | grep MaxCPU | tail -1

GLUE2ComputingShareMaxCPUTime: 172800

And 2880 * 60 is 172800, so they've switched it to seconds - is that weird and pointless? I dunno.

I'll keep looking.


Steve



On 05/28/2013 04:08 PM, Maria Alandes Pradillo wrote:
> Dear all,
>
> I'm trying to understand why there are so many WLCG sites publishing the default values of GLUE2ComputingShareMaxCPUTime and (less cases) GLUE2ComputingShareMaxWallTime.
>
> In the case of CREAM, these two variables are published by default as:
>
> GLUE2ComputingShareMaxCPUTime: 999999999
> GLUE2ComputingShareMaxWallTime: 999999999
>
> Then these variables are modified by the local batch system information provider, which uses batch system configuration variables to populate the BDII:
>
> In case of Torque:
>
> GLUE2ComputingShareMaxWallTime: resources_default.walltime if defined, else resources_max.walltime
> GLUE2ComputingShareMaxCPUTime: resources_default.cput if defined, else resources_max.cput
>
> In case of LSF:
>
> GLUE2ComputingShareMaxWallTime: RUNLIMIT
> GLUE2ComputingShareMaxCPUTime: CPULIMIT
>
> I don't know how this is done for other batch systems, I'm happy to learn from the relevant people writing other information providers.
>
> So the question is, are sites defining the relevant variables in their local batch system configuration? If yes, could it be that they are running old versions of the information providers and that is the reason why the default values are published? AFAIK, the EMI 3 versions of info-dynamic-scheduler-pbs and info-dynamic-scheduler-lsf use the mentioned variables (not sure if this is also the case for earlier versions). Could it also be that the 999999999 values are defined on purpose meaning there is no limit for the CPU time in the corresponding queue?
>
> The MaxCPUTime variable is used by LHCb to calculate the queue length. It is therefore very important that there is a correct value defined there. Could some sys admins comment on how they are doing this and whether they publish 999999999 on purpose?
>
> The list of sites that are publishing default values is included below. Thanks very much in advance for the feedback!
>
> Thanks!
> Maria
>
> ldapsearch -LLL -x -h localhost -p 2170 -b GLUE2GroupID=grid,o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq
> BelGrid-UCL
> CA-ALBERTA-WESTGRID-T2
> CA-SCINET-T2
> CA-VICTORIA-WESTGRID-T2
> CERN-PROD
> CYFRONET-LCG2
> DESY-HH
> FZK-LCG2
> GoeGrid
> GR-07-UOI-HEPLAB
> ICM
> IFCA-LCG2
> INFN-LNL-2
> INFN-NAPOLI-ATLAS
> INFN-PISA
> INFN-T1
> LIP-Coimbra
> LIP-Lisbon
> LRZ-LMU
> NCG-INGRID-PT
> praguelcg2
> PSNC
> ru-PNPI
> RU-SPbSU
> SFU-LCG2
> SiGNET
> TR-03-METU
> TR-10-ULAKBIM
> TRIUMF-LCG2
> UA-BITP
> UA-KNU
> UKI-NORTHGRID-LIV-HEP
> UKI-NORTHGRID-SHEF-HEP
> UKI-SOUTHGRID-RALPP
> UNI-FREIBURG
> wuppertalprod


-- 
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/
=========================================================================
Date:         Wed, 29 May 2013 12:13:32 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Oxana Smirnova <[log in to unmask]>
Organization: Lund University
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms090701070509060302020402"
Message-ID:  <[log in to unmask]>


Hi,

Stephen Burke would know for sure, but as far as I understand, the reason for all these oddities is that the GLUE2 specification, and the LDAP rendering in particular, was still changing during EMI, and I am sure there are still some places to polish. The 999999999 figure is probably the suggested default in the specs, which in some cases is not overwritten by the actual values (likely when the actual values are unknown).

Cheers,
Oxana

On 29.05.2013 12:03, Stephen Jones wrote:
> Hi Maria,
>
> I don't know the answer yet, but I'm still looking.
>
> I've been digging around a bit here, and here is just one interesting
> fact.
> When I run the plugin on an EMI2 CE, I get this:
>
> [root@hepgrid10 ~]# /usr/libexec/info-dynamic-pbs /etc/lrms/pbs.conf |
> grep MaxCPU | tail -1
>
> GLUE2ComputingShareMaxCPUTime: 2880
>
> Yet when I run the same on an EMI3 CE, I get this:
>
> [root@hepgrid97 gip]# /usr/libexec/info-dynamic-pbs /etc/lrms/pbs.conf
> | grep MaxCPU | tail -1
>
> GLUE2ComputingShareMaxCPUTime: 172800
>
> And 2880 * 60 is 172800, so they've switched it to seconds - is that
> weird and pointless? I dunno.
>
> I'll keep looking.
>
>
> Steve
>
>
>
>
>
> On 05/28/2013 04:08 PM, Maria Alandes Pradillo wrote:
>> Dear all,
>>
>> I'm trying to understand why there are so many WLCG sites publishing
>> the default values of GLUE2ComputingShareMaxCPUTime and (less cases)
>> GLUE2ComputingShareMaxWallTime.
>>
>> In the case of CREAM, these two variables are published by default as:
>>
>> GLUE2ComputingShareMaxCPUTime: 999999999
>> GLUE2ComputingShareMaxWallTime: 999999999
>>
>> Then these variables are modified by the local batch system
>> information provider, which uses batch system configuration variables
>> to populate the BDII:
>>
>> In case of Torque:
>>
>> GLUE2ComputingShareMaxWallTime: resources_default.walltime if
>> defined, else resources_max.walltime
>> GLUE2ComputingShareMaxCPUTime: resources_default.cput if defined,
>> else resources_max.cput
>>
>> In case of LSF:
>>
>> GLUE2ComputingShareMaxWallTime: RUNLIMIT
>> GLUE2ComputingShareMaxCPUTime: CPULIMIT
>>
>> I don't know how this is done for other batch systems, I'm happy to
>> learn from the relevant people writing other information providers.
>>
>> So the question is, are sites defining the relevant variables in
>> their local batch system configuration? If yes, could it be that they
>> are running old versions of the information providers and that is the
>> reason why the default values are published? AFAIK, the EMI 3
>> versions of info-dynamic-scheduler-pbs and info-dynamic-scheduler-lsf
>> use the mentioned variables (not sure if this is also the case for
>> earlier versions). Could it also be that the 999999999 values are
>> defined on purpose meaning there is no limit for the CPU time in the
>> corresponding queue?
>>
>> The MaxCPUTime variable is used by LHCb to calculate the queue
>> length. It is therefore very important that there is a correct value
>> defined there. Could some sys admins comment on how they are doing
>> this and whether they publish 999999999 on purpose?
>>
>> The list of sites that are publishing default values is included
>> below. Thanks very much in advance for the feedback!
>>
>> Thanks!
>> Maria
>>
>> ldapsearch -LLL -x -h localhost -p 2170 -b GLUE2GroupID=grid,o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq
>> BelGrid-UCL
>> CA-ALBERTA-WESTGRID-T2
>> CA-SCINET-T2
>> CA-VICTORIA-WESTGRID-T2
>> CERN-PROD
>> CYFRONET-LCG2
>> DESY-HH
>> FZK-LCG2
>> GoeGrid
>> GR-07-UOI-HEPLAB
>> ICM
>> IFCA-LCG2
>> INFN-LNL-2
>> INFN-NAPOLI-ATLAS
>> INFN-PISA
>> INFN-T1
>> LIP-Coimbra
>> LIP-Lisbon
>> LRZ-LMU
>> NCG-INGRID-PT
>> praguelcg2
>> PSNC
>> ru-PNPI
>> RU-SPbSU
>> SFU-LCG2
>> SiGNET
>> TR-03-METU
>> TR-10-ULAKBIM
>> TRIUMF-LCG2
>> UA-BITP
>> UA-KNU
>> UKI-NORTHGRID-LIV-HEP
>> UKI-NORTHGRID-SHEF-HEP
>> UKI-SOUTHGRID-RALPP
>> UNI-FREIBURG
>> wuppertalprod
>
>



=========================================================================
Date:         Wed, 29 May 2013 10:13:52 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Burke <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
> And 2880 * 60 is 172800, so they've switched it to seconds - is that
> weird and pointless? I dunno.

Yes, it has switched to seconds in GLUE 2 - whether weird and pointless I'm not sure. In GLUE 1 it's inconsistent, some things are in seconds and some in minutes, so in GLUE 2 it's standardised on seconds for everything, but the first CREAM release still had minutes.
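As a quick sanity check of the unit change, the two figures from Steve's EMI2/EMI3 comparison earlier in the thread are the same 48-hour cput limit expressed in minutes and in seconds:

```shell
# GLUE 1 (EMI2 CE) published minutes; GLUE 2 (EMI3 CE) publishes
# seconds. Values taken from the info-dynamic-pbs output above.
glue1_minutes=2880
glue2_seconds=172800

echo "$((glue1_minutes * 60)) seconds"    # 172800
echo "$((glue2_seconds / 3600)) hours"    # 48
```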

The most fundamental question is what you think you've configured in the batch system - do you have CPU and walltime limits, and if so what are they?

Stephen


-- 
Scanned by iCritical.
=========================================================================
Date:         Wed, 29 May 2013 10:16:46 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Burke <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

LHC Computer Grid - Rollout [mailto:[log in to unmask]]
> On Behalf Of Oxana Smirnova said:
> The 999999999 figure
> probably is the suggested default in the specs, which in some cases is
> not overwritten by the actual values (likely when the actual values are
> unknown).

There are two possible reasons to get the "all nines" value: either
there's a bug in the info provider, or the batch system really has no
limit set, in which case that value effectively means infinite. What
we're trying to establish is whether some sites intentionally don't set
a time limit, and if so what the rationale is.

Stephen
=========================================================================
Date:         Wed, 29 May 2013 11:54:25 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Jones <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Stephen,

I think we will get to the bottom of this now, as I'll show.

On 05/29/2013 11:13 AM, Stephen Burke wrote:
> LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
>> And 2880 * 60 is 172800, so they've switched it to seconds - is that weird and pointless? I dunno.
> Yes, it has switched to seconds in GLUE 2 - whether weird and pointless I'm not sure. In GLUE 1 it's
> inconsistent, some things are in seconds and some in minutes, so in GLUE 2 it's standardised
> on seconds for everything, but the first CREAM release still had minutes.

It's a good idea but it made me think.

> what you think you've configured in the batch system - do you have CPU and Walltime limits

That brings us to the second finding. I was going to do a bit more work
to confirm this, but here is my suspicion, which may confirm Maria's
observation.

Background: we temporarily put up a new CE called hepgrid97 and put it
in the site BDII. I'll say no more about that, other than that it was
also in the GOCDB as "not production".

Problem: when I removed that CE from our site BDII (i.e. from the file
/etc/bdii/gip/site-urls.conf), the records pertaining to the new system
persisted for many days (or forever?), even though it had dried up as a
data source once we removed it.

In other words, SOME FALSE records were kept in a cache in the site BDII
even though the source had dried up. That's wrong: when I take something
out of the BDII, I expect it to stay gone, not to linger as a ghost.

We confirmed this with Maria's original query - it showed the ghost
records from Liverpool, populated with 999999999 values (obviously
because the source was gone). Maria's query (which no longer shows LIV
records, because hepgrid97 is back in the site BDII now) was:

$ ldapsearch -LLL -x -h lcg-bdii.gridpp.ac.uk -p 2170 -b GLUE2GroupID=grid,o=glue \
    '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' \
    | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 \
    | sort | uniq

I hope this helps - there was nothing wrong with taking the CE out of
the site BDII, but in my opinion the code should be changed to discard
stale "ghost" values; they are not helpful.

What do you think - should I do more work to confirm this? Or is it
expected behaviour, or something else I don't understand? Please let me
know.

Cheers,

Steve

-- 
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/
=========================================================================
Date:         Wed, 29 May 2013 12:25:12 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Jones <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: cc: Stephen Burke <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

On 05/29/2013 11:13 AM, Stephen Burke wrote:
> LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
>> And 2880 * 60 is 172800, so they've switched it to seconds - is that weird and
>> pointless? I dunno.
> Yes, it has switched to seconds in GLUE 2 - whether weird and pointless I'm not sure. In GLUE 1 it's inconsistent, some things are in seconds and some in minutes, so in GLUE 2 it's standardised on seconds for everything, but the first CREAM release still had minutes.
>
> The most fundamental question is what you think you've configured in the batch system - do you have CPU and Walltime limits, and if so what are they?
>
> Stephen
>
>

Hi Stephen,

Perhaps the old hepgrid97 data is cached in here /var/lib/bdii/old.ldif ?

[root@hepgrid4 ~]# grep hepgrid97 /var/lib/bdii/old.ldif | wc
    5871    8711  395566

That file contains old data from hepgrid97, even though I took hepgrid97 
CE out of the BDII a few minutes ago:

[root@hepgrid4 ~]# cat /etc/bdii/gip/site-urls.conf
CE  ldap://hepgrid5.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
CE2  ldap://hepgrid6.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
CE3  ldap://hepgrid10.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
SE  ldap://hepgrid11.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
SITE_BDII ldap://hepgrid4.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
ARGUS  ldap://hepgrid9.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid

Why is that rubbishy old data still in there, I wonder?

Steve

-- 
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/
=========================================================================
Date:         Wed, 29 May 2013 13:51:37 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: cc: Stephen Jones <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/alternative;
              boundary="------------010700010608080105050501"
Message-ID:  <[log in to unmask]>

This is a multi-part message in MIME format.
--------------010700010608080105050501
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by smtpauth1.cnaf.infn.it id r4TBp2BN019366

On 29/05/2013 13:25, Stephen Jones wrote:
> On 05/29/2013 11:13 AM, Stephen Burke wrote:
>> LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
>>> And 2880 * 60 is 172800, so they've switched it to seconds - is that
>>> weird and pointless? I dunno.
>> Yes, it has switched to seconds in GLUE 2 - whether weird and
>> pointless I'm not sure. In GLUE 1 it's inconsistent, some things are
>> in seconds and some in minutes, so in GLUE 2 it's standardised on
>> seconds for everything, but the first CREAM release still had minutes.
>>
>> The most fundamental question is what you think you've configured in
>> the batch system - do you have CPU and Walltime limits, and if so
>> what are they?
>>
>> Stephen
>
> Hi Stephen,
>
> Perhaps the old hepgrid97 data is cached in here /var/lib/bdii/old.ldif ?
>
> [root@hepgrid4 ~]# grep hepgrid97 /var/lib/bdii/old.ldif | wc
>    5871    8711  395566
>
> That file contains old data from hepgrid97, even though I took
> hepgrid97 CE out of the BDII a few minutes ago:
>
> [root@hepgrid4 ~]# cat /etc/bdii/gip/site-urls.conf
> CE  ldap://hepgrid5.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
> CE2  ldap://hepgrid6.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
> CE3  ldap://hepgrid10.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
> SE  ldap://hepgrid11.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
> SITE_BDII ldap://hepgrid4.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
> ARGUS  ldap://hepgrid9.ph.liv.ac.uk:2170/mds-vo-name=resource,o=grid
>
> Why is that rubbishy old data still in there, I wonder?
>
> Steve
>
Hi Stephen,
try restarting the BDII on the site BDII: sometimes a site BDII continues
to publish old information, and a restart is needed to clean it out (at
least that is what we noticed in our NGI resource centres).

cheers,
Alessandro

-- 
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"I believe in the power of laughter and tears"
   "as an antidote to hatred and terror"
        "a day without a smile"
             "is a day lost" >>> Charlie Chaplin



--------------010700010608080105050501--
=========================================================================
Date:         Wed, 29 May 2013 12:54:03 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear Stephen,

Why are you publishing obsolete entries?

As far as both GLUE 1 and GLUE 2 are concerned, top and site BDIIs have a
cache file which caches information for 10 minutes under
/var/lib/bdii/gip/cache/gip/. If you remove a resource from your site in
your site-urls.conf and restart your BDII immediately, you will still see
the resource published for 10 minutes because of the cache.

This entry will be removed from the cache after 10 minutes in the case of
GLUE 1. In the case of GLUE 2 it will also be removed from the cache file,
but unfortunately not from the GLUE 2 LDAP tree, due to a known bug:
https://savannah.cern.ch/bugs/?101237.

Only after restarting the BDII and cleaning the cache, to make sure we
don't populate the tree again, will we be able to get rid of those
obsolete GLUE 2 entries.
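[Editorial note: the restart-plus-cache-clean workaround described above can be sketched roughly as below. The cache path and old.ldif file are the ones named in this thread; the service name and the parameterisation (so it can be rehearsed against a scratch directory) are assumptions, not a definitive procedure.]

```shell
# Sketch of the bug #101237 workaround: stop the BDII, drop the cached
# GIP output and the stale-entry file so obsolete GLUE 2 entries are
# not re-populated into the LDAP tree, then start the BDII again.
BDII_CACHE=${BDII_CACHE:-/var/lib/bdii/gip/cache/gip}   # cache dir from this message
OLD_LDIF=${OLD_LDIF:-/var/lib/bdii/old.ldif}            # stale-entry file seen on hepgrid4

# service bdii stop        # uncomment on a real site BDII
rm -f "$BDII_CACHE"/* "$OLD_LDIF"
# service bdii start
```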

I guess this is what you must have done, since I can't see that resource
any more in your site BDII, for both GLUE 1 and GLUE 2:

[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h hepgrid4.ph.liv.ac.uk -p 2170 \
    -b o=grid '(objectClass=GlueCE)' GlueCEUniqueID | grep GlueCEUniqueID:
GlueCEUniqueID: hepgrid5.ph.liv.ac.uk:8443/cream-pbs-long
GlueCEUniqueID: hepgrid6.ph.liv.ac.uk:8443/cream-pbs-long
GlueCEUniqueID: hepgrid10.ph.liv.ac.uk:8443/cream-pbs-long

[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h hepgrid4.ph.liv.ac.uk -p 2170 \
    -b o=glue '(objectClass=GLUE2ComputingService)' GLUE2ServiceID | grep GLUE2ServiceID:
GLUE2ServiceID: hepgrid6.ph.liv.ac.uk_ComputingElement
GLUE2ServiceID: hepgrid5.ph.liv.ac.uk_ComputingElement
GLUE2ServiceID: hepgrid10.ph.liv.ac.uk_ComputingElement

So this is a known issue. I'm planning to release a fix for it very soon.
It affects all sites, and we have already done a small cleaning campaign
very recently in those sites publishing a high number of obsolete entries.

Now, coming back to my original request. There are indeed obsolete GLUE 2
Shares publishing 999999999 (at least 27 sites are suffering from this)
due to the mentioned bug :-(:

[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h localhost -p 2170 \
    -b GLUE2GroupID=grid,o=glue \
    '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999)(!(GLUE2EntityCreationTime=2013-05-29*)))' \
    GLUE2ShareID | grep GLUE2ShareID: | sort | uniq | wc -l
608
[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h localhost -p 2170 \
    -b GLUE2GroupID=grid,o=glue \
    '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999)(!(GLUE2EntityCreationTime=2013-05-29*)))' \
    | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq | wc -l
27

But I can still see up-to-date GLUE 2 Shares publishing 999999999 (in at
least 13 sites):

[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h localhost -p 2170 \
    -b GLUE2GroupID=grid,o=glue \
    '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999)(GLUE2EntityCreationTime=2013-05-29*))' \
    GLUE2ShareID | grep GLUE2ShareID: | sort | uniq | wc -l
199
[root@top-bdii-sl6 ~]# ldapsearch -LLL -x -h localhost -p 2170 \
    -b GLUE2GroupID=grid,o=glue \
    '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999)(GLUE2EntityCreationTime=2013-05-29*))' \
    | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq | wc -l
13

So for those ones, as Stephen Burke has explained, we would like to
understand whether it is intentional to publish 999999999 and what the
reason is. If not, are the batch system parameters being properly
configured?

Thanks a lot in advance,
Maria
=========================================================================
Date:         Wed, 29 May 2013 13:16:46 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Brian Davies <[log in to unmask]>
Subject:      voms-proxy-init error with  SL6 EMI-3 UI
Content-Type: multipart/alternative;
              boundary="_000_47595AB42F9D5449900E5ADBD1A6230A4D9E3560EXCHMBX01fedccl_"
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

--_000_47595AB42F9D5449900E5ADBD1A6230A4D9E3560EXCHMBX01fedccl_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hi,
We have an SL6 EMI-3 UI at RAL in the UK.
voms-proxy-init works for dteam and CMS. However, when trying to use an
ATLAS certificate we get the following error message:

-bash-4.1$ voms-proxy-init -voms atlas
Enter GRID pass phrase for this identity:
Contacting vo.racf.bnl.gov:15003 [/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov] "atlas"...
Unexpected end of file from server

Questions:
Has anyone seen this error before?
Does anyone have a working SL6 EMI-3 UI that ATLAS users use?
Is there any way to force voms-proxy-init to use the CERN VOMS server rather than the BNL server?
Is there a problem with the BNL server?
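[Editorial note: on forcing the CERN server - one possible, untested approach is to give the UI user a vomses entry that lists only the CERN endpoint, since voms-proxy-init picks its endpoints from the vomses files it loads. The file name below is a hypothetical choice and the lookup order can differ between client versions; the host, port, and server DN are taken from the debug output in this message.]

```shell
# Hypothetical sketch: write a per-user vomses entry for the atlas
# alias that names only the CERN VOMS endpoint.
# Line format: "alias" "host" "port" "server DN" "VO name"
mkdir -p ~/.glite/vomses
cat > ~/.glite/vomses/atlas-lcg-voms.cern.ch <<'EOF'
"atlas" "lcg-voms.cern.ch" "15001" "/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch" "atlas"
EOF
```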

In debug mode the end of the process is as follows:

Loaded vomses information 'VOMSServerInfo [alias=atlas, voName=atlas,
URL=voms://lcg-voms.cern.ch:15001,
vomsServerDN=/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch]' from
/home/tier1/daviesbg/.glite/vomses.
Contacting vo.racf.bnl.gov:15003 [/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov] "atlas"...
Sent HTTP request for https://vo.racf.bnl.gov:15003/generate-ac?fqans=/atlas&lifetime=43200
Unexpected end of file from server
org.italiangrid.voms.VOMSError: Unexpected end of file from server
        at org.italiangrid.voms.request.impl.RESTProtocol.doRequest(RESTProtocol.java:95)
        at org.italiangrid.voms.request.impl.DefaultVOMSACService.doRequest(DefaultVOMSACService.java:136)
        at org.italiangrid.voms.request.impl.DefaultVOMSACService.getVOMSAttributeCertificate(DefaultVOMSACService.java:184)
        at org.italiangrid.voms.clients.impl.DefaultVOMSProxyInitBehaviour.getAttributeCertificates(DefaultVOMSProxyInitBehaviour.java:446)
        at org.italiangrid.voms.clients.impl.DefaultVOMSProxyInitBehaviour.initProxy(DefaultVOMSProxyInitBehaviour.java:169)
        at org.italiangrid.voms.clients.VomsProxyInit.execute(VomsProxyInit.java:263)
        at org.italiangrid.voms.clients.VomsProxyInit.<init>(VomsProxyInit.java:55)
        at org.italiangrid.voms.clients.VomsProxyInit.main(VomsProxyInit.java:40)
Caused by: java.net.SocketException: Unexpected end of file from server
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:770)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:767)
        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1162)
        at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:397)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
        at org.italiangrid.voms.request.impl.RESTProtocol.doRequest(RESTProtocol.java:88)
        ... 7 more

Brian

-- 
Scanned by iCritical.


--_000_47595AB42F9D5449900E5ADBD1A6230A4D9E3560EXCHMBX01fedccl_--
=========================================================================
Date:         Wed, 29 May 2013 14:31:41 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Jones <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: cc: Maria Alandes Pradillo <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Hi Maria,

On 05/29/2013 01:54 PM, Maria Alandes Pradillo wrote:
> Why are you publishing obsolete entries?

Perhaps because bug https://savannah.cern.ch/bugs/?101237 kept obsolete
entries from an old server in the BDII output?

> This entry will be removed from the cache after 10min in the case of
> GLUE 1. In the case of GLUE 2, it will also be removed from the cache
> file, but unfortunately not from the GLUE 2 LDAP tree due to a known
> bug https://savannah.cern.ch/bugs/?101237.

The reports I am making appear to be a rediscovery of
https://savannah.cern.ch/bugs/?101237

> Only after restarting the BDII and cleaning the cache to make sure we
> don't populate the tree again, we will be able to get rid of those
> obsolete GLUE 2 entries.

That is what I have found today, with help from Alessandro and trial and
error.

> we have already done a small cleaning campaign very recently in those
> sites publishing a high number of obsolete entries.

Re: the provenance of "obsolete entries" - they arose in another server,
also called hepgrid97, that had been built previously with EMI 2. Once
they were in the site BDII, they did not get flushed as they should have
been when hepgrid97 was rebuilt with EMI 3; they remained "stuck" in the
site BDII. That is my assumption. Please check whether any obsolete
values are still being published. What are their tags? Are they gone
from Liverpool? If so, then the restart workaround for bug 101237 has
removed them.

> Now, coming back to my original request. There are indeed obsolete
> GLUE 2 Shares publishing 999999999 (at least 27 sites are suffering
> from this) due to the mentioned bug :-(:

We (Liverpool) are not in those lists. The workaround for bug
https://savannah.cern.ch/bugs/?101237 has fixed us. I suggest other
sites try that first, then dig deeper.

Please let me know if we can help further; meanwhile we await the
version of bdii that resolves bug 101237.

Cheers,


Steve


-- 
Steve Jones                             [log in to unmask]
System Administrator                    office: 220
High Energy Physics Division            tel (int): 42334
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
University of Liverpool                 http://www.liv.ac.uk/physics/hep/
=========================================================================
Date:         Wed, 29 May 2013 15:40:57 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: To: Brian Davies <[log in to unmask]>
Comments: cc: Andrea Ceccanti <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Brian,

> We have an SL6 EMI-3 UI at RAL in the UK.
>
> voms-proxy-init works for dteam and CMS. However, when trying to use an ATLAS certificate
> we get the following error message:
>
> -bash-4.1$ voms-proxy-init -voms atlas
>
> Enter GRID pass phrase for this identity:
>
> Contacting vo.racf.bnl.gov:15003 [/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov]
> "atlas"...
>
> Unexpected end of file from server

Please open a GGUS ticket about this matter; there appears to be an
incompatibility that could have serious consequences.
=========================================================================
Date:         Wed, 29 May 2013 15:43:11 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Marcus Freudeberg <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Message-ID:  <[log in to unmask]>

On Wed, May 29, 2013 at 01:16:46PM +0000, Brian Davies wrote:
> -bash-4.1$ voms-proxy-init -voms atlas
> Enter GRID pass phrase for this identity:
> Contacting vo.racf.bnl.gov:15003 [/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov] "atlas"...
> Unexpected end of file from server
> 
> Questions:
> Has anyone seen this error before?
> Does anyone have a working SL6 EMI-3 UI that ATLAS users use?
> Is there any way to force voms-proxy-init to use the CERN voms server rather than the BNL server?
> Is there a problem with the BNL server?



Hi Brian!

1. This is exactly the same error message that we got from our EMI-3 UI on SL6.3.

2. Yes.

3. To force another server you can delete the file /etc/vomses/atlas-vo.racf.bnl.gov on your UI
(temporary fix until your next YAIM run). In our case we are authenticating against
lcg-voms.cern.ch now.

4. I have no idea.


There are already tickets concerning this issue:

https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=3320
https://ggus.eu/ws/ticket_info.php?ticket=94300


Cheers,
Marcus.

-- 
Marcus Freudeberg
[log in to unmask]
=========================================================================
Date:         Wed, 29 May 2013 15:49:18 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: cc: Maarten Litmaath <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Il 29/05/2013 15:40, Maarten Litmaath ha scritto:
> Hi Brian,
>
>> We have an SL6 EMI-3 UI at RAL in the UK.
>>
>> voms-proxy-init works for dteam and CMS. However, when trying to use
>> an ATLAS certificate
>> we get the following error message:
>>
>> -bash-4.1$ voms-proxy-init -voms atlas
>>
>> Enter GRID pass phrase for this identity:
>>
>> Contacting vo.racf.bnl.gov:15003
>> [/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov]
>> "atlas"...
>>
>> Unexpected end of file from server
>
> Please open a GGUS ticket about this matter; there appears to be an
> incompatibility that could have serious consequences.
Hi,
I see that they have "VOMS Admin version 2.7.0", but I cannot determine
the voms server version (it doesn't answer on port 2170).

is that server working fine or do you get problems also using another UI?

cheers,
Alessandro

--
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"credo nel potere del riso e delle lacrime"
    "come antidoto all'odio ed al terrore"
         "un giorno senza un sorriso"
              "=E8 un giorno perso" >>> Charlie Chaplin
=========================================================================
Date:         Wed, 29 May 2013 15:50:08 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: cc: Alessandro Paolini <[log in to unmask]>,
          Andrea Ceccanti <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Ciao Alessandro,

> is that server working fine or do you get problems also using another UI?

The BNL server is working fine, no problem when an EMI-2 UI is used.
=========================================================================
Date:         Wed, 29 May 2013 15:53:37 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: To: Brian Davies <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Hi Brian,

> Is there any way to force voms-proxy-init to use the CERN voms server

Servers, there are 2: lcg-voms.cern.ch and voms.cern.ch.
Both should be configured.

> rather than the BNL server?

As Marcus pointed out, removing that server from the configuration
would be one way.  Users can also have their own configuration lines
in ~/.glite/vomses, allowing for convenient aliases to be defined
for any particular VOMS server.  For example, to explicitly _target_
the BNL server:

--------------------------------------------------------------------
$ grep bnl ~/.glite/vomses | fold -s
"bnl" "vo.racf.bnl.gov" "15003"
"/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov" "atlas"
--------------------------------------------------------------------

Then one can use the BNL server as follows:

     voms-proxy-init -voms bnl
     voms-proxy-init -voms bnl:/atlas/Role=.....
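For reference, the alias setup Maarten describes can be scripted. This is a minimal sketch, assuming the quoted BNL server details; the vomses file format is one line per server: nickname, host, port, server DN, VO name.

```shell
# Define a personal "bnl" alias for the ATLAS VOMS server at BNL,
# using the server details quoted above. Overwrites any existing file.
mkdir -p ~/.glite
cat > ~/.glite/vomses <<'EOF'
"bnl" "vo.racf.bnl.gov" "15003" "/DC=org/DC=doegrids/OU=Services/CN=vo.racf.bnl.gov" "atlas"
EOF
grep -c '"bnl"' ~/.glite/vomses
```

After this, `voms-proxy-init -voms bnl` targets the BNL server explicitly while the site-wide /etc/vomses configuration stays untouched.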
=========================================================================
Date:         Wed, 29 May 2013 15:54:27 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Alessandro Paolini <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: cc: Maarten Litmaath <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
Message-ID:  <[log in to unmask]>

Il 29/05/2013 15:50, Maarten Litmaath ha scritto:
> Ciao Alessandro,
>
>> is that server working fine or do you get problems also using another
>> UI?
>
> The BNL server is working fine, no problem when an EMI-2 UI is used.
Hi Maarten,
by chance do you already know the voms version installed at BNL?

cheers,
Alessandro

--
Dr. Alessandro Paolini
INFN - CNAF
Viale Berti Pichat 6/2
40127 Bologna
Italy
tel: +39 051 6092723
fax: +39 051 6092916
ICQ: 192172027
skype: alex.paolini
**********************
"credo nel potere del riso e delle lacrime"
    "come antidoto all'odio ed al terrore"
         "un giorno senza un sorriso"
              "=E8 un giorno perso" >>> Charlie Chaplin
=========================================================================
Date:         Wed, 29 May 2013 13:56:06 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maria Alandes Pradillo <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: To: Stephen Jones <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

Dear Stephen,

> > Why are you publishing obsolete entries?

Sorry, with this question I just wanted to introduce the explanation; I think I've just confused you. With my previous explanation I hope it is now clear why you were publishing obsolete entries. I think you have also understood that it was due to bug 101237. I have now included the workaround in the bug report for the record.

> Please let me know if we can help further; meanwhile we await the version of
> bdii that resolves bug 101237.

Yes, you are a good example of a site that is not publishing 999999999. How have you configured your batch system in order to publish your MaxCPUTime and MaxWallTime? I think that if it works for you, it means the info providers are working.

Thanks!
Maria
=========================================================================
Date:         Wed, 29 May 2013 17:58:56 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Maarten Litmaath <[log in to unmask]>
Subject:      Re: voms-proxy-init error with  SL6 EMI-3 UI
Comments: To: Alessandro Paolini <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

Ciao Alessandro,

> by chance do you already know the voms version installed at BNL?

No, but I have involved their team as well as Andrea Ceccanti.
=========================================================================
Date:         Thu, 30 May 2013 12:22:45 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Gonçalo Borges <[log in to unmask]>
Organization: LIP
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
Comments: To: Maria Alandes Pradillo <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms030703000804080103090002"
Message-ID:  <[log in to unmask]>


Hi Maria...

For GE sites, those values are based on the following values defined
for the relevant queues:

# qconf -sq opsgrid | egrep "(h_rt|h_cpu)"
h_rt                  4:00:00
h_cpu                 2:00:00

I see these values properly captured by the GE IP (transformed into seconds):

 # su ldap --shell /bin/bash --session-command
'/usr/libexec/glite-info-dynamic-sge --info -c /etc/lrms/scheduler.conf'
(...)
dn:
GLUE2ManagerId=ce03.ncg.ingrid.pt_ComputingElement_Manager,GLUE2ServiceID=ce03.ncg.ingrid.pt_ComputingElement,GLUE2GroupID=resource,o=glue
GLUE2ManagerProductVersion: 6.2u5

dn:
GLUE2ShareID=opsgrid_ops_ce03.ncg.ingrid.pt_ComputingElement,GLUE2ServiceID=ce03.ncg.ingrid.pt_ComputingElement,GLUE2GroupID=resource,o=glue
GLUE2ComputingShareMaxRunningJobs: 1284
GLUE2ComputingShareMaxCPUTime: 7200
GLUE2ComputingShareMaxWallTime: 14400
GLUE2ComputingShareServingState: Production
(...)

However, even at my own sites, I saw I was publishing the 999999999
default. I've applied the workaround mentioned in the bug, and a direct
query to the site BDII does not return any 999999999 result for those
variables anymore:

# ldapsearch -LLL -x -h sbdii01.ncg.ingrid.pt -p 2170 -b o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))'
#

Cheers
Goncalo
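The h_rt/h_cpu-to-seconds transformation mentioned above can be reproduced with a small helper. This is a sketch, not the actual GE information provider code:

```shell
# Convert an H:MM:SS limit (as printed by qconf) into seconds,
# mirroring the values the GE information provider publishes.
to_seconds() {
  h=${1%%:*}; rest=${1#*:}
  m=${rest%%:*}; s=${rest#*:}
  echo $(( h * 3600 + m * 60 + s ))
}
to_seconds 2:00:00   # h_cpu -> 7200
to_seconds 4:00:00   # h_rt  -> 14400
```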



On 05/28/2013 04:08 PM, Maria Alandes Pradillo wrote:
> Dear all,
>
> I'm trying to understand why there are so many WLCG sites publishing the default values of GLUE2ComputingShareMaxCPUTime and (in fewer cases) GLUE2ComputingShareMaxWallTime.
>
> In the case of CREAM, these two variables are published by default as:
>
> GLUE2ComputingShareMaxCPUTime: 999999999
> GLUE2ComputingShareMaxWallTime: 999999999
>
> Then these variables are modified by the local batch system information provider, which uses batch system configuration variables to populate the BDII:
>
> In the case of Torque:
>
> GLUE2ComputingShareMaxWallTime: resources_default.walltime if defined, else resources_max.walltime
> GLUE2ComputingShareMaxCPUTime: resources_default.cput if defined, else resources_max.cput
>
> In the case of LSF:
>
> GLUE2ComputingShareMaxWallTime: RUNLIMIT
> GLUE2ComputingShareMaxCPUTime: CPULIMIT
>
> I don't know how this is done for other batch systems; I'm happy to learn from the relevant people writing other information providers.
>
> So the question is, are sites defining the relevant variables in their local batch system configuration? If yes, could it be that they are running old versions of the information providers and that is the reason why the default values are published? AFAIK, the EMI 3 versions of info-dynamic-scheduler-pbs and info-dynamic-scheduler-lsf use the mentioned variables (not sure if this is also the case for earlier versions). Could it also be that the 999999999 values are defined on purpose, meaning there is no limit for the CPU time in the corresponding queue?
>
> The MaxCPUTime variable is used by LHCb to calculate the queue length. It is therefore very important that a correct value is defined there. Could some sys admins comment on how they are doing this and whether they publish 999999999 on purpose?
>
> The list of sites that are publishing default values is included below. Thanks very much in advance for the feedback!
>
> Thanks!
> Maria
>
> ldapsearch -LLL -x -h localhost -p 2170 -b GLUE2GroupID=grid,o=glue '(&(objectClass=GLUE2ComputingShare)(GLUE2ComputingShareMaxCPUTime=999999999))' | perl -p00e 's/\r?\n //g' | grep dn: | cut -d"=" -f5 | cut -d"," -f1 | sort | uniq
> BelGrid-UCL
> CA-ALBERTA-WESTGRID-T2
> CA-SCINET-T2
> CA-VICTORIA-WESTGRID-T2
> CERN-PROD
> CYFRONET-LCG2
> DESY-HH
> FZK-LCG2
> GoeGrid
> GR-07-UOI-HEPLAB
> ICM
> IFCA-LCG2
> INFN-LNL-2
> INFN-NAPOLI-ATLAS
> INFN-PISA
> INFN-T1
> LIP-Coimbra
> LIP-Lisbon
> LRZ-LMU
> NCG-INGRID-PT
> praguelcg2
> PSNC
> ru-PNPI
> RU-SPbSU
> SFU-LCG2
> SiGNET
> TR-03-METU
> TR-10-ULAKBIM
> TRIUMF-LCG2
> UA-BITP
> UA-KNU
> UKI-NORTHGRID-LIV-HEP
> UKI-NORTHGRID-SHEF-HEP
> UKI-SOUTHGRID-RALPP
> UNI-FREIBURG
> wuppertalprod
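The Torque rule quoted above (use resources_default if defined, otherwise fall back to resources_max) amounts to the following shell logic. This is an illustrative sketch; the limit values are made-up assumptions, not taken from any real queue:

```shell
# Fallback sketch: prefer resources_default.walltime when set,
# else use resources_max.walltime (values invented for illustration).
resources_default_walltime=""        # not defined on this hypothetical queue
resources_max_walltime="48:00:00"

if [ -n "$resources_default_walltime" ]; then
  max_walltime="$resources_default_walltime"
else
  max_walltime="$resources_max_walltime"
fi
echo "$max_walltime"   # 48:00:00
```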


=========================================================================
Date:         Thu, 30 May 2013 15:14:12 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Fokke Dijkstra <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms060509090305040301090800"
Message-ID:  <[log in to unmask]>


Hi,

I once started out with only wallclock time limits, just because that
was the thing I wanted to have enforced. Our site uses Torque as a
resource manager. The standard CPU time limits also don't work well for
parallel jobs, as these accumulate CPU time faster than walltime.
Because of this I ran into problems with LHCb. First I hacked the
information system provider to publish the wallclock time limits as CPU
time limits as well.
Fortunately, the Torque pcput limits are now also supported by the
information system provider. These limits take the requested cores into
account and therefore also work correctly for parallel jobs.
It would also be better if LHCb checked both wallclock and CPU time
limits, as this would solve their problem as well.

Kind regards,

Fokke

On 29-05-13 12:16, Stephen Burke wrote:
> LHC Computer Grid - Rollout [mailto:[log in to unmask]]
>> On Behalf Of Oxana Smirnova said:
>> The 999999999 figure
>> probably is the suggested default in the specs, which in some cases is
>> not overwritten by the actual values (likely when the actual values are
>> unknown).
> There are two possible reasons to get the "all nines" value: either there's a bug in the info provider, or the batch system really has no limit set, in which case that value effectively means infinite. What we're trying to establish is whether some sites intentionally don't set a time limit, and if so what the rationale is.
>
> Stephen
>


--
Fokke Dijkstra <[log in to unmask]>
High Performance Computing & Visualisation
Donald Smits Center for Information Technology, University of Groningen
Postbus 11044, 9700 CA  Groningen, The Netherlands
+31-50-363 9243



=========================================================================
Date:         Thu, 30 May 2013 15:22:18 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Fokke Dijkstra <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: multipart/signed; protocol="application/pkcs7-signature";
              micalg=sha1; boundary="------------ms000708060504060305070001"
Message-ID:  <[log in to unmask]>


Hi,

On 29-05-13 12:13, Stephen Burke wrote:
> LHC Computer Grid - Rollout [mailto:[log in to unmask]] said:
>> And 2880 * 60 is 172800, so they've switched it to seconds - is that w=
eird and
>> pointless? I dunno.
> Yes, it has switched to seconds in GLUE 2 - whether weird and pointless=
 I'm not sure. In GLUE 1 it's inconsistent, some things are in seconds an=
d some in minutes, so in GLUE 2 it's standardised on seconds for everythi=
ng, but the first CREAM release still had minutes.
How can this ever work for the end user? I've just got a question from
someone who is trying to request a queue for his job with enough
wallclock time available. He ran into the issue that RUG-CIT and
SARA-MATRIX publish wallclock time in hours, whereas NIKHEF-ELPROD
publishes wallclock time in minutes. I'm quite sure that in the recent
past we published minutes as well at RUG-CIT. Adding seconds into the
mix makes the problem even worse.
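The unit mix described here can be made concrete with a small sketch, assuming the per-site units reported in this thread (a client has no reliable way to discover them, which is exactly the problem):

```python
# Hypothetical sketch: normalising published MaxWallClockTime values to
# seconds, given the unit each site is believed to publish in. The site
# names and units below are taken from this discussion, not from any API.

UNIT_FACTORS = {"seconds": 1, "minutes": 60, "hours": 3600}

# Assumed per-site units, as reported above.
SITE_UNITS = {
    "RUG-CIT": "hours",
    "SARA-MATRIX": "hours",
    "NIKHEF-ELPROD": "minutes",
}

def wallclock_seconds(site, published_value):
    """Convert a site's published wallclock limit to seconds."""
    unit = SITE_UNITS.get(site, "minutes")  # GLUE 1 default is minutes
    return published_value * UNIT_FACTORS[unit]
```

With this table, a limit published as 2880 minutes and one published as 48 hours both come out as 172800 seconds; the sketch only works because the units are hard-coded, which is what a real client cannot do.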

Kind regards,

Fokke

-- 
Fokke Dijkstra <[log in to unmask]>
High Performance Computing & Visualisation
Donald Smits Center for Information Technology, University of Groningen
Postbus 11044, 9700 CA  Groningen, The Netherlands
+31-50-363 9243



=========================================================================
Date:         Thu, 30 May 2013 13:26:56 +0000
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Stephen Burke <[log in to unmask]>
Subject:      Re: Publication of MaxCPUTime and MaxWallTime glue attributes
In-Reply-To:  <[log in to unmask]>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Message-ID:  <[log in to unmask]>

LHC Computer Grid - Rollout [mailto:[log in to unmask]]
> On Behalf Of Fokke Dijkstra said:
> How can this ever work for the end user? I've just got a question from
> someone who is trying to request a queue for his job with enough
> wallclock time available. He ran into the issue that RUG-CIT and
> SARA-MATRIX publish wallclock time in hours, where NIKHEF-ELPROD
> publishes wallclock time in minutes. I'm quite sure that in the recent
> past we published minutes as well at RUG-CIT.  Adding seconds into the
> mix makes the problem even worse.

The change is only for GLUE 2; GLUE 1 stays in minutes. The latest WMS does
support matching against GLUE 2 attributes, but I don't think anyone is
using it yet, so hopefully people will have upgraded their CREAMs by the
time it becomes relevant. Anyway, publishing hours is definitely wrong - do
they have some kind of customised info provider?
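The minutes-vs-seconds mismatch described here can be detected heuristically; a sketch, assuming both the GLUE 1 and GLUE 2 values for the same queue are at hand (the function name is illustrative, not a real tool):

```python
# Illustrative sketch (not an actual WMS/CREAM utility): cross-check a
# queue's GLUE 1 limit (minutes) against its GLUE 2 limit, which should
# be in seconds. A GLUE 2 value equal to the GLUE 1 value suggests the
# CE still publishes minutes there, as the first CREAM release did.

def glue2_unit_guess(glue1_minutes, glue2_value):
    """Guess which unit a GLUE 2 MaxWallTime value is published in."""
    if glue2_value == glue1_minutes * 60:
        return "seconds"   # correct GLUE 2 publishing
    if glue2_value == glue1_minutes:
        return "minutes"   # old CREAM behaviour described above
    return "unknown"       # e.g. a site publishing hours
```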

Stephen

-- 
Scanned by iCritical.
=========================================================================
Date:         Fri, 31 May 2013 15:05:47 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Winnie Lacesso <[log in to unmask]>
Subject:      Checklist for LCG site changing server subnet number
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Message-ID:  <[log in to unmask]>

Happy Friday!

For some reason or other (a good one) we may wish to change our
modest-sized LCG site from one subnet to another (still considering).

Has any LCG site done this & has a useful documented checklist? Things
to watch for, gotchas, etc.? We tend to embed IP addrs in iptables,
hosts.allow & maybe a few other places - that we know of. One question
is whether LCG middleware embeds IP addresses in places we'd rather
know about ahead of time.
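A minimal sketch of the kind of scan being asked about, using only the Python standard library (the example paths are illustrative; adjust to your site):

```python
import ipaddress
import re
from pathlib import Path

# Find dotted-quad candidates, then let ipaddress weed out non-addresses
# such as "300.1.2.3" that merely look like one.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def find_ip_literals(text):
    """Return all valid IPv4 literals appearing in a piece of text."""
    found = []
    for candidate in IPV4_RE.findall(text):
        try:
            ipaddress.ip_address(candidate)
            found.append(candidate)
        except ValueError:
            pass  # not a real address
    return found

def scan_files(paths):
    """Map each file to the IP literals embedded in it."""
    hits = {}
    for path in paths:
        text = Path(path).read_text(errors="replace")
        ips = find_ip_literals(text)
        if ips:
            hits[path] = ips
    return hits

# Example places worth scanning (illustrative, per the message above):
# scan_files(["/etc/hosts.allow", "/etc/sysconfig/iptables"])
```

This only catches literal addresses in plain-text files, of course; addresses hidden in databases or binary state would need a different approach.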

Very grateful for any enlightenment or pointers. 



Winnie Lacesso / 55% HPC Storage Admin, 20% Particle Physics, 25% SysOps
HH Wills Physics Laboratory, Tyndall Avenue, Bristol, BS8 1TL, UK
University of Bristol
=========================================================================
Date:         Fri, 31 May 2013 16:19:30 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jean-Michel Barbet <[log in to unmask]>
Subject:      Re: Checklist for LCG site changing server subnet number
Comments: cc: Winnie Lacesso <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

On 05/31/2013 04:05 PM, Winnie Lacesso wrote:
> Happy Friday!
>
> For some reason or other (a good one) we may wish to change our
> modest-sized LCG site from one subnet to another (still considering).

Hello Winnie,

We did it in January 2012. I have a log of the operations, but it is
in French and probably too specific. Most of the gotchas were about how
to have the Quattor-managed machines (the worker nodes) get their new
IP without starting to reconfigure immediately, while still being able
to do so on startup on the new network segment. We had no big problems
changing the service nodes (BDII, CREAM-CE, VOBOX, DPM-SE) from the old
IP to the new one.

Given a bit of time, and if you believe it useful, I could try to
extract general information from our log. Are you using Quattor?

JM


-- 
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: [log in to unmask]
------------------------------------------------------------------------
=========================================================================
Date:         Fri, 31 May 2013 15:35:11 +0100
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Winnie Lacesso <[log in to unmask]>
Subject:      Re: Checklist for LCG site changing server subnet number
Comments: To: Jean-Michel Barbet <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Message-ID:  <[log in to unmask]>

Bonjour!

No, we don't have Quattor. Your statement that you had no big problems
with the service nodes brings relief (you didn't find that LCG middleware
had hidden an IP address somewhere).

If you could extract some general information when you have a timeslice,
that would benefit myself & anyone else who might go through something
similar in future. But I don't want to take up much of your valuable time.
=========================================================================
Date:         Fri, 31 May 2013 17:41:25 +0200
Reply-To:     LHC Computer Grid - Rollout <[log in to unmask]>
Sender:       LHC Computer Grid - Rollout <[log in to unmask]>
From:         Jean-Michel Barbet <[log in to unmask]>
Subject:      Re: Checklist for LCG site changing server subnet number
Comments: cc: Winnie Lacesso <[log in to unmask]>
In-Reply-To:  <[log in to unmask]>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID:  <[log in to unmask]>

On 05/31/2013 04:35 PM, Winnie Lacesso wrote:

> If you would extract some general information in a timeslice, that would
> benefit myself & anyone else who might in future go thru similar.

Here you are, some details may be missing but basically that's it.

---------------------------------------------------------------------------

Involved services:

     * xrootd manager and xrootd servers (~20)
     * VMware virtual machines: BDII, VOBOX, CREAM (2), ARGUS server,
       Torque server, DPM head + disk node
     * Worker nodes (~40)

Preparation

     A week before:
     * Notify the VOs
     * Declare a GOC-DB downtime from Day-1 17:00 to Day-2 17:00 (24h)
     * Notify the NGI and all partners

     The day before Day-1, around 17:00:
     * Drain the CREAM CEs (ALICE jobs run for about 24h)
     * Prepare the Quattor profiles
     * Prepare firewall filters for the new network segment
     * Create a new virtual switch for the new network on the VMware servers

Day-1 at 17:00

     Stop services
     * Stop the xrootd services
     * Change the IPs of the VMware virtual machines that host all the
       services, and stop them
     * Reconfigure the virtual network so that the VMs are connected to
       the right network segment
     * Modify the DNS (the modification will spread during the night)

Day-2 at 7:30

     Start services
     * Start the xrootd servers, change their IPs, reboot, and start the
       xrootd services (servers and manager)
     * Start the service nodes: BDII, Torque server, CREAM, VOBOX, ARGUS
     * Modify the IPs of the worker nodes and restart them (see comment
       below)
     Checks (many with the help of the NGI's Nagios box)

Post-install

     * We had to give the new addresses of the Torque server and CREAMs
       to the administrator of the MonBox, since it filters clients
       on the basis of their IPs
     * Modify our local Nagios to take the new IPs into account

Comment: here I do not give details on how to deal with the IP change
for the Quattor-managed worker nodes.

Comment: at the same time, we changed the NFS server used by the worker
nodes.

---------------------------------------------------------------------------
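The "Modify the DNS" and "Checks" steps above can be verified with a short sketch, assuming the new addresses are known in advance (the hostnames below are hypothetical placeholders, and the resolver is injectable so the logic can be exercised without a live DNS):

```python
import socket

def check_new_ips(expected, resolve=socket.gethostbyname):
    """Compare each host's DNS A record with its expected new address.

    `expected` maps hostname -> new IP. Returns a dict of
    host -> (expected, actual) containing only the mismatches, so an
    empty result means the DNS change has fully propagated.
    """
    mismatches = {}
    for host, new_ip in expected.items():
        try:
            actual = resolve(host)
        except OSError as exc:
            actual = f"lookup failed: {exc}"
        if actual != new_ip:
            mismatches[host] = (new_ip, actual)
    return mismatches

# Hypothetical usage once the change has spread overnight:
# check_new_ips({"bdii.example.org": "192.0.2.10",
#                "cream01.example.org": "192.0.2.11"})
```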

So, you see, it was pretty straightforward for us...

JM


-- 
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: [log in to unmask]
------------------------------------------------------------------------