Hi Laurence,

Two comments:
1. Windows XP SP2 was no simple security patch, and was clearly marked 
as such. At most larger sites, the service pack has therefore not been 
applied without careful testing.
2. Fedora is some sort of Redhat beta, and probably not what I would use 
for critical services. Furthermore, I wouldn't be surprised if Redhat 
Enterprise Linux updates get more testing than updates for SLC made by CERN.

Fokke

Laurence Field wrote:
> Hi Fokke,
>
> It can happen with any distribution. Do you remember what fun we had 
> with Windows XP Service Pack 2, which even broke Microsoft 
> products? Fedora updates would break random things daily, and I am 
> very glad that no sites are using Fedora!
>
> Laurence
>
> Fokke Dijkstra wrote:
>> Actually, this did not happen (and should not have happened either) 
>> with Redhat, Scientific Linux and CentOS. The problem only appeared 
>> in Scientific Linux *CERN*, and should not have happened there 
>> either. If you want to supply a Redhat Enterprise Linux-compatible 
>> distribution, you should adhere to Redhat's policy of not doing any 
>> major version changes within a given version of the OS. It seems SLC 
>> does not do this, and the consequences were seen yesterday.
>> This raises the question of whether SLC is the best OS for running 
>> most of the EGEE infrastructure on.
>>
>> Fokke Dijkstra
>>
>>
>> Laurence Field wrote:
>>> This could have happened with any of the distributions below. 
>>> Redhat, Suse, Debian, Ubuntu, etc. will all issue updates, and it is 
>>> understandable that a site will want to install security updates as 
>>> soon as possible.  If such an update interacts with other software 
>>> and creates a problem, then that is unfortunate.  This 
>>> problem was spotted by the SFTs and also showed up on the Testbed; 
>>> however, I don't think we can realistically prevent this kind of 
>>> problem.
>>>
>>> Laurence
>>>
>>> ldapsearch -x -h lcg-bdii -p 2170 -b o=grid | grep GlueHostOperatingSystemName | sort -u
>>> GlueHostOperatingSystemName: CentOS
>>> GlueHostOperatingSystemName: Debian
>>> GlueHostOperatingSystemName: linux-rhel-3
>>> GlueHostOperatingSystemName: linux-rocks-3.3
>>> GlueHostOperatingSystemName: linux-rocks-4.1
>>> GlueHostOperatingSystemName: linux-sl-fermi-3.0
>>> GlueHostOperatingSystemName: Redhat
>>> GlueHostOperatingSystemName: RedHatEnterpriseAS
>>> GlueHostOperatingSystemName: Scientific Linux
>>> GlueHostOperatingSystemName: Scientific Linux CERN
>>> GlueHostOperatingSystemName: Scientific Linux SL
>>> GlueHostOperatingSystemName: ScientificSL
>>> GlueHostOperatingSystemName: SUSE LINUX
>>> GlueHostOperatingSystemName: Ubuntu
>>>
>>>
>>>
>>> Kalman Kovari wrote:
>>>> Hi,
>>>>
>>>>  
>>>>> don't forget to be real, people: the problem was caused by an 
>>>>> interaction between a *bug fix* in ssh and an *unfixed, dormant 
>>>>> bug* in YAIM.  These kinds of situations are rather difficult to 
>>>>> detect, until they are triggered.
>>>>>     
>>>>
>>>> Yep. That's why one would need a
>>>> "yaim-installed-slc3-based-gLite-running-small-test-gridsite" to test
>>>> the release candidates before approving them for the grid. Is that so
>>>> unreal? Plus 5 machines to a testbed?
>>>>
>>>> K
>>>>
>>>>  
>>>>>     J "or did you forget the apostrophe in the comment story" T
>>>>>
>>>>> Kalman Kovari wrote:
>>>>>  
>>>>>> Hi Nicholas,
>>>>>>
>>>>>>    
>>>>>>> The update was an OS update, not a middleware update, so it is out of
>>>>>>> the control of EGEE and WLCG.  If gLite ran on Windows, would we
>>>>>>> expect Microsoft to give us (EGEE grid) an individual warning of a
>>>>>>> security patch?
>>>>>>>         
>>>>>> Would we be the 'biggest consumer' of Microsoft? In that case, I would
>>>>>> expect them to consider our needs...
>>>>>>
>>>>>> If we want to avoid another issue like this, the choices in the long
>>>>>> run are either to set up our own (gLite or EGEE based) committee to
>>>>>> control the repository updates (by setting up our own repo, or by
>>>>>> advising sysadmins to upgrade only once the committee has approved the
>>>>>> new software), OR to convince those responsible for the SLC3 releases
>>>>>> to RESPECT the needs of our services, and to trust them. The first
>>>>>> option would be a lot of work and would delay security updates
>>>>>> considerably. In the latter case their testing team would have a bit
>>>>>> more work (another testing environment, maybe), and we could even
>>>>>> trust the auto-updates.
>>>>>>
>>>>>> Best Regards,
>>>>>>  Kalman Kovari
>>>>>>       
>>
>>


-- 
Fokke Dijkstra
High Performance Computing & Visualisation
RC, Informatie- & CommunicatieTechnologie
Postbus 11044
9700 CA  Groningen
+31-50-363 9243