Hi
I would not be surprised if Rod was right (although I am not sure what a
'punter' is). I've seen a few ATLAS users write their own personal
scripts to massage the RLS because Donkey Shot (sorry I can't spell in
Spanish) isn't working correctly for them. Since there is also the very
confusing difference between various sorts of GUIDs (whose idea was
that?) and this is not explained very well in the ATLAS documentation,
ATLAS private script writer people understandably tend to leave lots of
dangling interesting stuff in the catalog. It's usually easy enough to
write a silly program that results in a DoS, but the above factors
actually encourage people to write such programs.
JT
Oxana Smirnova wrote:
> Maarten Litmaath wrote:
>
>> Dear RLS Team,
>> where are those requests coming from? Perhaps a single bad application
>> is responsible for the whole mess!
>
>
> I am not aware of any _new_ application that would overload RLS. ATLAS
> is in the final stages of production, and RLS is being used heavily by
> the jobs (via LCG's own tools) and by the data management/movement tool
> (Don Quijote). We can't afford to stop the production, especially now.
> If the database managers can point out a single user (I assume there is
> such info in RLS logs) causing this trouble, we can check what's going
> on. Also, it would be nice if ATLAS production and computing
> coordinators are informed, too.
>
> Oxana
>
>>
>> ________________________________
>>
>> From: Dirk Duellmann
>> Sent: Thu 5/5/2005 5:04 PM
>> To: users-rls (users of the CERN rls)
>> Cc: James Casey; Jamie Shiers; Guido Negri; Miguel Anjo
>> Subject: Re: RLS information
>>
>>
>>
>> Dear All,
>>
>> we have restarted the ATLAS RLS several times already, but until either
>> the number of requests is decreased on the ATLAS side or the RLS
>> application is improved, a stable service cannot be achieved. Please let
>> us know if ATLAS could limit the number of RLS requests so that at least
>> some useful work can be done and other users of the RLS database are not
>> affected.
>>
>> Cheers, Dirk
>>
>> On 4 May 2005, at 20:06, Miguel Anjo wrote:
>>
>>
>>> Dear users,
>>>
>>> After spending more than 15 hours on the problem (since 4am this
>>> morning), the database team that supports RLS has made the application
>>> as stable as possible on the database and application server side, for
>>> which we are responsible, so that it can cope with the unexpected
>>> increase in workload on the system.
>>>
>>> This application is performing several 'SELECT COUNT(*)' queries over
>>> very large tables, as well as queries using the 'LIKE' keyword in the
>>> WHERE clause, which cause the database to read tables of about 600MB
>>> every time. The machine hosting the database has run out of memory and
>>> the queries have slowed down.
>>>
>>> Another problem with the RLS application is its lack of support for
>>> bulk inserts, which forces a commit and a physical write to disk after
>>> every single insert, causing many db log sync wait events that cannot
>>> be worked around.
>>>
>>> As a consequence, the load generated by the enormous number of calls to
>>> the RLS application makes the application server unavailable for more
>>> connections (as it is waiting for the database).
>>>
>>> The only way to resolve the problem at the moment would be to fix the
>>> 'bugs' existing in the RLS application, which is not in our hands (and
>>> in any case RLS is no longer being developed).
>>>
>>> Possible workarounds to reduce the load include, obviously, decreasing
>>> the number of calls to RLS (as was requested this morning) and escaping
>>> the '_' character in filenames as '\_' so that the query uses 'WHERE
>>> filename =' instead of 'WHERE filename LIKE' and can be performed using
>>> an index.
>>>
>>> Cheers,
>>> Oracle support team
>>>
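
The quoted message from the Oracle support team says that committing after
every single insert is a major cost. As a rough illustration only, the
sketch below contrasts the two patterns using Python's sqlite3 module as a
stand-in database. The table name `mapping`, its columns, and both helper
functions are hypothetical and are not part of RLS or any LCG tool.

```python
import sqlite3


def make_db():
    # In-memory stand-in; the real RLS backend was an Oracle database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mapping (guid TEXT, filename TEXT)")
    return conn


def insert_per_row(conn, rows):
    """Commit after every insert, as the quoted message describes."""
    commits = 0
    for row in rows:
        conn.execute("INSERT INTO mapping (guid, filename) VALUES (?, ?)", row)
        conn.commit()  # one commit (and one log sync) per row
        commits += 1
    return commits


def insert_batched(conn, rows):
    """Insert all rows in one transaction and commit once."""
    conn.executemany("INSERT INTO mapping (guid, filename) VALUES (?, ?)", rows)
    conn.commit()  # a single commit for the whole batch
    return 1


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM mapping").fetchone()[0]


rows = [(f"guid-{i}", f"file_{i}.root") for i in range(1000)]

slow = make_db()
per_row_commits = insert_per_row(slow, rows)

fast = make_db()
batched_commits = insert_batched(fast, rows)
```

Both databases end up with the same rows; the difference is the number of
commits the database has to make durable.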
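
The escaping workaround in the quoted message relies on the fact that '_'
is a single-character wildcard in SQL 'LIKE' patterns. The sketch below
shows that behaviour, again using Python's sqlite3 module as a stand-in
for the real database. The `files` table and the `escape_underscores`
helper are hypothetical, and the thread does not show how RLS itself
chooses between '=' and 'LIKE'.

```python
import sqlite3


def escape_underscores(filename):
    """Prefix every '_' with a backslash so it is treated literally."""
    return filename.replace("_", "\\_")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (filename TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?)",
    [("run_01.root",), ("runX01.root",)],
)

name = "run_01.root"

# Unescaped: '_' matches any single character, so both rows come back.
unescaped = [
    r[0]
    for r in conn.execute("SELECT filename FROM files WHERE filename LIKE ?", (name,))
]

# Escaped: only the literal underscore matches.
escaped = [
    r[0]
    for r in conn.execute(
        "SELECT filename FROM files WHERE filename LIKE ? ESCAPE '\\'",
        (escape_underscores(name),),
    )
]

# Plain equality, which is what lets a database use an ordinary index.
exact = [
    r[0]
    for r in conn.execute("SELECT filename FROM files WHERE filename = ?", (name,))
]
```

With the underscore escaped, the pattern selects the same single row as
the equality test.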