Maarten Litmaath wrote:
> Dear RLS Team,
> where are those requests coming from? Perhaps a single bad application
> is responsible for the whole mess!
I am not aware of any _new_ application that would overload RLS. ATLAS is in the final stages of production, and RLS is being used heavily by the jobs (via LCG's own tools) and by the data management/movement tool (Don Quijote). We can't afford to stop the production, especially now. If the database managers can point to a single user causing this trouble (I assume there is such info in the RLS logs), we can check what's going on. Also, it would be nice if the ATLAS production and computing coordinators were informed, too.
Oxana
>
> ________________________________
>
> From: Dirk Duellmann
> Sent: Thu 5/5/2005 5:04 PM
> To: users-rls (users of the CERN rls)
> Cc: James Casey; Jamie Shiers; Guido Negri; Miguel Anjo
> Subject: Re: RLS information
>
>
>
> Dear All,
>
> we have already restarted the ATLAS RLS several times, but until either
> the number of requests is decreased on the ATLAS side or the RLS
> application is improved, a stable service cannot be achieved. Please let
> us know if ATLAS could limit the number of RLS requests so that at least
> some useful work can be done and other users of the RLS database are not
> affected.
>
> Cheers, Dirk
>
> On 4 May 2005, at 20:06, Miguel Anjo wrote:
>
>
>>Dear users,
>>
>>After spending more than 15 hours on the problem (since 4am this
>>morning), the database team that supports RLS has made the application
>>as stable as possible on the database and application server side, for
>>which we are responsible, to cope with the unexpectedly increased
>>workload on the system.
>>
>>This application is performing several 'SELECT COUNT(*)' queries over
>>very large tables, and queries using the 'LIKE' keyword in the WHERE
>>clause, which cause the database to read tables of about 600MB every
>>time. The machine hosting the database has run out of memory and the
>>queries have slowed down.
>>
>>Another problem with the RLS application is the lack of support for
>>bulk inserts, which forces a commit and a physical write to disk after
>>every single insert, causing many db log sync wait events that cannot
>>be overcome.
>>
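To illustrate the commit-per-insert pattern described above, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in; it is not the RLS code or its Oracle backend, and the table name `mapping`, its columns, and the example rows are all hypothetical. It only contrasts the two shapes of client code: one commit per row versus one commit for a whole batch.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical two-column mapping table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mapping (lfn TEXT, pfn TEXT)")
    return conn


def insert_commit_each(conn, rows):
    # One transaction per row: on a disk-backed database every commit
    # has to be made durable before the next insert can proceed.
    for row in rows:
        conn.execute("INSERT INTO mapping (lfn, pfn) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    # All rows in a single transaction, committed once at the end.
    conn.executemany("INSERT INTO mapping (lfn, pfn) VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical example data, not real ATLAS file names.
rows = [(f"lfn_{i}", f"pfn_{i}") for i in range(1000)]
```

Both functions store the same rows. The difference is only the number of commits (1000 versus 1), which is what drives the log sync waits on a real, disk-backed server. An in-memory database like the one above will not show the timing difference.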
>>As a consequence, the load generated by the enormous number of calls to
>>the RLS application makes the application server unavailable for more
>>connections (as it is waiting for the database).
>>
>>The only way at the moment to resolve the problem would be to fix the
>>'bugs' in the RLS application, which is not in our hands (and anyway
>>RLS is no longer being developed).
>>
>>Possible workarounds to reduce the load include, obviously, decreasing
>>the number of calls to RLS (as was requested this morning) and escaping
>>the '_' character in filenames as '\_' so that the query uses
>>'WHERE filename =' instead of 'WHERE filename LIKE' and can be
>>performed using an index.
>>
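The reason '_' matters is that it is a single-character wildcard in SQL LIKE patterns, so an unescaped filename containing it is treated as a pattern. A minimal sketch of the escaping workaround follows, again using Python's sqlite3 only as a stand-in for the real database. The helper names `escape_underscores` and `like_matches` and the sample filenames are hypothetical, and whether RLS then switches to an '=' comparison is taken from the message above, not demonstrated here.

```python
import sqlite3


def escape_underscores(filename: str) -> str:
    # Prefix every '_' with a backslash so it is taken literally.
    return filename.replace("_", "\\_")


_conn = sqlite3.connect(":memory:")


def like_matches(value: str, pattern: str) -> bool:
    # Evaluate "value LIKE pattern" with backslash as the escape character.
    row = _conn.execute("SELECT ? LIKE ? ESCAPE '\\'", (value, pattern)).fetchone()
    return bool(row[0])


# Unescaped: '_' matches any single character, so an unrelated name matches.
unescaped_hit = like_matches("fileXa", "file_a")

# Escaped: only the literal underscore matches.
escaped_hit = like_matches("fileXa", escape_underscores("file_a"))
exact_hit = like_matches("file_a", escape_underscores("file_a"))
```

With the unescaped name, "fileXa" matches the pattern "file_a". After escaping, only the exact name "file_a" matches, which is the case where an equality lookup on an index is sufficient.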
>>Cheers,
>> Oracle support team
>>