On Thu, 14 Oct 2010, Kashif Mohammad wrote:
> One more thing to add: I am using the latest CREAM version and the new blparser.
In that case, in the worst case, after:
# about 2 months of purge interval.
purge_interval=2500000
each job should reach a terminal status.
If this is not the case, there is a problem.
I will contact you off-list to debug this issue.
Cheers, Massimo
>
> Regards
> Kashif
>
> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Kashif Mohammad
> Sent: 14 October 2010 15:44
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] Cleaning Creamdb database
>
> Hi Massimo
>
>>> - purge all jobs in status x for more than y ?
>
> Yes, I also want something like this. I can generate a list of CREAM job IDs by querying the DB directly, like
> echo -e "USE creamdb; \n SELECT jobId FROM job_status WHERE time_stamp < '2010-09-30 09:31:22';" | mysql -p >> jobtopurge
>
> but I cannot supply the file to /opt/glite/sbin/JobDBAdminPurger.sh. It would be nice if this script had an option to take a list of job IDs from a file.
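
An illustrative sketch, not part of the original exchange: the query quoted above selects every jobId older than the cutoff, whatever its state. A narrower list can be built by also filtering on the status type. The function name stale_job_query is hypothetical; the table and column names (job_status, type, time_stamp, jobId) and the value type=4 are taken from the queries quoted in this thread. The sketch does not check whether a job later moved on to another state, so the resulting list should be reviewed before anything is purged.

```shell
#!/bin/sh
# Hypothetical helper: print a query listing job IDs that have a status row
# of the given type older than the given cutoff.
#   $1  numeric status type as stored in job_status.type
#   $2  cutoff timestamp, 'YYYY-MM-DD HH:MM:SS'
stale_job_query() {
    printf "SELECT DISTINCT jobId FROM job_status WHERE type=%d AND time_stamp < '%s';\n" "$1" "$2"
}

# Example: print the query for type 4 with the cutoff used in the message above.
stale_job_query 4 '2010-09-30 09:31:22'
```

The output could be piped to the client, for example `stale_job_query 4 '2010-09-30 09:31:22' | mysql -N -p creamdb >> jobtopurge`, where `-N` suppresses the column-name header so that the file contains only job IDs.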
>
> Regards
> Kashif
>
> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Massimo Sgaravatto - INFN Padova
> Sent: 14 October 2010 15:05
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] Cleaning Creamdb database
>
> On Thu, 14 Oct 2010, Daniela Bauer wrote:
>
>> Hi Massimo,
>>
>> but I can't purge all jobs in 'RUNNING/REALLY-RUNNING' state as this
>> would purge the ones that are actually still running as well, no ? I
>> only want to kill the 1000+ 'running' ones that are older than a week
>> and as far as I can tell the script only does that if I give it the
>> jobids.
>
>
> So basically you would like to specify something like:
>
> - purge all jobs in status x for more than y ?
>
>
> Cheers, Massimo
>
>> Apparently I am not using the latest configuration, as far as I can
>> tell I reran yaim last on Aug18th.
>>
>> Cheers,
>>
>> Daniela
>>
>>
>> On 14 October 2010 13:53, Massimo Sgaravatto - INFN Padova
>> <[log in to unmask]> wrote:
>>> On Thu, 14 Oct 2010, Daniela Bauer wrote:
>>>
>>>> Apparently I have about 1800 of these jobs lying around - do I really
>>>> have to extract all the IDs and give them to a script ???
>>>
>>> Hi Daniela
>>>
>>> As documented on that page, you can also specify a list of states as input
>>> instead of the job IDs.
>>> Which other "input method" would you like to be supported?
>>>
>>>
>>>
>>>> (And yes I
>>>> am running the latest and greatest version of cream).
>>>
>>> I didn't ask whether you are running the latest version of CREAM, but whether
>>> you configured it using the new blparser or the old one.
>>> See:
>>> http://grid.pd.infn.it/cream/field.php?n=Main.CREAMAndBlparserConfiguration
>>>
>>> If you have an entry such as:
>>>
>>> job_registry=/opt/glite/var/blah/user_blah_job_registry.bjr
>>>
>>> in /opt/glite/etc/blah.config you are using the new one.
>>> Otherwise you are using the old one
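
The check described above can be sketched as a small shell function. This is only an illustration: the function name uses_new_blparser is hypothetical, while the file path and the job_registry key are the ones quoted in the message. It assumes that an active entry is not commented out and sits at the start of a line (leading whitespace allowed).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given blah.config has an active
# (uncommented) job_registry= entry.
uses_new_blparser() {
    grep -Eq '^[[:space:]]*job_registry=' "$1"
}

conf=/opt/glite/etc/blah.config   # path quoted in the message above
if [ -r "$conf" ]; then
    if uses_new_blparser "$conf"; then
        echo "new blparser (job_registry entry found)"
    else
        echo "old blparser (no job_registry entry)"
    fi
fi
```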
>>>
>>>
>>>
>>> Cheers, Massimo
>>>
>>>> I could just
>>>> drain the CE and recreate the database, that might be faster ;-)
>>>>
>>>> Daniela
>>>>
>>>> On 14 October 2010 13:31, Massimo Sgaravatto - INFN Padova
>>>> <[log in to unmask]> wrote:
>>>>>
>>>>> On Thu, 14 Oct 2010, Kashif Mohammad wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Sometimes we see exceptionally high load on our CREAM server (between
>>>>>> 30 and 40). I checked
>>>>>> /opt/glite/bin/glite_cream_load_monitor --show
>>>>>> and it is showing "Detected value for Number of active jobs: 646"
>>>>>> although
>>>>>> currently only 108 jobs are either running or queuing at our cream ce.
>>>>>>
>>>>>> I also checked creamdb and it is showing some very old entries
>>>>>>
>>>>>> mysql> SELECT type, exitCode, time_stamp, jobId FROM job_status WHERE
>>>>>> type=4 ORDER BY time_stamp ASC LIMIT 10;
>>>>>> +------+----------+---------------------+----------------+
>>>>>> | type | exitCode | time_stamp          | jobId          |
>>>>>> +------+----------+---------------------+----------------+
>>>>>> |    4 |     NULL | 2010-08-21 19:04:04 | CREAM794724118 |
>>>>>> |    4 |     NULL | 2010-08-21 19:49:15 | CREAM929639932 |
>>>>>> |    4 |     NULL | 2010-08-21 20:00:51 | CREAM617017956 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:19 | CREAM840522948 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:27 | CREAM380921149 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:28 | CREAM975255786 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:28 | CREAM016815277 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:35 | CREAM681214827 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:38 | CREAM724243214 |
>>>>>> |    4 |     NULL | 2010-08-21 20:54:39 | CREAM222735775 |
>>>>>> +------+----------+---------------------+----------------+
>>>>>> 10 rows in set (0.03 sec)
>>>>>>
>>>>>>
>>>>>> 2010-08-21 is when I updated the CREAM CE and had some problems, so
>>>>>> these jobs probably got stuck in some weird state, but in creamdb they
>>>>>> are still in the RUNNING and REALLY-RUNNING states.
>>>>>> My question is how to clean up these entries safely. Can I add something
>>>>>> like "RUNNING 15 DAYS" to JOB_PURGE_POLICY?
>>>>>
>>>>>
>>>>> No, because automatic purging works only for jobs in terminal status
>>>>>
>>>>> To purge jobs in the other states, see:
>>>>>
>>>>>
>>>>> http://grid.pd.infn.it/cream/field.php?n=Main.HowToPurgeJobsFromTheCREAMDB
>>>>>
>>>>>
>>>>> PS: are you using the new BLAH blparser? In that case it should be quite
>>>>> unlikely to have jobs stuck *forever* in a non-terminal status.
>>>>>
>>>>>
>>>>> Cheers, Massimo
>>>>>
>>>>>>
>>>>>> Regards
>>>>>> Kashif
>>>>>>
>>>>>
>>>>> \|||/
>>>>> -----------0oo----( o o )----oo0-------------------
>>>>> (_)
>>>>> INFN Sezione di Padova
>>>>> Via Marzolo, 8
>>>>> 35131 Padova - Italy E-mail: massimo.sgaravatto [at] pd.infn.it
>>>>> Tel: ++39 0498275908 Skype: massimo.sgaravatto
>>>>> Fax: ++39 0498275952
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -----------------------------------------------------------
>>>> [log in to unmask]
>>>> HEP Group/Physics Dep
>>>> Imperial College
>>>> Tel: +44-(0)20-75947810
>>>> http://www.hep.ph.ic.ac.uk/~dbauer/
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>