DIRAC-USERS Archives - DIRAC-USERS@JISCMAIL.AC.UK - January 2016

Subject: Re: Backup proposal - option 3
From: Lydia Heck <[log in to unmask]>
Reply-To: Lydia Heck <[log in to unmask]>
Date: Mon, 4 Jan 2016 18:38:01 +0000
Content-Type: TEXT/PLAIN
Parts/Attachments: TEXT/PLAIN (377 lines)

Hi Jens,

there are a few hold-ups:

(a) some large tars failed because they were simply too big: the test for the
size needs to be refined.

(b) some tars (~15) failed because of timeouts to the server. I have just now
started those again.

(c) the tars do take time, and the more files there are, the more time they take.
So while I just saw ~400 MByte/sec during a transfer, that rate has to be spread
over the idle time spent tar'ing up as well.
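
A minimal sketch of the kind of size guard item (a) calls for, assuming GNU
coreutils; the limit the oversized tars actually hit is not stated in the
thread, so the cap below is purely a placeholder:

  #!/bin/bash
  # Illustrative only: set aside a finished tarball that exceeds an assumed
  # maximum size before it is handed to the transfer step.
  MAX_BYTES=${MAX_BYTES:-322122547200}   # placeholder cap (~300 GiB), not from the thread
  tarball="$1"                           # chunk just produced by the tar step

  size=$(stat -c %s "$tarball")          # GNU stat: size in bytes
  if [ "$size" -gt "$MAX_BYTES" ]; then
      # too big: log it so the chunking step can be re-run with a smaller threshold
      echo "$(date -u +%FT%TZ) OVERSIZE $tarball $size" >> oversize.log
  else
      echo "$(date -u +%FT%TZ) OK $tarball $size" >> transfer-queue.log
  fi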

Yes, I will need to update the document and will have to pass the script to
Jon.

Best wishes,
Lydia


On Mon, 4 Jan 2016, Jensen, Jens (STFC,RAL,SC) wrote:

> Hi Lydia,
>
> Yes, merry new year. Thanks for keeping RAL busy during the break!
>
> Brian and I have been discussing your DiRAC transfers all morning today
> doing mostly approximate in-our-heads calculations on how much data we
> expect (e.g. holiday=10^6 seconds, filesize=250GB, that sort of stuff).
> I seem to remember Brian saying 250MB/s but overall volume is only about
> 120TB, which is about a 50% duty cycle - but still on the scale of
> everything-gone-before, so all right. Brian is doing more precise
> calculations at the moment. I had a look at CERN's FTS aggregator but
> couldn't find the data, which is odd and curious and strange.
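
(For reference, the arithmetic behind the ~50% figure: roughly 120 TB moved
over a ~10^6-second break averages to about 120 MByte/sec, i.e. close to half
of the 250 MByte/sec rate mentioned above.)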
>
> Brian is slightly concerned about moving 250GB over lesser networks than
> the one currently connecting Durham to RAL which had us discussing
> GridFTP restart markers and whether they are supported. But let's keep
> going, and get Jon started moving stuff, too.
>
> Lydia, do we need to update your document now that you have a new script
> and new recipes? Will your script be available for the other sites, or
> should we discuss options with the others in case they want to do their
> own thing?
>
> Cheers
> --jens
>
> On 04/01/2016 17:53, Lydia Heck wrote:
>>
>> Dear all,
>>
>> Happy new year and all that ....
>>
>> Over the Christmas break I have archived 265 tar files and there are 36
>> failures.
>>
>> I know why the transfers failed and the script needs some further
>> modifications.
>>
>> But all in all, I think this has been a success so far.
>>
>> I will get back to the script on Wednesday and I think by then the
>> first project might have been dealt with - subject to clearing up the
>> failures.
>>
>> Lydia
>>
>>
>>
>> On Thu, 24 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>
>>> good - this'll be useful for the next site (ie Leicester, who should be
>>> about ready to go...)
>>>
>>> Cheers
>>> -j
>>>
>>> On 24/12/2015 14:02, Lydia Heck wrote:
>>>> Sorry, I forgot some important information:
>>>>
>>>> As I am doing the tar as root on the file server, all ownership, time
>>>> stamps etc. are fully preserved.
>>>>
>>>> Lydia
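
A sketch of the tar invocations this relies on (GNU tar assumed; archive name
and paths are illustrative, not taken from the actual script). Creating the
archive as root records ownership, permissions and timestamps, and extracting
as root restores them:

  #!/bin/bash
  # Create a chunk as root: owner, group, mode and mtime are stored in the archive.
  tar --create --file=/backup/proj_chunk_0001.tar \
      --numeric-owner \
      --files-from=filelist              # filelist holds the paths for this chunk

  # Restore (also as root) with permissions, ownership and mtimes intact.
  mkdir -p /restore/target
  tar --extract --file=/backup/proj_chunk_0001.tar \
      --preserve-permissions --numeric-owner \
      --directory=/restore/target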
>>>>
>>>>
>>>> On Thu, 24 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>
>>>>> OK; let me know how you're doing.
>>>>>
>>>>> Cheers
>>>>> --jens
>>>>>
>>>>> On 23/12/2015 18:30, Lydia Heck wrote:
>>>>>>
>>>>>> Hi Jens,
>>>>>>
>>>>>> I need to add one more piece of functionality to the script, then I am ready.
>>>>>> That will happen tomorrow. Then I will keep on going ....
>>>>>>
>>>>>> Lydia
>>>>>>
>>>>>>
>>>>>> On Wed, 23 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>
>>>>>>> Hi Lydia,
>>>>>>>
>>>>>>> Great stuff. So you will be moving data over the Christmas break?
>>>>>>> This
>>>>>>> will be good...
>>>>>>>
>>>>>>> ... we also need to get Jon started; whether he wants to run your
>>>>>>> script, too, or do something else. And we need to clear out the old
>>>>>>> data, but I'd ask Brian to look into that once he's back in the new
>>>>>>> year.
>>>>>>>
>>>>>>> Merry and Happy to you too.
>>>>>>>
>>>>>>> Cheers
>>>>>>> -j
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 23/12/2015 15:01, Lydia Heck wrote:
>>>>>>>> Hi Jens and all,
>>>>>>>>
>>>>>>>> I have a script now that tars up directories per DiRAC project into
>>>>>>>> tar files of a specified size. Once a tar file is complete it is
>>>>>>>> archived to RAL. Once the archiving is complete the tar file is deleted
>>>>>>>> and the next set of files is archived.
>>>>>>>>
>>>>>>>> The chunk size at present is 256 GByte, or slightly bigger depending on
>>>>>>>> the size of the files.
>>>>>>>>
>>>>>>>> The transfer of such a file takes ~15 minutes.
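
(As a rough cross-check of that figure: ~256 GByte in ~15 minutes works out to
a little under 300 MByte/sec per transfer, in the same range as the rates
quoted elsewhere in this thread.)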
>>>>>>>>
>>>>>>>> The script needs some polishing, and once I am totally happy I can run
>>>>>>>> it non-interactively. I currently still have an interactive element in
>>>>>>>> the script, as the final debugging and a few other ideas are not fully
>>>>>>>> completed.
>>>>>>>>
>>>>>>>> Merry Christmas and a Happy New Year.
>>>>>>>>
>>>>>>>> Lydia
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>
>>>>>>>>> Hooray for downgrading!
>>>>>>>>>
>>>>>>>>> On 22/12/2015 13:49, Lydia Heck wrote:
>>>>>>>>>>
>>>>>>>>>> Done it. I have down-graded to 3.3.3-x and now the lot works.
>>>>>>>>>>
>>>>>>>>>> Lydia
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>
>>>>>>>>>>> Weird! Which version are you using?
>>>>>>>>>>>
>>>>>>>>>>> We seem to have fts-rest-3.3.3-2 and fts-rest-cli-3.3.3 and
>>>>>>>>>>> fts-rest-cloud-storage-3.3.3 and python-fts-3.3.3 but every
>>>>>>>>>>> other
>>>>>>>>>>> fts
>>>>>>>>>>> package on the server is 3.3.2. (There is both a python-fts
>>>>>>>>>>> and an
>>>>>>>>>>> fts-python - weird).
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>> --jens
>>>>>>>>>>>
>>>>>>>>>>> On 22/12/2015 12:22, Lydia Heck wrote:
>>>>>>>>>>>> Hi Jens,
>>>>>>>>>>>>
>>>>>>>>>>>> I have a script that I could test. However, I now have an issue:
>>>>>>>>>>>> the fts-transfer command does not work anymore, failing with the
>>>>>>>>>>>> error message
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> fts client is connecting using the gSOAP interface. Consider
>>>>>>>>>>>> changing
>>>>>>>>>>>>           your configured fts endpoint port to select the REST
>>>>>>>>>>>> interface
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently rebooting the system, but have you seen
>>>>>>>>>>>> something
>>>>>>>>>>>> similar once before?
>>>>>>>>>>>>
>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>> Lydia
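
For reference, the message above is about which FTS3 interface the configured
endpoint selects. A hedged sketch of a submission against the REST interface
(host names and storage URLs are placeholders, and the 8446-for-REST /
8443-for-SOAP port convention is an assumption, not taken from this thread):

  #!/bin/bash
  # Illustrative only: submit one transfer to an FTS3 server via its REST port.
  FTS_ENDPOINT="https://fts3.example.ac.uk:8446"   # 8443 would select the gSOAP interface

  fts-transfer-submit -s "$FTS_ENDPOINT" \
      gsiftp://source.example.ac.uk/dirac/backup/proj_chunk_0001.tar \
      gsiftp://dest.example.ac.uk/archive/dirac/proj_chunk_0001.tar

In the thread itself the problem went away after the client packages were
downgraded to 3.3.3, as noted in the 22 December messages above.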
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 18 Dec 2015, Jens Jensen wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Right, so the find suggestion at least would do a depth first
>>>>>>>>>>>>> listing of
>>>>>>>>>>>>> files-to-add, and tar I am guessing would also add files depth
>>>>>>>>>>>>> first,
>>>>>>>>>>>>> which I think meets your requirement, or close enough, of
>>>>>>>>>>>>> putting
>>>>>>>>>>>>> related files into the same chunk.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Using find-and-then-tar you could avoid building the following
>>>>>>>>>>>>> archive
>>>>>>>>>>>>> until the current one has been sent off to RAL. You'd just
>>>>>>>>>>>>> need
>>>>>>>>>>>>> space
>>>>>>>>>>>>> for the filelist.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I am thinking is:
>>>>>>>>>>>>> 1. find <folder to be backed up> -newer <timestamp file>
>>>>>>>>>>>>>    | <list size and full filename> > filelist
>>>>>>>>>>>>> 2. Walk through the filelist one line at a time, adding up sizes and
>>>>>>>>>>>>>    filenames until a certain threshold size has been exceeded (say
>>>>>>>>>>>>>    20GB or 100,000 files, whichever comes first) or adding the next
>>>>>>>>>>>>>    file would take us above a higher threshold (say 50GB).
>>>>>>>>>>>>> 3. Once a list has been found, tar it up, compress it, optionally
>>>>>>>>>>>>>    store the contents (list) somewhere, send the tarball to RAL, and
>>>>>>>>>>>>>    then delete it.
>>>>>>>>>>>>> 4. Go back to step 2 until the filelist has been completed.
>>>>>>>>>>>>> 5. Touch the timestamp file.
>>>>>>>>>>>>> 6. Sleep 24 hours (or whatever) and go to step 1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This would meet all our requirements and would be stupidly
>>>>>>>>>>>>> easy
>>>>>>>>>>>>> to do.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers
>>>>>>>>>>>>> --jens
>>>>>>>>>>>>>
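
A minimal bash sketch of steps 1-6 above, assuming GNU find and tar. The
thresholds, paths and the ship_to_ral stub are illustrative placeholders; this
is not the production script discussed elsewhere in the thread:

  #!/bin/bash
  # Illustrative chunk-and-ship loop following the numbered steps above.
  set -u

  SRC=/data/project_to_back_up          # folder to be backed up
  STAMP=/var/backup/last_run.stamp      # timestamp file
  WORK=/var/backup/work                 # scratch space for filelists and tarballs
  SOFT=$((20 * 1024 * 1024 * 1024))     # "say 20GB" soft threshold, in bytes
  HARD=$((50 * 1024 * 1024 * 1024))     # "say 50GB" hard threshold, in bytes
  MAXFILES=100000                       # "or 100,000 files"

  ship_to_ral() {                       # placeholder for the real transfer step (e.g. FTS)
      echo "would transfer: $1"
  }

  flush_chunk() {                       # step 3: tar, compress, keep the list, ship, delete
      [ "$count" -eq 0 ] && return
      chunk=$((chunk + 1))
      tarball="$WORK/chunk_$(printf '%04d' "$chunk").tar.gz"
      tar --create --gzip --file="$tarball" --files-from="$WORK/chunk.list"
      cp "$WORK/chunk.list" "$tarball.contents"
      ship_to_ral "$tarball"
      rm -f "$tarball"
      bytes=0; count=0
      : > "$WORK/chunk.list"
  }

  mkdir -p "$WORK"
  [ -e "$STAMP" ] || touch -d '1970-01-01' "$STAMP"   # first run: pick up everything

  while true; do
      # step 1: list files newer than the timestamp, with size and full name
      find "$SRC" -type f -newer "$STAMP" -printf '%s %p\n' > "$WORK/filelist"

      chunk=0; bytes=0; count=0
      : > "$WORK/chunk.list"

      # steps 2 and 4: walk the filelist, cutting a chunk at the thresholds
      # (filenames containing newlines would need extra care)
      while read -r size name; do
          [ $((bytes + size)) -gt "$HARD" ] && flush_chunk
          printf '%s\n' "$name" >> "$WORK/chunk.list"
          bytes=$((bytes + size)); count=$((count + 1))
          if [ "$bytes" -ge "$SOFT" ] || [ "$count" -ge "$MAXFILES" ]; then
              flush_chunk
          fi
      done < "$WORK/filelist"
      flush_chunk                        # last, partially filled chunk

      touch "$STAMP"                     # step 5
      sleep 86400                        # step 6: wait ~24 hours, then go again
  done

As in the description above, only one filelist and one tarball exist at any
time, so the extra space needed on the source filesystem stays small.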
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 17/12/2015 12:43, Lydia Heck wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jens,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> it took longer than I thought to tidy up the results from the
>>>>>>>>>>>>>> meeting
>>>>>>>>>>>>>> last week (I spent a full day on a spreadsheet :-) )
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However I am now going to look at the transfers again.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked over the presentation you shared with us. And yes,
>>>>>>>>>>>>>> that is
>>>>>>>>>>>>>> the way it should go. There are some provisos:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If I create 3 TB chunks, I need to have space for several of
>>>>>>>>>>>>>> them:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One being transferred, one waiting and one being prepared. This
>>>>>>>>>>>>>> will add 10 TB to the storage that is not available to the users;
>>>>>>>>>>>>>> it can be done, but needs to be factored in.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If there is indeed a failure, then I need to identify where the
>>>>>>>>>>>>>> data are that have been deleted, corrupted or whatever. If I "just"
>>>>>>>>>>>>>> chunk the whole filesystem, that data would be difficult, if not
>>>>>>>>>>>>>> impossible, to find. So I would need to arrange transfers by
>>>>>>>>>>>>>> project, and even then the retrieval might physically not be
>>>>>>>>>>>>>> possible, depending on how many of the chunks I would have to
>>>>>>>>>>>>>> retrieve.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I believe that currently the biggest top folder is ~500 TB.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There would not be lots of jobs running, simply because
>>>>>>>>>>>>>> there is
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> enough space to chunk that much.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On the storage that I would like to archive there are more
>>>>>>>>>>>>>> than 64M
>>>>>>>>>>>>>> files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So would a "flat" chunking tar of the whole filesystem be a
>>>>>>>>>>>>>> "good" idea? I am not sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need to think about this a bit more.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Lydia
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  On Thu, 10 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Lydia,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That's great. I am actually on leave tomorrow (travelling)
>>>>>>>>>>>>>>> and out
>>>>>>>>>>>>>>> Monday (at Royal Holloway) but the others on the list can
>>>>>>>>>>>>>>> follow up.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> --jens
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/12/2015 10:21, Lydia Heck wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear all,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> sorry for my silence. I had a meeting in London on Tuesday and
>>>>>>>>>>>>>>>> attended CIUK yesterday. I am just back and have to tidy up some
>>>>>>>>>>>>>>>> spreadsheets from Tuesday's meeting, and I will be busy today as
>>>>>>>>>>>>>>>> well with local tasks. So I should get back to this tomorrow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>>>>> Lydia
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>  On Wed, 9 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here is the proposal for the third option. It would also be
>>>>>>>>>>>>>>>>> worth looking into. It is written in Python, AFAIK.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Overall we are trying to deploy something that meets the
>>>>>>>>>>>>>>>>> requirements
>>>>>>>>>>>>>>>>> and saves us time in the long run.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> --jens
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>
