DIRAC-USERS Archives

DIRAC-USERS@JISCMAIL.AC.UK

Subject: Re: FW: Rethinking backups?

From: Jens Jensen <[log in to unmask]>

Reply-To: Jens Jensen <[log in to unmask]>

Date: Fri, 27 Nov 2015 15:02:06 +0000

Content-Type: text/plain

Parts/Attachments: text/plain (319 lines)

One approach we are using with StorageD is to dump files into a specific
directory; an ingest daemon checks that directory regularly, moves the
files on and (I think) deletes them from said directory.

Similarly, one could write a simple cron job which looks at a backup
directory and schedules a transfer of the files to RAL. Of course you
don't want to delete them until they have been successfully transferred,
so maybe another daemon (well, cron job) could do that. Neither would
have to be more than a few lines of script.

The scheduler will need to check that it hasn't already scheduled a
transfer for a given file; maybe the simplest approach is to move the file
to a new directory and then schedule the transfer from the new location.
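
A rough sketch of what such a scheduler cron job might look like (the directory names and the upload call are placeholders, not our actual setup):

#!/bin/bash
# Sketch of a transfer-scheduler cron job.
BACKUP_DIR=/backup/outgoing     # hypothetical: where new tar files are dropped
PENDING_DIR=/backup/pending     # hypothetical: files moved here once scheduled

for f in "$BACKUP_DIR"/*.tar; do
    [ -e "$f" ] || continue              # glob matched nothing
    mv "$f" "$PENDING_DIR"/ || continue  # move first, so it is never scheduled twice
    # replace with the real transfer client (e.g. a WebDAV upload to RAL)
    schedule_transfer_to_ral "$PENDING_DIR/$(basename "$f")"
done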

The deletion cron job would need either to use the WebDAV interface to
check, or to rely on the file system dump. However, the file system dump
happens daily at best, so WebDAV would be a better way of checking.
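
For the WebDAV check, something along these lines might do (again only a sketch; the endpoint URL is hypothetical, and in practice one would probably also compare sizes or checksums before deleting):

#!/bin/bash
# Sketch of the deletion cron job: delete the local copy only once the file
# is visible at RAL over WebDAV.
WEBDAV_BASE=https://webdav.example.ac.uk/dirac/backups   # hypothetical endpoint
PENDING_DIR=/backup/pending                              # hypothetical local directory

for f in "$PENDING_DIR"/*.tar; do
    [ -e "$f" ] || continue
    # curl -I issues a HEAD request; --fail returns non-zero on 404 and friends
    if curl --silent --fail -I "$WEBDAV_BASE/$(basename "$f")" >/dev/null; then
        rm -- "$f"
    fi
done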

Cheers
--jens

On 27/11/2015 14:33, Samuel Skipsey wrote:
> hi Lydia,
>
> So:
>
> One of the major issues with the whole archiving process in general is, of course, that you're not allowed to lock directories when they're being archived (which means that the state of the archive might not be consistent, as the files might change while you're sweeping the tree).
>
> I think it's possible to augment the new-volume helper script to do what you want (essentially, the script is allowed to do whatever is necessary to get ready for the next segment of the archive to be written, so performing a file copy is allowed). However: it would probably be best to do this somewhat asynchronously, and not all in this script. (A consumer which runs and attempts to upload subarchives as they are completed would seem to be the best approach here.)
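>
> A very rough sketch of such a consumer (everything here is illustrative; the upload command would be whichever client we actually use to talk to RAL):
>
> #!/bin/bash
> # Sketch: upload completed sub-archives as they appear, skipping the newest
> # volume, which tar may still be writing.
> ARCHIVE_DIR=/backup/outgoing        # hypothetical location of the sub-archives
> while true; do
>     newest=$(ls -1t "$ARCHIVE_DIR"/archive.tar* 2>/dev/null | head -1)
>     for vol in "$ARCHIVE_DIR"/archive.tar*; do
>         [ -e "$vol" ] || continue
>         [ "$vol" = "$newest" ] && continue       # still being written
>         upload_to_ral "$vol" && rm -- "$vol"     # placeholder for the real upload
>     done
>     sleep 60
> done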
>
> At present, as far as I know, there's not really that much parallelism in the file transfers you're doing anyway, so it's not clear to me that the tar will be horribly bad (at least it's working on a local filesystem!). But, yes, it will inevitably take some time to complete on very large directories, and more if you're also compressing them with an approach similar to my demonstrator.
>
> Sam
> ________________________________________
> From: Lydia Heck [[log in to unmask]]
> Sent: 27 November 2015 12:57
> To: Samuel Skipsey
> Cc: Lydia Heck; [log in to unmask]
> Subject: RE: FW: Rethinking backups?
>
> Hi Sam,
>
> I had started this email earlier this morning, but then got side-tracked by
> other issues. So some of it might be a re-hash of what Jens wrote earlier.
>
>
> You are right. I had not taken into account the script.
>
> However, on my test system, which is running CentOS 6.7 and tar-1.23-13, I thought
> I had to use -M. This happened because I had not put the '-' in front of the 't'
> when trying to list the contents of one archive. But I cannot untar each
> individual section of the multi-volume tar. When giving the command
>
> tar -t -F ./my_archive -f ANJA.tar
>
> it will read all the subvolumes but trying to untar a subvolume will not work:
>
> tar -xf ANJA.tar-3
> tar: ./ANJA/Anja_Slim_inventory_and_insurance_form.pdf: Cannot extract -- file
> is continued from another volume
> tar: Skipping to next header
> tar: Exiting with failure status due to previous errors
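>
> Side note, not tested here: since files can span volume boundaries, individual sub-volumes are not extractable on their own. Extraction ought to work when driven from the first volume in multi-volume mode with the same helper script:
>
> tar -x -M -F ./my_archive -f ANJA.tar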
>
> But overall this has definite possibilities!
>
> The things that have to be considered are:
>
> (a) there is not enough space on our storage to keep all the sub-tars of one
> such tar session. Even if I did not take an `all' approach, starting at the
> top level and traversing each directory tree, but concentrated on individual
> projects, several projects would each produce a tar archive exceeding the current
> storage capacity.
>
> (b) the script that allows the creation of the subvolumes needs to be able to
> submit a subvolume, check that the transfer is complete, start the creation of
> the next volume, then submit it and delete the previous one, and so on! The archiving
> request to RAL should include a checksum, to make sure that the transfer is
> complete and correct before the previous volume is deleted.
>
> (c) There are project directories of one set of contiguous data that could be as
> big as 200 TB. The tar will take some time as there is no parallelism and there
> is only one
>
> I had some more, but I will send this, rather than being side-tracked again.
>
> Lydia
>
>
>
>
>
>   On Thu, 26 Nov 2015, Samuel Skipsey wrote:
>
>> Hi Lydia,
>>
>> I just did my own testing, and it works as advertised - except for compression not working, and the difference that the size of the tar files needs to be specified in KILOBYTES (this differs from the documentation).
>> You could get around the compression issue by doing the gzip in the new-volume script.
>>
>> Example from my testing:
>> svr003:~/scs/test# cd dir
>> svr003:~/scs/test/dir# ls
>> test1  test2  test3  test4
>> svr003:~/scs/test/dir# ls -l
>> total 1584
>> -rw-r--r--. 1 root root 400000 Nov 26 20:12 test1
>> -rw-r--r--. 1 root root 400000 Nov 26 20:12 test2
>> -rw-r--r--. 1 root root 400000 Nov 26 20:12 test3
>> -rw-r--r--. 1 root root 400000 Nov 26 20:12 test4
>> svr003:~/scs/test/dir# cd ..
>> svr003:~/scs/test# cat new-volume
>> #!/bin/bash
>>
>> name=`expr $TAR_ARCHIVE : '\(.*\)-.*'`
>> case $TAR_SUBCOMMAND in
>> -c)   ;;
>> -d|-x|-t) test -r ${name:-$TAR_ARCHIVE}-$TAR_VOLUME || exit 1
>>       ;;
>> *)    exit 1
>> esac
>>
>> echo ${name:-$TAR_ARCHIVE}-$TAR_VOLUME >&$TAR_FD
>> svr003:~/scs/test# tar -c -L 600 -F ./new-volume -f archive.tar dir/
>> svr003:~/scs/test# ls
>> archive.tar  archive.tar-2  archive.tar-3  dir  -.gz  new-volume  new-volume2
>> svr003:~/scs/test# ls -l
>> total 1596
>> -rw-r--r--. 1 root root 614400 Nov 26 20:13 archive.tar
>> -rw-r--r--. 1 root root 614400 Nov 26 20:13 archive.tar-2
>> -rw-r--r--. 1 root root 378880 Nov 26 20:13 archive.tar-3
>> drwxr-xr-x. 2 root root   4096 Nov 26 20:12 dir
>> -rw-r--r--. 1 root root      0 Nov 26 20:12 -.gz
>> -rwxr-xr-x. 1 root root    215 Nov 26 20:12 new-volume
>> -rwxr-xr-x. 1 root root    287 Nov 26 20:12 new-volume2
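>>
>> A variant of new-volume along the lines of the gzip workaround mentioned above might look like this (untested sketch; the just-finished volume is compressed in the background, and the .gz volumes would need to be gunzipped again before a multi-volume extract):
>>
>> #!/bin/bash
>> # Sketch: as new-volume, but gzip the volume that has just been closed
>> # whenever we are creating an archive.
>> name=`expr $TAR_ARCHIVE : '\(.*\)-.*'`
>> case $TAR_SUBCOMMAND in
>> -c)   gzip "$TAR_ARCHIVE" &
>>       ;;
>> -d|-x|-t) test -r ${name:-$TAR_ARCHIVE}-$TAR_VOLUME || exit 1
>>       ;;
>> *)    exit 1
>> esac
>>
>> echo ${name:-$TAR_ARCHIVE}-$TAR_VOLUME >&$TAR_FD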
>>
>> Sam
>> ________________________________________
>> From: Lydia Heck [[log in to unmask]]
>> Sent: 26 November 2015 17:47
>> To: Samuel Skipsey
>> Cc: [log in to unmask]
>> Subject: Re: FW: Rethinking backups?
>>
>> I have just tested the command, and I get this:
>>
>>  tar -c -z -L 51024 -F new-volume -f ANJA.backup.tgz ./ANJA
>> tar: Cannot use multi-volume compressed archives
>> Try `tar --help' or `tar --usage' for more information
>>
>> So compression cannot be used with these instructions.
>>
>> Once I remove the compression flag, it does not do `multivolume' but just
>> archives everything into the same archive file, even when I use the -M flag.
>>
>> So I think that this might not be the answer, as we are not dealing with tapes.
>>
>> But I keep looking.
>>
>> Lydia
>>
>>
>>
>>
>>
>> On Thu, 26 Nov 2015, Samuel Skipsey wrote:
>>
>>> Whoops, replied to the wrong thing (just Jens!)
>>>
>>> As a note on the email below, though, there's a typo in what I wrote, and the command should be:
>>>
>>> tar -c -z -L SIZEINBYTES -F new-volume -f dir.backup.tgz dir/
>>>
>>> (this works with GNU tar only)
>>>
>>> If you're sticking with GNU tar, you can also do incremental backups with it (it needs to maintain an additional "snapshot" file, which keeps track of each file's modification date when it was last archived, and which you would also need to keep safe on tape).
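>>>
>>> For reference, that incremental mode is driven by --listed-incremental (file names below are just examples):
>>>
>>> # Full (level-0) backup; creates/updates the snapshot file dir.snar:
>>> tar --create --listed-incremental=dir.snar -f dir.level0.tar dir/
>>> # Later run: archives only what changed since the snapshot was last updated:
>>> tar --create --listed-incremental=dir.snar -f dir.level1.tar dir/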
>>>
>>> Sam
>>> ________________________________________
>>> From: Samuel Skipsey
>>> Sent: 26 November 2015 16:25
>>> To: Jensen, Jens (STFC,RAL,SC)
>>> Subject: RE: Rethinking backups?
>>>
>>> To follow up on this (and my similar objection), tar already supports splitting into multiple tar files by length (as you might expect, since it was originally written to write to tape archives, where tapes have a pre-specified size).
>>>
>>> The approach is:
>>>
>>> tar czf -L SIZEINBYTES -F new-volume -f dir.backup.tgz dir/
>>>
>>> where new-volume is a script that writes the name of the next output file to the file descriptor given by the environment variable $TAR_FD; this is how it controls which file tar writes to next. new-volume is called each time the current tar file being written reaches SIZEINBYTES, and should perform whatever tasks are needed to make the next file available. (On tape machines, new-volume would actually change the tapes!)
>>>
>>> The example new-volume script for writing to files, from the GNU docs, is:
>>>
>>> #! /bin/bash
>>> # For this script it's advisable to use a shell, such as Bash,
>>> # that supports a TAR_FD value greater than 9.
>>>
>>> echo Preparing volume $TAR_VOLUME of $TAR_ARCHIVE.
>>>
>>> name=`expr $TAR_ARCHIVE : '\(.*\)-.*'`
>>> case $TAR_SUBCOMMAND in
>>> -c)       ;;
>>> -d|-x|-t) test -r ${name:-$TAR_ARCHIVE}-$TAR_VOLUME || exit 1
>>>          ;;
>>> *)        exit 1
>>> esac
>>>
>>> echo ${name:-$TAR_ARCHIVE}-$TAR_VOLUME >&$TAR_FD
>>>
>>> which will append an ordinal number to each archive volume created.
>>>
>>> Sam
>>> ________________________________________
>>> From: DiRAC Users [[log in to unmask]] on behalf of Jensen, Jens (STFC,RAL,SC) [[log in to unmask]]
>>> Sent: 26 November 2015 16:08
>>> To: [log in to unmask]
>>> Subject: Re: Rethinking backups?
>>>
>>> Hi Lydia,
>>>
>>> Simple and attractive though it seems, I'd be loath to split a .tgz file
>>> - I'd much prefer that each individual file is restorable by itself and
>>> your .aa, .ab, .ac files wouldn't be. You'd have to glue all the files
>>> together before you can extract a single file from the archive (at least
>>> in principle) and the pieces could end up on different tapes (in principle).
>>>
>>> This approach would be safer if you compress after splitting but I'd
>>> still prefer to have individual tar files which can then be compressed.
>>>
>>> I did think of the restart mechanism. This is what I do with my files at
>>> home, where I need to be able to interrupt the process (there it is a
>>> checksumming process) because the system is not on 24/7. The easiest
>>> thing would be to rebuild the tar, though.
>>>
>>> Thanks
>>> --jens
>>>
>>> On 26/11/2015 14:41, Lydia Heck wrote:
>>>> Hi Jens,
>>>>
>>>> I found a prescription to do the tar'ing and splitting using
>>>>
>>>> tar czf - dir/ | split --bytes=number-of-bytes - dir.backup.tar.gz.
>>>>
>>>> This will create a gzipped tar archive, split it at
>>>> number-of-bytes boundaries and save the parts as
>>>> dir.backup.tar.gz.aa, dir.backup.tar.gz.ab, ....
>>>>
>>>> The next problem is to submit each such part to the archiving system and
>>>> then delete the part.
>>>>
>>>> This action would have to run in parallel to the archiving action
>>>> itself, and only on successful transmission would the file
>>>> be deleted.
>>>>
>>>> There is one big issue with this: if the tar were disrupted, we would
>>>> really have to trust the system to traverse the directory structure in
>>>> the same way when restarting the tar from scratch, but only archive
>>>> the last file segment by force and then keep going.
>>>>
>>>> The archive would have to be done from a snapshot. For Durham only a
>>>> global snapshot is possible, as we are not using specific filesets.
>>>> All doable. I have not created a snapshot on the filesystem so far
>>>> and I do not know how large such a snapshot would be. I have to look
>>>> at the size of the metadata partition to make sure that we have enough
>>>> space.
>>>>
>>>> There is no straightforward dump command for the filesystem.
>>>>
>>>> Best wishes,
>>>> Lydia
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, 26 Nov 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>
>>>>> StorageD is a tool we use to manage data from Diamond Light Source for
>>>>> example (www.diamond.ac.uk); it basically concatenates all the small
>>>>> files (that would not be my choice of technology but this is what SRB
>>>>> used to do with its "collections" and I guess this is why StorageD went
>>>>> the same way) and relies on metadata to remember the indices of files in
>>>>> each aggregation. Files are dropped into a known location (but we'd use
>>>>> a modification which doesn't need that) for ingest.
>>>>>
>>>>> However, while this approach would create more manageable file sizes, it
>>>>> still requires someone to keep feeding data to the tool manually. My
>>>>> inclination would be to prefer something we can automate: either a
>>>>> built-in GPFS backup tool (à la dump), if there is one, or something simple like:
>>>>>
>>>>> find /cosma/mountpoint -newer ~/timestamp -print >/tmp/filelist-`date -I`
>>>>>
>>>>> and this could be tuned with -prune or something to ignore .svn maybe.
>>>>>
>>>>> Then loop through, creating tar files until they reach a certain size -
>>>>> say 10GB or 100GB - and send each one off; the last tarball, containing
>>>>> whatever is left, is sent off to CASTOR too, and it doesn't matter if it is
>>>>> small. This could even be written in shell if one was very keen... but
>>>>> I'd do it in Perl. Alternatively the script could do the walking
>>>>> through the filesystem itself, which may be more scalable but would also need a
>>>>> little bit more work.
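>>>>>
>>>>> A very rough shell sketch of that batching loop (Perl would be nicer; names are illustrative, and it assumes the file list contains plain files):
>>>>>
>>>>> #!/bin/bash
>>>>> # Batch the files listed by the find above into roughly 10GB tarballs.
>>>>> LIST=/tmp/filelist-$(date -I)
>>>>> LIMIT=$((10*1024*1024*1024))     # ~10GB per tarball
>>>>> batch=0; size=0; files=()
>>>>> while IFS= read -r f; do
>>>>>     files+=("$f")
>>>>>     size=$((size + $(stat -c %s "$f")))
>>>>>     if [ "$size" -ge "$LIMIT" ]; then
>>>>>         tar -cf batch-$batch.tar -- "${files[@]}"
>>>>>         # here: send batch-$batch.tar off to CASTOR, then remove it locally
>>>>>         batch=$((batch+1)); size=0; files=()
>>>>>     fi
>>>>> done < "$LIST"
>>>>> # final, possibly small, tarball with whatever is left over
>>>>> [ ${#files[@]} -gt 0 ] && tar -cf batch-$batch.tar -- "${files[@]}"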
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Also, the downside is that we may have to start over, with a clean
>>>>> slate... but if we do, the backups should be much quicker as we are
>>>>> transferring much larger files and we do not rely on busy people filling
>>>>> the queue. Humans should not be doing the jobs that a machine can do.
>>>>>
>>>>> Thanks
>>>>> --jens
>>>>>
