DIRAC-USERS@JISCMAIL.AC.UK


Subject: Re: FW: Rethinking backups?
From: Samuel Skipsey <[log in to unmask]>
Reply-To: Samuel Skipsey <[log in to unmask]>
Date: Fri, 27 Nov 2015 12:14:38 +0000
Content-Type: text/plain

Hi Jens,

Yeah, I was messing with script parameters!

So, I think I have a working implementation of both compression and incremental backup (it needs some rough edges filing off, but it demonstrates how things work).

Given:

svr003:~/scs/test# ls -l
total 20
-rwxr-xr-x. 1 root root  407 Nov 27 12:01 archive-script
drwxr-xr-x. 2 root root 4096 Nov 26 20:12 dir
-rwxr-xr-x. 1 root root  215 Nov 26 20:12 new-volume
-rwxr-xr-x. 1 root root  227 Nov 27 12:00 new-volume-gz
-rwxr-xr-x. 1 root root  272 Nov 27 11:19 new-volume-gz-2
svr003:~/scs/test# ls -l dir
total 1584
-rw-r--r--. 1 root root 400000 Nov 26 20:12 test1
-rw-r--r--. 1 root root 400000 Nov 26 20:12 test2
-rw-r--r--. 1 root root 400000 Nov 26 20:12 test3
-rw-r--r--. 1 root root 400000 Nov 26 20:12 test4

we have a minimally effective archive script, which just drives tar to do the hard stuff:
svr003:~/scs/test# cat archive-script 
#!/bin/bash

basename=archive.tar
target=dir/
chunksize=600
snapshotfile=./snapshot.snar

#start tar process, with default starting volume number (and update to volnum)
rm volnum
tar -c -L $chunksize --volno-file=volnum --listed-incremental=$snapshotfile -F ./new-volume-gz -f ${basename}-1 $target

#and zip the last file written (as new-volume-gz is not called when tar exits)
gzip ${basename}-$(<volnum)

-
The new-volume-gz can be made simpler in this instance, to be:
svr003:~/scs/test# cat new-volume-gz
#!/bin/bash

name=`expr $TAR_ARCHIVE : '\(.*\)-.*'`

#gzip last archive file
last=-$((TAR_VOLUME-1))
gzip ${name:-$TAR_ARCHIVE}$last &

#and echo the new name to the tar process
echo ${name:-$TAR_ARCHIVE}-$TAR_VOLUME >&$TAR_FD

-
Note that, as a demonstrator, this only supports creating the files. It's relatively straightforward to make a version which will also ungzip them on the fly for extraction by tar.
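As a rough sketch of what such a version might look like (not something I've run against the archives above): GNU tar exports TAR_SUBCOMMAND to the new-volume script, so the script can branch on whether tar is creating or reading. The function name next_volume and the choice to leave the .gz files in place are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical new-volume script covering both creation and extraction.
# Uses only the TAR_* variables GNU tar exports to its -F script.

next_volume() {
    # Strip the trailing "-N" volume suffix, as in new-volume-gz.
    local name
    name=$(expr "$TAR_ARCHIVE" : '\(.*\)-.*')
    name=${name:-$TAR_ARCHIVE}
    local next="${name}-${TAR_VOLUME}"
    local prev="${name}-$((TAR_VOLUME - 1))"

    case "$TAR_SUBCOMMAND" in
    -c) # creating: compress the volume tar has just finished writing
        gzip "$prev" &
        ;;
    -x|-t|-d) # reading: unpack the next volume, keeping the .gz copy
        gunzip -c "${next}.gz" > "$next" || return 1
        ;;
    *)  return 1 ;;
    esac

    # Tell tar which file to use as the next volume.
    echo "$next" >&"$TAR_FD"
}

# Only act when invoked by tar (TAR_FD is set in that case).
if [ -n "${TAR_FD:-}" ]; then
    next_volume
fi
```

The first volume would still need to be unpacked by hand before tar starts, since tar only calls the script when it changes volume, and the uncompressed copies would need tidying up afterwards.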

Running the archive-script in this mode generates a series of compressed tar files, segmented as before (and individually addressable, for files which are contained wholly in one tar file).

It also generates a snapshot.snar, which stores the metadata for this archive run.

So:
svr003:~/scs/test# ./archive-script 
rm: cannot remove `volnum': No such file or directory
svr003:~/scs/test# ls -l
total 40
-rwxr-xr-x. 1 root root  407 Nov 27 12:01 archive-script
-rw-r--r--. 1 root root  836 Nov 27 12:08 archive.tar-1.gz
-rw-r--r--. 1 root root  822 Nov 27 12:08 archive.tar-2.gz
-rw-r--r--. 1 root root  460 Nov 27 12:08 archive.tar-3.gz
drwxr-xr-x. 2 root root 4096 Nov 26 20:12 dir
-rwxr-xr-x. 1 root root  215 Nov 26 20:12 new-volume
-rwxr-xr-x. 1 root root  227 Nov 27 12:00 new-volume-gz
-rwxr-xr-x. 1 root root  272 Nov 27 11:19 new-volume-gz-2
-rw-r--r--. 1 root root   97 Nov 27 12:08 snapshot.snar
-rw-r--r--. 1 root root    2 Nov 27 12:08 volnum
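As a quick sanity check on the three volumes: -L counts in units of 1024 bytes, so chunksize=600 gives 614400-byte volumes, and the four 400000-byte test files cannot fit in fewer than three of them. The arithmetic below ignores tar's own headers and padding, so it is a lower bound rather than an exact prediction.

```shell
chunksize=600                        # value passed to tar -L in archive-script
volume_bytes=$((chunksize * 1024))   # -L counts in units of 1024 bytes
payload_bytes=$((4 * 400000))        # four test files of 400000 bytes each

# Round up: a partly filled last volume still counts as a volume.
volumes=$(( (payload_bytes + volume_bytes - 1) / volume_bytes ))

echo "$volume_bytes bytes per volume, at least $volumes volumes"
```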

So, now, if we delete the archive.tar segments, but leave the snapshot.snar record in place, and rerun the archive script:

svr003:~/scs/test# rm archive.tar*
svr003:~/scs/test# ./archive-script 
svr003:~/scs/test# ls -l
total 32
-rwxr-xr-x. 1 root root  407 Nov 27 12:01 archive-script
-rw-r--r--. 1 root root  148 Nov 27 12:09 archive.tar-1.gz
drwxr-xr-x. 2 root root 4096 Nov 26 20:12 dir
-rwxr-xr-x. 1 root root  215 Nov 26 20:12 new-volume
-rwxr-xr-x. 1 root root  227 Nov 27 12:00 new-volume-gz
-rwxr-xr-x. 1 root root  272 Nov 27 11:19 new-volume-gz-2
-rw-r--r--. 1 root root   97 Nov 27 12:09 snapshot.snar
-rw-r--r--. 1 root root    2 Nov 27 12:09 volnum

We see that there's only one very small archive made this time (as there were no changes between this archive and the last one). The snapshot.snar is *updated* to contain the new status of files *if any files were written to the new archive*.
So, this archive is a delta against the first one.

If we now touch a file, and run the script again:

svr003:~/scs/test# rm archive.tar-1.gz 
svr003:~/scs/test# cd dir
svr003:~/scs/test/dir# touch test1
svr003:~/scs/test/dir# cd ..
svr003:~/scs/test# ./archive-script 
svr003:~/scs/test# ls -l
total 32
-rwxr-xr-x. 1 root root  407 Nov 27 12:01 archive-script
-rw-r--r--. 1 root root  589 Nov 27 12:11 archive.tar-1.gz
drwxr-xr-x. 2 root root 4096 Nov 26 20:12 dir
-rwxr-xr-x. 1 root root  215 Nov 26 20:12 new-volume
-rwxr-xr-x. 1 root root  227 Nov 27 12:00 new-volume-gz
-rwxr-xr-x. 1 root root  272 Nov 27 11:19 new-volume-gz-2
-rw-r--r--. 1 root root   97 Nov 27 12:11 snapshot.snar
-rw-r--r--. 1 root root    2 Nov 27 12:11 volnum

The archive is still small, but it's larger than before - it now contains just the touched file, as its modification time is different to that recorded in the snapshot.snar file.
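The same behaviour can be reproduced in isolation, without the multi-volume or gzip parts. The sketch below assumes GNU tar and uses a throwaway temporary directory; the names level0.tar and level1.tar are made up for the example.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir
echo one > dir/test1
echo two > dir/test2

# First run: no snapshot file exists yet, so everything is archived.
tar -c --listed-incremental=snapshot.snar -f level0.tar dir

# Change one file, leaving a gap so its timestamp is clearly newer.
sleep 1
echo changed > dir/test1

# Second run: only files newer than the snapshot record are archived.
tar -c --listed-incremental=snapshot.snar -f level1.tar dir

# The listing should show the directory and test1, but not test2.
tar -tf level1.tar
```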


Note that the documentation on GNU Tar here: http://www.gnu.org/software/tar/manual/html_node/Multi_002dVolume-Archives.html 
is very useful (and explicitly tells you how to extract files from multi-volume archives in the cases where the file is a) wholly in a single part of the archive, b) spread across several). The instructions are essentially what I wrote in a previous email in this conversation.
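For case b), a minimal sketch, assuming GNU tar and uncompressed volumes: with --multi-volume (-M), repeated -f options name the volumes in order, so no new-volume script is needed. The volume names vol-1 to vol-3 and the file sizes are choices made for this example, not the ones used above.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir test
for i in 1 2 3 4; do
    head -c 10240 /dev/urandom > "test/foo$i"
done

# Write 20 x 1024-byte volumes; tar moves to the next -f name as each fills.
tar -c -M -L 20 -f vol-1 -f vol-2 -f vol-3 test < /dev/null

# Keep the originals for comparison, then restore one member.
mv test expected
tar -x -M -f vol-1 -f vol-2 -f vol-3 test/foo2 < /dev/null

cmp expected/foo2 test/foo2 && echo "test/foo2 restored intact"
```

This works wherever the member sits, but it does mean having every volume to hand.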

Hope this is helpful,

Sam
________________________________________
From: Jensen, Jens (STFC,RAL,SC) [[log in to unmask]]
Sent: 27 November 2015 10:51
To: Samuel Skipsey; [log in to unmask]
Subject: Re: FW: Rethinking backups?

Hi Sam,

Thanks for investigating this. The "-.gz" is an accident from a previous
experiment, I presume.

The compression isn't hugely important unless it'll make the transfers
go faster - maybe worth checking what the compression ratio will be in
practice - because data is compressed when it goes to tape. So for less
than, say, 10%, I wouldn't worry.

My docs (aka the man page) say kB:

    -L, --tape-length NUMBER
           change tape after writing NUMBER x 1024 bytes

However, the archives are not necessarily individually extractable - let
us do a bit more testing:

jensen@ganesha[2]35% tar tvf archive.tar-5
M--------- 0/0            1536 1970-01-01 01:00 test/foo66--Continued at byte 8704--
-rw-r--r-- jensen/esc    10240 2015-11-27 10:38 test/foo32
-rw-r--r-- jensen/esc    10240 2015-11-27 10:38 test/foo97
-rw-r--r-- jensen/esc    10240 2015-11-27 10:38 test/foo90
-rw-r--r-- jensen/esc    10240 2015-11-27 10:38 test/foo87
-rw-r--r-- jensen/esc    10240 2015-11-27 10:38 test/foo48

jensen@ganesha[2]36% rm -r test
jensen@ganesha[2]37% tar xf archive.tar-5 test/foo90
tar: test/foo66: Cannot extract -- file is continued from another volume
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[1]    24337 exit 2     tar xf archive.tar-5 test/foo90

... It complains, but it /does/ extract the file! So let's try foo66,
which spans archive.tar-4 and archive.tar-5:

jensen@ganesha[2]51% tar xf archive.tar-4 test/foo66
tar: test/foo100: Cannot extract -- file is continued from another volume
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[1]    24516 exit 2     tar xf archive.tar-4 test/foo66

jensen@ganesha[2]52% ls -l test
total 24
-rw-r--r-- 1 jensen esc  8704 Nov 27 10:46 foo66
-rw-r--r-- 1 jensen esc 10240 Nov 27 10:38 foo90

jensen@ganesha[2]53% mv test/foo66 test/foo66.tmp
jensen@ganesha[2]54% tar xf archive.tar-5 test/foo66
tar: test/foo66: Cannot extract -- file is continued from another volume
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[1]    24541 exit 2     tar xf archive.tar-5 test/foo66

... but this actually doesn't do anything; it doesn't extract the
remaining fragment of foo66 :-(

Of course this would work:

jensen@ganesha[2]55% tar tMfF archive.tar ~/new-volume test/foo66
test/foo66

So this approach may still not be the right one; you generally still
have to recover all the pieces rather than just the one with your stuff
on it.

Cheers
--jens
