Good - this'll be useful for the next site (i.e. Leicester, who should be
about ready to go...)
Cheers
-j
On 24/12/2015 14:02, Lydia Heck wrote:
> Sorry forgot some important information:
>
> As I am doing the tar as root on the file server, all ownership,
> timestamps, etc. are fully preserved.
>
> Lydia
>
>
> On Thu, 24 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>
>> OK; let me know how you're doing.
>>
>> Cheers
>> --jens
>>
>> On 23/12/2015 18:30, Lydia Heck wrote:
>>>
>>> Hi Jens,
>>>
>>> I need to add one more piece of functionality to the script, then I am
>>> ready. That will happen tomorrow. Then I will keep on going ....
>>>
>>> Lydia
>>>
>>>
>>> On Wed, 23 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>
>>>> Hi Lydia,
>>>>
>>>> Great stuff. So you will be moving data over the Christmas break? This
>>>> will be good...
>>>>
>>>> ... we also need to get Jon started; whether he wants to run your
>>>> script, too, or do something else. And we need to clear out the old
>>>> data, but I'd ask Brian to look into that once he's back in the new
>>>> year.
>>>>
>>>> Merry and Happy to you too.
>>>>
>>>> Cheers
>>>> -j
>>>>
>>>>
>>>>
>>>> On 23/12/2015 15:01, Lydia Heck wrote:
>>>>> Hi Jens and all,
>>>>>
>>>>> I now have a script that tars up directories per DiRAC project into
>>>>> tar files of a specific size. Once a tar file is complete, it is
>>>>> archived to RAL. Once the archive is complete, the tar file is
>>>>> deleted and the next set of files is archived.
>>>>>
>>>>> The chunk size at present is 256 GByte, or slightly bigger depending
>>>>> on the sizes of the files.
>>>>>
>>>>> The transfer of such a file takes ~15 minutes.
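As a back-of-the-envelope check, the ~15 minutes per ~256 GB chunk quoted above implies a throughput of roughly 284 MB/s (about 2.3 Gbit/s), assuming decimal units:

```shell
# Throughput implied by one ~256 GB chunk transferred in ~15 minutes
# (decimal units assumed; integer shell arithmetic).
size=$((256 * 1000 * 1000 * 1000))   # chunk size in bytes
secs=$((15 * 60))                    # transfer time in seconds
mb_s=$((size / secs / 1000000))      # megabytes per second
echo "~${mb_s} MB/s"                 # prints "~284 MB/s"
```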
>>>>>
>>>>> The script needs some polishing, and once I am totally happy with it
>>>>> I can run it non-interactively. It currently still has an
>>>>> interactive element, as the final debugging and other ideas are not
>>>>> fully complete.
>>>>>
>>>>> Merry Christmas and a Happy New Year.
>>>>>
>>>>> Lydia
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>
>>>>>> Hooray for downgrading!
>>>>>>
>>>>>> On 22/12/2015 13:49, Lydia Heck wrote:
>>>>>>>
>>>>>>> Done it. I have downgraded to 3.3.3-x and now the lot works.
>>>>>>>
>>>>>>> Lydia
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>
>>>>>>>> Weird! Which version are you using?
>>>>>>>>
>>>>>>>> We seem to have fts-rest-3.3.3-2, fts-rest-cli-3.3.3,
>>>>>>>> fts-rest-cloud-storage-3.3.3 and python-fts-3.3.3, but every
>>>>>>>> other fts package on the server is 3.3.2. (There is both a
>>>>>>>> python-fts and an fts-python - weird.)
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> --jens
>>>>>>>>
>>>>>>>> On 22/12/2015 12:22, Lydia Heck wrote:
>>>>>>>>> Hi Jens,
>>>>>>>>>
>>>>>>>>> I have a script that I could test. However, I now have an issue:
>>>>>>>>> the fts-transfer command does not work any more, giving the
>>>>>>>>> error message
>>>>>>>>>
>>>>>>>>>    fts client is connecting using the gSOAP interface. Consider
>>>>>>>>>    changing your configured fts endpoint port to select the REST
>>>>>>>>>    interface
>>>>>>>>>
>>>>>>>>> I am currently rebooting the system, but have you seen anything
>>>>>>>>> similar before?
>>>>>>>>>
>>>>>>>>> Best wishes,
>>>>>>>>> Lydia
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, 18 Dec 2015, Jens Jensen wrote:
>>>>>>>>>
>>>>>>>>>> Right, so the find suggestion would at least produce a
>>>>>>>>>> depth-first listing of files to add, and I am guessing tar
>>>>>>>>>> would also add files depth-first, which I think meets your
>>>>>>>>>> requirement (or close enough) of putting related files into
>>>>>>>>>> the same chunk.
>>>>>>>>>>
>>>>>>>>>> Using find-and-then-tar you could avoid building the next
>>>>>>>>>> archive until the current one has been sent off to RAL. You'd
>>>>>>>>>> just need space for the filelist.
>>>>>>>>>>
>>>>>>>>>> What I am thinking is:
>>>>>>>>>> 1. find <folder to be backed up> -newer <timestamp file> |
>>>>>>>>>>    <list size and full filename> > filelist
>>>>>>>>>> 2. Walk through filelist one line at a time, adding up sizes
>>>>>>>>>>    and filenames until a certain threshold size has been
>>>>>>>>>>    exceeded (say 20GB or 100,000 files, whichever comes first),
>>>>>>>>>>    or adding the next file would take us above a higher
>>>>>>>>>>    threshold (say 50GB).
>>>>>>>>>> 3. Once a list has been found, tar it up, compress it,
>>>>>>>>>>    optionally store the contents (list) somewhere, send the
>>>>>>>>>>    tarball to RAL, and then delete it.
>>>>>>>>>> 4. Go back to step 2 until the filelist has been completed.
>>>>>>>>>> 5. Touch the timestamp file.
>>>>>>>>>> 6. Sleep 24 hours (or whatever) and go to step 1.
>>>>>>>>>>
>>>>>>>>>> This would meet all our requirements and would be stupidly easy
>>>>>>>>>> to do.
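The chunk-cutting walk in steps 2-4 above can be sketched as a small shell function. This is only an illustration: the thresholds are scaled down to a few hundred bytes for the demo (20 GB / 50 GB / 100,000 files in the real run), and the tar, FTS transfer, and timestamp steps are left as placeholder comments rather than real commands.

```shell
#!/bin/sh
# Demo-sized thresholds (20 GB / 50 GB / 100,000 in the real run).
SOFT=100; HARD=250; MAXFILES=3

# Read "size filename" lines (step 1's find output) on stdin and report
# where the chunk boundaries would fall.
plan_chunks() {
    chunk=1; total=0; count=0
    cut_chunk() {
        # Step 3 in the real script: tar up the accumulated list,
        # compress it, send the tarball to RAL, then delete it.
        echo "chunk $chunk: $count files, $total bytes"
        chunk=$((chunk + 1)); total=0; count=0
    }
    while read -r size name; do
        # Adding this file would exceed the hard cap: cut first.
        if [ "$count" -gt 0 ] && [ $((total + size)) -gt "$HARD" ]; then
            cut_chunk
        fi
        total=$((total + size)); count=$((count + 1))
        # Soft threshold or file-count limit reached: cut here.
        if [ "$total" -ge "$SOFT" ] || [ "$count" -ge "$MAXFILES" ]; then
            cut_chunk
        fi
    done
    if [ "$count" -gt 0 ]; then cut_chunk; fi  # flush the last partial chunk
    # Steps 5-6: touch the timestamp file, sleep 24 hours, repeat.
}

# Example filelist (sizes in bytes for the demo):
plan_chunks <<'EOF'
60 a
60 b
60 c
200 d
10 e
EOF
# Prints:
#   chunk 1: 2 files, 120 bytes
#   chunk 2: 1 files, 60 bytes
#   chunk 3: 1 files, 200 bytes
#   chunk 4: 1 files, 10 bytes
```

Because a chunk is cut as soon as either limit is reached, at most one tarball needs to exist at a time, which matches the point above about only needing space for the filelist.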
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>> --jens
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 17/12/2015 12:43, Lydia Heck wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Jens,
>>>>>>>>>>>
>>>>>>>>>>> it took longer than I thought to tidy up the results from the
>>>>>>>>>>> meeting
>>>>>>>>>>> last week (I spent a full day on a spreadsheet :-) )
>>>>>>>>>>>
>>>>>>>>>>> However I am now going to look at the transfers again.
>>>>>>>>>>>
>>>>>>>>>>> I looked over the presentation you shared with us. And yes,
>>>>>>>>>>> that is
>>>>>>>>>>> the way it should go. There are some provisos:
>>>>>>>>>>>
>>>>>>>>>>> If I create 3 TB chunks, I need to have space for several of
>>>>>>>>>>> them: one being transferred, one waiting and one being
>>>>>>>>>>> prepared. This will add 10 TB of storage that is not available
>>>>>>>>>>> to the users; it can be done, but needs to be factored in.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If there is indeed a failure, then I need to identify where
>>>>>>>>>>> the data are that have been deleted, corrupted or whatever. If
>>>>>>>>>>> I "just" chunk the whole filesystem, they would be difficult,
>>>>>>>>>>> if not impossible, to find. So I would need to arrange
>>>>>>>>>>> transfers by project, and even then the retrieval might
>>>>>>>>>>> physically not be possible, depending on how many of the
>>>>>>>>>>> chunks I would have to retrieve.
>>>>>>>>>>>
>>>>>>>>>>> I believe that currently the biggest top folder is ~500 TB.
>>>>>>>>>>>
>>>>>>>>>>> There would not be lots of jobs running, simply because there
>>>>>>>>>>> is not enough space to chunk that much.
>>>>>>>>>>>
>>>>>>>>>>> On the storage that I would like to archive there are more
>>>>>>>>>>> than 64M files.
>>>>>>>>>>>
>>>>>>>>>>> So would a "flat" chunking tar of the whole filesystem be a
>>>>>>>>>>> "good" idea? I am not sure.
>>>>>>>>>>>
>>>>>>>>>>> I need to think about this a bit more.
>>>>>>>>>>>
>>>>>>>>>>> Lydia
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 10 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Lydia,
>>>>>>>>>>>>
>>>>>>>>>>>> That's great. I am actually on leave tomorrow (travelling)
>>>>>>>>>>>> and out
>>>>>>>>>>>> Monday (at Royal Holloway) but the others on the list can
>>>>>>>>>>>> follow up.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> --jens
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/12/2015 10:21, Lydia Heck wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dear all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> sorry for my silence. I had a meeting in London on Tuesday
>>>>>>>>>>>>> and attended CIUK yesterday. I am just back, and I have to
>>>>>>>>>>>>> tidy up some spreadsheets from Tuesday's meeting; I will
>>>>>>>>>>>>> also be busy today with local tasks. So I should get back to
>>>>>>>>>>>>> this tomorrow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best wishes,
>>>>>>>>>>>>> Lydia
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 9 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is the proposal for the third option. It would also be
>>>>>>>>>>>>>> worth looking into. It is written in Python, AFAIK.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Overall we are trying to deploy something that meets the
>>>>>>>>>>>>>> requirements
>>>>>>>>>>>>>> and saves us time in the long run.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> --jens
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>