OK; let me know how you're doing.
Cheers
--jens
On 23/12/2015 18:30, Lydia Heck wrote:
>
> Hi Jens,
>
> I need to add one more piece of functionality to the script, and then
> I am ready. That will happen tomorrow. Then I will keep on going ....
>
> Lydia
>
>
> On Wed, 23 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>
>> Hi Lydia,
>>
>> Great stuff. So you will be moving data over the Christmas break? This
>> will be good...
>>
>> ... we also need to get Jon started; whether he wants to run your
>> script, too, or do something else. And we need to clear out the old
>> data, but I'd ask Brian to look into that once he's back in the new
>> year.
>>
>> Merry and Happy to you too.
>>
>> Cheers
>> -j
>>
>>
>>
>> On 23/12/2015 15:01, Lydia Heck wrote:
>>> Hi Jens and all,
>>>
>>> I have a script now that tars up directories per DiRAC project into
>>> tar files of a specific size. Once a tar file is complete, it is
>>> archived to RAL. Once the archive is complete, the tar file is deleted
>>> and the next set of files is archived.
>>>
>>> The chunk size at present is 256 GByte, or slightly bigger, depending
>>> on the size of the files.
>>>
>>> The transfer of such a file takes ~15 minutes.
>>>
>>> The script needs some polishing, and once I am totally happy I can run
>>> it non-interactively. It still has an interactive element for now,
>>> because the final debugging and a few remaining ideas are not yet
>>> finished.
>>>
>>> Merry Christmas and a Happy New year.
>>>
>>> Lydia
>>>
>>>
>>>
>>>
>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>
>>>> Hooray for downgrading!
>>>>
>>>> On 22/12/2015 13:49, Lydia Heck wrote:
>>>>>
>>>>> Done it. I have downgraded to 3.3.3-x and now the lot works.
>>>>>
>>>>> Lydia
>>>>>
>>>>>
>>>>> On Tue, 22 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>
>>>>>> Weird! Which version are you using?
>>>>>>
>>>>>> We seem to have fts-rest-3.3.3-2 and fts-rest-cli-3.3.3 and
>>>>>> fts-rest-cloud-storage-3.3.3 and python-fts-3.3.3 but every other
>>>>>> fts package on the server is 3.3.2. (There is both a python-fts and
>>>>>> an fts-python - weird).
>>>>>>
>>>>>> Cheers
>>>>>> --jens
>>>>>>
>>>>>> On 22/12/2015 12:22, Lydia Heck wrote:
>>>>>>> Hi Jens,
>>>>>>>
>>>>>>> I have a script that I could test. However, I now have an issue:
>>>>>>> the fts-transfer command does not work anymore, and fails with the
>>>>>>> error message
>>>>>>>
>>>>>>>   fts client is connecting using the gSOAP interface. Consider
>>>>>>>   changing your configured fts endpoint port to select the REST
>>>>>>>   interface
>>>>>>>
>>>>>>> I am currently rebooting the system, but have you seen something
>>>>>>> similar once before?
>>>>>>>
>>>>>>> Best wishes,
>>>>>>> Lydia
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 18 Dec 2015, Jens Jensen wrote:
>>>>>>>
>>>>>>>> Right, so the find suggestion at least would do a depth-first
>>>>>>>> listing of files-to-add, and tar I am guessing would also add files
>>>>>>>> depth-first, which I think meets your requirement, or close enough,
>>>>>>>> of putting related files into the same chunk.
>>>>>>>>
>>>>>>>> Using find-and-then-tar you could avoid building the following
>>>>>>>> archive until the current one has been sent off to RAL. You'd just
>>>>>>>> need space for the filelist.
>>>>>>>>
>>>>>>>> What I am thinking is:
>>>>>>>> 1. find <folder to be backed up> -newer <timestamp file> | <list
>>>>>>>>    size and full filename> > filelist
>>>>>>>> 2. Walk through filelist one line at a time, adding up sizes and
>>>>>>>>    filenames until a certain threshold size has been exceeded (say
>>>>>>>>    20GB or 100,000 files, whichever comes first) or adding the next
>>>>>>>>    file would take us above a higher threshold (say 50GB).
>>>>>>>> 3. Once a list has been found, tar it up, compress it, optionally
>>>>>>>>    store the contents (list) somewhere, send the tarball to RAL, and
>>>>>>>>    then delete it.
>>>>>>>> 4. Go back to step 2 until the filelist has been completed.
>>>>>>>> 5. Touch the timestamp file.
>>>>>>>> 6. Sleep 24 hours (or whatever) and go to step 1.
>>>>>>>>
>>>>>>>> This would meet all our requirements and would be stupidly easy to do.
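
Step 2 of the list above could be sketched in Python roughly as follows.
This is only an illustration, not the script discussed in this thread:
the names parse_line and plan_chunks are hypothetical, the filelist is
assumed to hold one "<size in bytes> <path>" entry per line (as GNU
find's -printf '%s %p\n' would produce), and the default thresholds are
simply the "say" figures from step 2.

```python
from typing import Iterable, Iterator, List, Tuple

GB = 10**9  # decimal gigabyte; an assumption, the thread does not say which


def parse_line(line: str) -> Tuple[int, str]:
    """Split one assumed '<size> <path>' filelist line into (size, path)."""
    size, path = line.rstrip("\n").split(" ", 1)
    return int(size), path


def plan_chunks(
    entries: Iterable[Tuple[int, str]],
    soft_bytes: int = 20 * GB,    # lower threshold from step 2
    max_files: int = 100_000,     # file-count threshold from step 2
    hard_bytes: int = 50 * GB,    # higher threshold from step 2
) -> Iterator[List[str]]:
    """Yield lists of paths, one list per tarball, in filelist order."""
    chunk: List[str] = []
    total = 0
    for size, path in entries:
        # Close the current chunk first if this file would push it
        # above the higher threshold.
        if chunk and total + size > hard_bytes:
            yield chunk
            chunk, total = [], 0
        chunk.append(path)
        total += size
        # Close the chunk once either lower threshold has been reached.
        if total >= soft_bytes or len(chunk) >= max_files:
            yield chunk
            chunk, total = [], 0
    if chunk:
        # Whatever is left over forms the final, smaller chunk.
        yield chunk


if __name__ == "__main__":
    demo = ["6 a", "6 b", "3 c", "30 d", "1 e"]
    for paths in plan_chunks(map(parse_line, demo),
                             soft_bytes=10, max_files=3, hard_bytes=20):
        print(paths)
```

Because the function is a generator, only one chunk's list of names is
held at a time, which fits the point that just the filelist needs space.
A file bigger than the higher threshold ends up in a chunk of its own.
Each yielded list could then be written to a file and handed to tar via
its -T (--files-from) option before the transfer in step 3.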
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> --jens
>>>>>>>>
>>>>>>>>
>>>>>>>> On 17/12/2015 12:43, Lydia Heck wrote:
>>>>>>>>>
>>>>>>>>> Hi Jens,
>>>>>>>>>
>>>>>>>>> it took longer than I thought to tidy up the results from the
>>>>>>>>> meeting last week (I spent a full day on a spreadsheet :-) )
>>>>>>>>>
>>>>>>>>> However I am now going to look at the transfers again.
>>>>>>>>>
>>>>>>>>> I looked over the presentation you shared with us. And yes, that is
>>>>>>>>> the way it should go. There are some provisos:
>>>>>>>>>
>>>>>>>>> If I create 3 TB chunks, I need to have space for several of them:
>>>>>>>>> one being transferred, one in waiting and one being prepared. This
>>>>>>>>> will add 10 TB to the storage that is not available for the users;
>>>>>>>>> it can be done, but needs to be factored in.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If there is indeed a failure, then I need to identify where the data
>>>>>>>>> are that have been deleted, corrupted or whatever. If I "just" chunk
>>>>>>>>> the whole filesystem, that would be difficult, if not impossible, to
>>>>>>>>> find. So I would need to arrange transfers by project, and even then
>>>>>>>>> the retrieval might physically not be possible, depending on how
>>>>>>>>> many of the chunks I would have to retrieve.
>>>>>>>>>
>>>>>>>>> I believe that currently the biggest top folder is ~500 TB.
>>>>>>>>>
>>>>>>>>> There would not be lots of jobs running, simply because there is not
>>>>>>>>> enough space to chunk that much.
>>>>>>>>>
>>>>>>>>> On the storage that I would like to archive there are more than 64M
>>>>>>>>> files.
>>>>>>>>>
>>>>>>>>> So would a "flat" chunking tar of all the filesystem be a "good"
>>>>>>>>> idea? I am not sure.
>>>>>>>>>
>>>>>>>>> I need to think about this a bit more.
>>>>>>>>>
>>>>>>>>> Lydia
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, 10 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>
>>>>>>>>>> Hi Lydia,
>>>>>>>>>>
>>>>>>>>>> That's great. I am actually on leave tomorrow (travelling) and out
>>>>>>>>>> Monday (at Royal Holloway) but the others on the list can follow up.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> --jens
>>>>>>>>>>
>>>>>>>>>> On 10/12/2015 10:21, Lydia Heck wrote:
>>>>>>>>>>>
>>>>>>>>>>> Dear all,
>>>>>>>>>>>
>>>>>>>>>>> sorry for my silence. I had a meeting in London on Tuesday and
>>>>>>>>>>> attended CIUK yesterday. I am just back and have to tidy up some
>>>>>>>>>>> spreadsheets from Tuesday's meeting, and I will be busy today as
>>>>>>>>>>> well with local tasks. So I should get back to this tomorrow.
>>>>>>>>>>>
>>>>>>>>>>> Best wishes,
>>>>>>>>>>> Lydia
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 9 Dec 2015, Jensen, Jens (STFC,RAL,SC) wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the proposal for the third option. It would also be worth
>>>>>>>>>>>> looking into. It is written in python AFAIK.
>>>>>>>>>>>>
>>>>>>>>>>>> Overall we are trying to deploy something that meets the
>>>>>>>>>>>> requirements and saves us time in the long run.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> --jens
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>