It's very strange that there is nothing in the srmv1 logs. This is your
head node, right? I can telnet to port 8443, but can you check that the
srmv1 server is running? In fact, just restart it, and do the same for
the gridftp server.
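
Something along these lines should do it. This is only a sketch: the init script names (srmv1, dpm-gsiftp) are assumptions taken from the log directory names, and restart_daemon is just a hypothetical helper, not part of DPM.

```shell
# restart_daemon NAME
# Reports whether the init script thinks NAME is running, then restarts it.
# NAME is assumed to match an init script known to `service`.
restart_daemon() {
    name="$1"
    if service "$name" status >/dev/null 2>&1; then
        echo "$name: was running"
    else
        echo "$name: was NOT running"
    fi
    # Restart regardless of the status reported above.
    service "$name" restart
}
```

Run `restart_daemon srmv1` as root on the head node and `restart_daemon dpm-gsiftp` wherever the gridftp server lives, then look at the logs again.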
Greig
On 03/01/08 13:22, Simon George wrote:
> This is what I found in the logs on se1.
>
> /var/log/dpm/log
>
> roughly every minute, this is repeated:
>
> 01/03 13:12:15 3547,24 dpm_srv_getpoolfs: DP092 - getpoolfs request by [log in to unmask] (0,0) from se1.pp.rhul.ac.uk
> 01/03 13:12:15 3547,24 dpm_srv_getpoolfs: returns 0
>
> /var/log/srmv1/log
> is empty
>
> /var/log/dpns/log
> has entries for my recent dpns-ls and that's it.
>
> On gridraid2:
> /var/log/dpm-gsiftp/dpm-gsiftp.log is empty.
>
> Anywhere else I should look?
>
> Cheers,
> Simon
>
> Greig Alan Cowan wrote:
>> Something still isn't right, I can't copy a file into your DPM. What
>> are the /var/log/dpm, dpns and srmv1 logs saying?
>>
>> Greig
>>
>> On 03/01/08 12:59, Simon George wrote:
>>> Yes, I have disabled the pool with the expired cert.
>>>
>>> On the head node (se1) I can do:
>>> > dpns-ls /dpm
>>> pp.rhul.ac.uk
>>>
>>> dpm-qryconf and dpm-modifyfs also work.
>>>
>>> Greig Alan Cowan wrote:
>>>> Hi Simon,
>>>>
>>>> Have you made the change yet? Something still isn't right. Are you
>>>> sure that everything is OK on the DPM head node?
>>>>
>>>> Can you run commands like
>>>>
>>>> dpns-ls /dpm
>>>>
>>>> on it as the root user?
>>>>
>>>> Cheers,
>>>> Greig
>>>>
>>>> On 03/01/08 12:14, Simon George wrote:
>>>>> btw the pool node with the out of date cert is gridraid3 which is
>>>>> currently read-only. So the one used by the SAM test should be
>>>>> gridraid2 which does have an up-to-date cert.
>>>>>
>>>>> Do you think this could still cause the error?
>>>>>
>>>>> I wonder if I should completely disable gridraid3 until the cert is
>>>>> fixed.
>>>>>
>>>>> Greig Alan Cowan wrote:
>>>>>> Hi Simon,
>>>>>>
>>>>>> Best thing to do is re-run the YAIM configuration step.
>>>>>>
>>>>>> /opt/glite/yaim/bin/yaim -c -s /path/to/site-info.def -n SE_dpm_mysql
>>>>>>
>>>>>> That should do the trick and make sure all the certificates are in
>>>>>> the right place with the right permissions etc.
>>>>>>
>>>>>> You should also think about upgrading to the latest version of DPM
>>>>>> (1.6.7). More about that in a forthcoming email...
>>>>>>
>>>>>> Cheers,
>>>>>> Greig
>>>>>>
>>>>>> On 03/01/08 11:49, Simon George wrote:
>>>>>>> Hi Greig,
>>>>>>>
>>>>>>> thanks for your quick reply.
>>>>>>> I've checked, all the CRLs are up to date. The host certificates
>>>>>>> are up to date as in:
>>>>>>>
>>>>>>> openssl x509 -in /etc/grid-security/hostcert.pem -dates -text|head -2
>>>>>>> notBefore=Oct 5 11:25:32 2007 GMT
>>>>>>> notAfter=Nov 3 11:25:32 2008 GMT
>>>>>>>
>>>>>>> But I've noticed that on the pool node, this certificate is not
>>>>>>> propagated to /etc/grid-security/dpmmgr/dpmcert.pem
>>>>>>> (nor the corresponding key file). The files there are old and
>>>>>>> have expired.
>>>>>>>
>>>>>>> Is it just a case of copying these new certificate files to
>>>>>>> dpmmgr/dpmcert.* or do I need to do something else too?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Simon
>>>>>>>
>>>>>>>
>>>>>>> Greig Alan Cowan wrote:
>>>>>>>> Hi Simon,
>>>>>>>>
>>>>>>>> It looks like a security issue. Are the certificates and
>>>>>>>> CRLs up to date?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Greig
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 03/01/08 10:44, Simon George wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have the problem described here:
>>>>>>>>> http://www.gridpp.ac.uk/wiki/Random_DPM_errors_in_SAM#Error_reading_token_data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Does anyone know the solution to this?
>>>>>>>>>
>>>>>>>>> For example:
>>>>>>>>> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=TestResult&nodename=ce1.pp.rhul.ac.uk&vo=ops&testname=CE-sft-lcg-rm-cr&testtimestamp=1199352207
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Simon