Adeel,
Failures in the replica management tests, when not caused by local problems,
are generally caused by top-level BDII problems. There clearly seems to be
one in your case.
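As a first check, it may help to confirm which BDII the replica management tools on the worker nodes are actually querying. Below is a minimal sketch, assuming your site points lcg-utils at its BDII through the LCG_GFAL_INFOSYS variable and that it holds a single host:port value. The helper name is purely illustrative, and the fallback port is simply the 2170 used in your query.

```shell
# Split LCG_GFAL_INFOSYS ("host:port") into its two parts.
# Prints "host port"; returns 1 if the variable is empty or unset.
# Assumption: the variable holds one entry, not a comma-separated list.
infosys_host_port() {
    infosys=${LCG_GFAL_INFOSYS:-}
    [ -n "$infosys" ] || return 1
    host=${infosys%%:*}              # everything before the first ':'
    case $infosys in
        *:*) port=${infosys##*:} ;;  # everything after the last ':'
        *)   port=2170 ;;            # assumed default, as in the quoted query
    esac
    printf '%s %s\n' "$host" "$port"
}
```

If the function returns nothing on a worker node, or prints a host other than the BDII you expect, that would be the first thing to look at.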
Michel
--On Friday 30 March 2007 11:27 +0500 Adeel-ur-Rehman <[log in to unmask]>
wrote:
> Hi All,
>
> We have been facing the following critical problems at our site for a long time:
>
> 1) Most of the jobs running at our site fail while performing the Replica
> Management Tests. The error returned is:
>
> LFC endpoint not found
> LFC endpoint not found
> lcg_cr: Invalid argument
>
> I found some help on the web suggesting that, to resolve this error, one
> must set the LFC_HOST variable (e.g., export
> LFC_HOST=prod-lfc-shared-central.cern.ch), but we are not using any LFC at
> our site.
>
> Any idea about this issue?
>
>
>
> 2) Related to our top level BDII:
>
> pcncp24.ncp.edu.pk: could not be queried
> check for missing attributes in bdii:
> GlueSEUniqueID: lxn1183.cern.ch
> GlueSEName: CERN-PROD:disk
> GlueSARoot: ops:ops
>
> The recommended query for testing it is:
>
> ldapsearch -xLLL -l 15 -h bdiihostname -p 2170 \
>   -b 'GlueSEUniqueID=lxn1183.cern.ch,mds-vo-name=CERN-PROD,mds-vo-name=local,o=grid' \
>   '(|(GlueSEUniqueID=lxn1183.cern.ch)(objectclass=GlueSA))' \
>   GlueSEUniqueID GlueSEName GlueSARoot
>
>
> Should I run the above query as it is? I attempted it several times,
> but it returns: ldap_bind: Can't contact LDAP server
>
> If I put our BDII host name in place of bdiihostname, I get:
>
> No such object (32)
> Matched DN: mds-vo-name=local,o=grid
>
>
> Are we missing something in it?
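On the second error: "No such object (32)" with "Matched DN: mds-vo-name=local,o=grid" is how an LDAP server reports that the search base does not exist and that the deepest entry it does have on that path is mds-vo-name=local,o=grid. So the BDII answered, but it has no mds-vo-name=CERN-PROD entry below that point. One thing that might be worth trying is to search from the matched DN itself and let the filter find the SE. This is a rough sketch that only prints the command so it can be reviewed first. The function name is made up for illustration, and the flags are the ones from the query quoted above.

```shell
# Print (do not run) an ldapsearch command that looks for one SE anywhere
# under the top-level BDII root, instead of under a site-specific base DN.
# Arguments are placeholders supplied by the caller:
#   $1 = BDII host name, $2 = SE host name
bdii_se_query() {
    printf "ldapsearch -xLLL -l 15 -h %s -p 2170 -b 'mds-vo-name=local,o=grid' '(GlueSEUniqueID=%s)' GlueSEUniqueID GlueSEName\n" "$1" "$2"
}

# Example (host names taken from the message above):
# bdii_se_query pcncp24.ncp.edu.pk lxn1183.cern.ch
```

If the printed command, once run, returns no entry either, the SE is probably not published in that BDII at all, which would fit the "check for missing attributes" message.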
>
>
>
> 3) Nowadays, almost all of our jobs run on the same Worker Node,
> although all 14 of our Worker Nodes have the same hardware specs and
> none of them has disk usage problems. Moreover, the "pbsnodes -a" command
> reports the status of every other node as "free".
>
> But sometimes we find that the jobs running on that single Worker Node
> consume almost 100% of its CPU.
>
> If we power off that WN, almost no jobs come into the queue for
> execution; any job that does reach the queue stays in the wait state.
>
> As far as we can tell, there are no differences between the
> configuration files on that WN and those on any other WN at our site.
>
>
>
> Any solutions are welcome!!!
>
> Thanks in advance.
>
>
>
> Regards,
>
> Adeel
>
> Adeel-ur-Rehman
> Scientific Officer,
> Advanced Scientific Computing
> National Centre for Physics,
> Quaid-i-Azam University Campus,
> Islamabad.
> Email: [log in to unmask] <mailto:[log in to unmask]>
> Tel: (+92-51) 2601018
> Fax: (+92-51) 9205753
*************************************************************
* Michel Jouvin Email : [log in to unmask] *
* LAL / CNRS Tel : +33 1 64468932 *
* B.P. 34 Fax : +33 1 69079404 *
* 91898 Orsay Cedex *
* France *
*************************************************************