Hi Condor folks!
I.) I think that the YAIM configuration of the CE
erroneously configures the condor/lcgcondor jobmanagers.
The file /opt/globus/etc/grid-services/jobmanager-condor/lcgcondor
should contain the following line (a single line in the file, wrapped here):
stderr_log,local_cred - /opt/globus/libexec/globus-job-manager
globus-job-manager -conf /opt/globus/etc/globus-job-manager.conf -type
lcgcondor -rdn jobmanager-lcgcondor -machine-type unknown -publish-jobs
-condor-arch intel -condor-os linux
with the
-condor-arch intel -condor-os linux
parameters included, which is not the case after YAIM configuration.
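A quick way to check a service line for the two parameters is sketched below in Python. This is only an illustration: the helper name missing_condor_flags is made up, and the "YAIM" line is assumed to be identical to the line above minus the two -condor-* parameters.

```python
# Flags the Condor jobmanager needs on its service line (taken from the report above).
REQUIRED_FLAGS = ("-condor-arch", "-condor-os")

def missing_condor_flags(service_line):
    """Return the required flags that do not appear in the service line."""
    tokens = service_line.split()
    return [flag for flag in REQUIRED_FLAGS if flag not in tokens]

# Assumed content after YAIM configuration: same line, without the two parameters.
yaim_line = (
    "stderr_log,local_cred - /opt/globus/libexec/globus-job-manager "
    "globus-job-manager -conf /opt/globus/etc/globus-job-manager.conf "
    "-type lcgcondor -rdn jobmanager-lcgcondor -machine-type unknown -publish-jobs"
)
fixed_line = yaim_line + " -condor-arch intel -condor-os linux"

print(missing_condor_flags(yaim_line))   # both flags reported as missing
print(missing_condor_flags(fixed_line))  # nothing missing
```

To check a real CE, the line would be read from the grid-services file instead of being hard-coded.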
The log (/home/dteam007/gram_job_mgr_18600.log) says:
4/8 02:03:00 JMI: Condor_arch must be specified when jobmanager type is
condor
4/8 02:03:00 Job Manager State Machine (entering):
GLOBUS_GRAM_JOB_MANAGER_STATE_FAILED_DONE
4/8 02:03:00 JM: in globus_gram_job_manager_reporting_file_remove()
4/8 02:03:00 JM: exiting globus_gram_job_manager.
II.) It seems that jobmanager-condor requires a shared home area or an
independent mechanism
for input/output file transfer. The StarterLog on the WN says:
4/8 02:33:38 Using config file: /opt/condor-6.6.8/etc/condor_config
4/8 02:33:38 Using local config files:
/opt/condor-6.6.8/local.grid101/condor_config.local
4/8 02:33:38 DaemonCore: Command Socket at <148.6.8.101:33483>
4/8 02:33:38 Done setting resource limits
4/8 02:33:38 Starter communicating with condor_shadow <148.6.8.109:50035>
4/8 02:33:38 Submitting machine is "grid109.kfki.hu"
4/8 02:33:38 Starting a VANILLA universe job with ID: 3942.0
4/8 02:33:38 IWD: /home/dteam007/gram_scratch_ZrviDLz3OQ
4/8 02:33:38 Failed to open standard output file
'/home/dteam007/.globus/.gass_cache/local/md5/6b/5bfeb072220a27938ea5bf0703914f/md5/7a/34aae1d0e17a956edf1cc8480426$
4/8 02:33:38 Output file:
/home/dteam007/.globus/.gass_cache/local/md5/6b/5bfeb072220a27938ea5bf0703914f/md5/7a/34aae1d0e17a956edf1cc8480426ba/data
4/8 02:33:38 Failed to open standard error file
'/home/dteam007/.globus/.gass_cache/local/md5/6b/5bfeb072220a27938ea5bf0703914f/md5/15/2bdc0773ab199c4af5268543171ce$
4/8 02:33:38 Error file:
/home/dteam007/.globus/.gass_cache/local/md5/6b/5bfeb072220a27938ea5bf0703914f/md5/15/2bdc0773ab199c4af5268543171ce1/data
4/8 02:33:38 Failed to open some/all of the std files...
4/8 02:33:38 Aborting OsProc::StartJob.
4/8 02:33:38 Failed to start job, exiting
4/8 02:33:38 ShutdownFast all jobs.
Changing to jobmanager-lcgcondor works fine without any sharing.
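To spot this failure on other worker nodes, the StarterLog can be scanned for the "Failed to open" lines. The Python sketch below is only an illustration: the function name failed_std_files is made up, and the pattern assumes the log format matches the excerpt above.

```python
import re

# Matches StarterLog lines such as:
#   4/8 02:33:38 Failed to open standard output file
FAILED_OPEN = re.compile(
    r"^(\d+/\d+ \d\d:\d\d:\d\d) Failed to open (standard \w+) file"
)

def failed_std_files(log_lines):
    """Return (timestamp, stream) pairs for each 'Failed to open standard ... file' line."""
    hits = []
    for line in log_lines:
        match = FAILED_OPEN.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits

# A few lines copied from the StarterLog excerpt above.
sample = [
    "4/8 02:33:38 IWD: /home/dteam007/gram_scratch_ZrviDLz3OQ",
    "4/8 02:33:38 Failed to open standard output file",
    "4/8 02:33:38 Failed to open standard error file",
    "4/8 02:33:38 Failed to open some/all of the std files...",
]
print(failed_std_files(sample))
```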
regards,
Gergely