Hi All,
I'm now sure there's something very strange going on with Java and the
latest voms-clients3 on large machines. I get the following error for
ATLAS pilot jobs, as I reported before:
# There is insufficient memory for the Java Runtime Environment to continue.
# pthread_getattr_np
# An error report file with more information is saved as:
# /home/pilatl05/home_cream_885360610/CREAM885360610/condorg_sLsToCRa/pilot3/Panda_Pilot_26958_1378821926/hs_err_pid8185.log
Checking one of these log files shows:
#
# There is insufficient memory for the Java Runtime Environment to continue.
# pthread_getattr_np
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux_x86.cpp:715), pid=47903, tid=140711218259712
#
# JRE version: 6.0_24-b24
# Java VM: OpenJDK 64-Bit Server VM (20.0-b12 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea6 1.11.11.90
# Distribution: Scientific Linux release 6.5 rolling (Carbon), package rhel-1.62.1.11.11.90.el6_4-x86_64
--------------- T H R E A D ---------------
Current thread (0x00007ffa340da000): WatcherThread [stack: 0x0000000000000000,0x0000000000000000] [id=47947]
Stack: [0x0000000000000000,0x0000000000000000], sp=0x00007ff9e22e16b0, free space=137413299077k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x79e234]
What I'm interested in is the fact that it's claiming it's running in
32-bit mode when it's on a 64-bit OS with bags of memory to spare. I can't
seem to replicate this as root by running voms-proxy-info on an ATLAS
proxy file, so I guess that by the time it gets to the end of the job, it
has already claimed too much memory. The machine itself seems OK:
[root@epgf04 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         96712      71137      25575          0        191      42883
-/+ buffers/cache:      28061      68650
Swap:        98302          0      98302
I can also confirm I'm running the latest voms-clients3 with the -Xmx16m flag set:
[root@epgf04 ~]# more /usr/bin/voms-proxy-info
.....
# JVM options
VOMS_CLIENTS_JAVA_OPTIONS=${VOMS_CLIENTS_JAVA_OPTIONS:-"-Xmx16m"}
java $VOMS_CLIENTS_JAVA_OPTIONS -cp $VOMSCLIENTS_CP $VOMSPROXYINFO_CLASS "$@"
[root@epgf04 ~]#
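Going by the wrapper line above, the built-in "-Xmx16m" is only a default: anything exported in VOMS_CLIENTS_JAVA_OPTIONS should replace it. The sketch below only demonstrates that shell expansion and does not run Java or voms-proxy-info. The extra flag in it, -XX:+UseSerialGC, is a standard HotSpot option that makes the JVM use a single-threaded collector instead of starting GC threads in proportion to the core count. Whether it has any effect on this particular failure is purely an assumption on my part and is untested here.

```shell
#!/bin/sh
# Same default-expansion pattern as the line in /usr/bin/voms-proxy-info:
# "-Xmx16m" is used only when the variable is unset or empty.
resolve_opts() {
    opts=${VOMS_CLIENTS_JAVA_OPTIONS:-"-Xmx16m"}
    printf '%s\n' "$opts"
}

# Case 1: nothing exported, so the wrapper's default applies.
unset VOMS_CLIENTS_JAVA_OPTIONS
default_opts=$(resolve_opts)

# Case 2: hypothetical override for testing. The extra flag is an
# assumption, not something known to fix the error reported here.
VOMS_CLIENTS_JAVA_OPTIONS="-Xmx16m -XX:+UseSerialGC"
export VOMS_CLIENTS_JAVA_OPTIONS
override_opts=$(resolve_opts)

printf 'default:  %s\n' "$default_opts"
printf 'override: %s\n' "$override_opts"
```

If the override is picked up, exporting the variable in the pilot user's environment before voms-proxy-info is called would be enough to try it, without editing the wrapper script itself.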
All the machines I've seen this failure on are the Dell C6145s with 48
cores per board, 100GB memory and 2TB HD space.
Has anyone got any ideas, or managed to get this running on C6145s?
Thanks!
Mark