2:25:34 AM PDT - Wed, May 15th 2013
I tried out your suggestions.
Setting ARMCI_DEFAULT_SHMMAX lower than 4096 made the jobs crash immediately after submission.
The output file contains:
"rank:23 hostname:f37 pid:22579):ARMCI DASSERT fail. ../../ga-5-1/armci/src/devices/openib/openib.c:armci_pin_contig_hndl():1142 cond:(memhdl->memhndl!=((void *)0))"
....
and in the SGE error file:
Last System Error Message from Task 16:: Cannot allocate memory
Last System Error Message from Task 20:: Cannot allocate memory
...
Something is wrong with memory, and I guess it's a system setting, but I have no idea what else to change beyond the settings mentioned above.
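In case it helps to narrow this down, here is a minimal sketch for printing the memory limits a job actually sees on a compute node. It is not part of NWChem or Global Arrays. The helper names (`describe_limit`, `read_kernel_shmmax`, `report`) are made up for illustration. It rests on two assumptions: that ARMCI_DEFAULT_SHMMAX is given in MB, and that the failed pin in openib.c may be related to the locked-memory limit (`ulimit -l`). The limits inside an SGE job can differ from those in a login shell, so it would have to run from within a submitted job.

```python
import os
import resource


def describe_limit(value):
    """Render an rlimit value in MB; RLIM_INFINITY means no limit is enforced."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.0f} MB"


def read_kernel_shmmax(path="/proc/sys/kernel/shmmax"):
    """Return kernel.shmmax in bytes, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def report():
    """Collect the limits that may matter for ARMCI memory registration."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    lines = [f"memlock soft={describe_limit(soft)} hard={describe_limit(hard)}"]
    shmmax = read_kernel_shmmax()
    if shmmax is not None:
        lines.append(f"kernel.shmmax={shmmax / 2**20:.0f} MB")
    # Assumption: this variable is interpreted in MB by ARMCI.
    lines.append(f"ARMCI_DEFAULT_SHMMAX={os.environ.get('ARMCI_DEFAULT_SHMMAX')}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

If the memlock value printed inside the job is small (for example 64 KB shows up as "0 MB"), that would be one candidate for the system setting in question, but this is only a guess.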
Ursula