Endless TCGMSG run


  • Guest -
Hi, I run NWChem 6.0 on Linux IA32.
It runs on each of the two machines alone, and in parallel with the machine itself as the only host. But when I try to run it from one machine across both, I get no errors of any kind and NWChem starts fine: it shows up in the process tables on both machines, but it never stops. The jobs I try are very fast, less than a minute on a single machine, yet they continue forever in parallel. I use TCGMSG. This is the output I get:

bash-3.1$ parallel nwchem etoh_time.nw
tmp = /home/user/pdir/nwchem.p
Creating: host=pc-008, user=user,
file=/home/user/nwchem-6.0/bin/LINUX/nwchem, port=51876
/home/user/nwchem-6.0/bin/LINUX/nwchem, len=38
etoh_time.nw, len=12
  -master, len=7
pc-008.initlab.org, len=18
    51876, len=5
2, len=1
4, len=1
0, len=1
0, len=1
Creating: host=lab100-pc012, user=user,
file=/home/user/nwchem-6.0/bin/LINUX/nwchem, port=44494
/home/user/nwchem-6.0/bin/LINUX/nwchem, len=38
etoh_time.nw, len=12
  -master, len=7
pc-008, len=6
44494, len=5
2, len=1
4, len=1
1, len=1
2, len=1
argument 1 = etoh_time.nw
argument 2 = -master
argument 3 = pc-008.initlab.org
argument 4 = 51876
argument 5 = 2
argument 6 = 4
argument 7 = 0
argument 8 = 0
ARMCI configured for 2 cluster nodes. Network protocol is 'TCP/IP Sockets'.

And it doesn't end. It just keeps running forever, no matter how many times I try.

I can even log in with ssh from one machine to the other and run it there, or run it in parallel on a single machine (the one I'm running from), but that's it.
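
One thing I notice in the output above is that the master host is passed as pc-008.initlab.org to the first process and as plain pc-008 to the second. I don't know whether that matters, but here is a minimal Python sketch for checking, on each machine, whether those names resolve and agree. The function names resolve_all and same_address are my own and are not part of NWChem or TCGMSG; it only tests name resolution, not whether the ports are reachable.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = None
    return results


def same_address(results, a, b):
    """True only if both names resolved, and to the same address."""
    addr_a = results.get(a)
    addr_b = results.get(b)
    return addr_a is not None and addr_a == addr_b
```

Calling resolve_all(["pc-008", "pc-008.initlab.org", "lab100-pc012"]) on both machines and comparing the results would show whether the short and full names point to the same address everywhere, or whether one of them fails to resolve on one side.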