11:17:37 AM PDT - Fri, Jul 10th 2015
Tried with ARMCI = MPI_TS, got a different error...
First, to answer the question above: running hostname on any node in my cluster returns 'node#', where # is the integer node number, so my nodes are named node1, node2, node3, ... The /etc/hosts file on every node also lists 'Node#' as an alternate name/alias for each node.
I can run 'ssh node# <cmd>' from any node to any other, using either the lowercase or the capitalized version of the name.
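In case it helps anyone double-check the same naming setup, here is a minimal sketch of how the alias check could be automated. It assumes the usual hosts-file layout (an address followed by one or more names per line); the addresses in SAMPLE and the helper names parse_hosts and nodes_missing_alias are made up for illustration and are not taken from my cluster.

```python
import re


def parse_hosts(text):
    """Map each IP address to the list of names on its line (comments ignored)."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing or full-line comments
        if not line:
            continue
        ip, *names = line.split()
        entries.setdefault(ip, []).extend(names)
    return entries


def nodes_missing_alias(text):
    """Return lowercase node names (e.g. 'node3') that lack a capitalized alias."""
    missing = []
    for names in parse_hosts(text).values():
        for name in names:
            if re.fullmatch(r"node\d+", name) and name.capitalize() not in names:
                missing.append(name)
    return missing


# Hypothetical hosts-file content, for illustration only.
SAMPLE = """\
# hypothetical addresses
192.168.0.1  node1  Node1
192.168.0.2  node2  Node2
192.168.0.3  node3
"""

if __name__ == "__main__":
    print(nodes_missing_alias(SAMPLE))
```

To run it against a real node, pass the contents of /etc/hosts instead of SAMPLE, e.g. nodes_missing_alias(open("/etc/hosts").read()). An empty list means every node# entry also carries its Node# alias.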
Here's the error I got using MPI_TS as my compilation choice. This time it did run properly on a single node.
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:1359: comex_init: Assertion `0 == status' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:1359: comex_init: Assertion `0 == status' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:197: _mq_test: Assertion `0 == rc' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:197: _mq_test: Assertion `0 == rc' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:197: _mq_test: Assertion `0 == rc' failed.
nwchem: ../../ga-5-3/comex/src-mpi/comex.c:197: _mq_test: Assertion `0 == rc' failed.
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
- 0 0x7FF702970777
- 0 0x7FF1EA30D777
- 1 0x7FF1EA30DD7E
- 1 0x7FF702970D7E
- 2 0x7FF1E9C5FD3F
- 2 0x7FF7022C2D3F
- 3 0x7FF7022C2CC9
- 3 0x7FF1E9C5FCC9
- 4 0x7FF7022C60D7
- 4 0x7FF1E9C630D7
- 5 0x7FF7022BBB85
- 5 0x7FF1E9C58B85
- 6 0x7FF7022BBC31
- 6 0x7FF1E9C58C31
- 0 0x7FC24BA6B777
- 1 0x7FC24BA6BD7E
- 2 0x7FC24B3BDD3F
- 3 0x7FC24B3BDCC9
- 4 0x7FC24B3C10D7
- 5 0x7FC24B3B6B85
- 6 0x7FC24B3B6C31
- 0 0x7F77D791F777
- 1 0x7F77D791FD7E
- 2 0x7F77D7271D3F
- 3 0x7F77D7271CC9
- 4 0x7F77D72750D7
- 5 0x7F77D726AB85
- 6 0x7F77D726AC31
- 0 0x7F829398B777
- 1 0x7F829398BD7E
- 2 0x7F82932DDD3F
- 3 0x7F82932DDCC9
- 4 0x7F82932E10D7
- 5 0x7F82932D6B85
- 6 0x7F82932D6C31
- 0 0x7FEB98228777
- 1 0x7FEB98228D7E
- 2 0x7FEB97B7AD3F
- 3 0x7FEB97B7ACC9
- 4 0x7FEB97B7E0D7
- 5 0x7FEB97B73B85
- 6 0x7FEB97B73C31
- 7 0x4B71A07 in _mq_test at comex.c:197
- 8 0x4B73154 in comex_barrier at comex.c:1208
- 9 0x4B735CF in comex_init at comex.c:1395
- 10 0x4B7369F in comex_init_args at comex.c:1411
- 11 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 12 0x4B3A42A in install_nxtval
- 13 0x4B3A1CD in tcgi_alt_pbegin
- 14 0x4B3A235 in tcgi_pbegin
- 15 0x4B38F1B in pbeginf_
- 16 0x54551D in nwchem at nwchem.F:84
- 7 0x4B73622 in comex_init at comex.c:1359 (discriminator 1)
- 8 0x4B7369F in comex_init_args at comex.c:1411
- 9 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 7 0x4B71A07 in _mq_test at comex.c:197
- 8 0x4B73154 in comex_barrier at comex.c:1208
- 9 0x4B735CF in comex_init at comex.c:1395
- 10 0x4B7369F in comex_init_args at comex.c:1411
- 11 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 10 0x4B3A42A in install_nxtval
- 11 0x4B3A1CD in tcgi_alt_pbegin
- 12 0x4B3A42A in install_nxtval
- 12 0x4B3A235 in tcgi_pbegin
- 13 0x4B3A1CD in tcgi_alt_pbegin
- 13 0x4B38F1B in pbeginf_
- 14 0x4B3A235 in tcgi_pbegin
- 14 0x54551D in nwchem at nwchem.F:84
- 15 0x4B38F1B in pbeginf_
- 16 0x54551D in nwchem at nwchem.F:84
- 7 0x4B73622 in comex_init at comex.c:1359 (discriminator 1)
- 7 0x4B71A07 in _mq_test at comex.c:197
- 7 0x4B71A07 in _mq_test at comex.c:197
- 8 0x4B73154 in comex_barrier at comex.c:1208
- 8 0x4B7369F in comex_init_args at comex.c:1411
- 9 0x4B735CF in comex_init at comex.c:1395
- 8 0x4B73154 in comex_barrier at comex.c:1208
- 10 0x4B7369F in comex_init_args at comex.c:1411
- 9 0x4B735CF in comex_init at comex.c:1395
- 10 0x4B7369F in comex_init_args at comex.c:1411
- 11 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 9 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 11 0x4B6E7E5 in PARMCI_Init_args at armci.c:178
- 12 0x4B3A42A in install_nxtval
- 12 0x4B3A42A in install_nxtval
- 10 0x4B3A42A in install_nxtval
- 13 0x4B3A1CD in tcgi_alt_pbegin
- 13 0x4B3A1CD in tcgi_alt_pbegin
- 11 0x4B3A1CD in tcgi_alt_pbegin
- 14 0x4B3A235 in tcgi_pbegin
- 12 0x4B3A235 in tcgi_pbegin
- 14 0x4B3A235 in tcgi_pbegin
- 13 0x4B38F1B in pbeginf_
- 15 0x4B38F1B in pbeginf_
- 15 0x4B38F1B in pbeginf_
- 14 0x54551D in nwchem at nwchem.F:84
- 16 0x54551D in nwchem at nwchem.F:84
- 16 0x54551D in nwchem at nwchem.F:84