Running NWChem in parallel


Gets Around
Hi,

I'm trying to run NWChem in parallel but am having trouble distributing the job across nodes. On my HPC cluster, 1 node = 16 threads.
Using the command
mpirun -np 16 nwchem abc.nw
I can run NWChem comfortably with 16 threads on a single node. To use more nodes, I tried
mpirun -np 32 nwchem abc.nw
(for 2 nodes). When I checked each node manually, I found that only one node was in use and the other was idle.

Is there another way to specify how the job is distributed across nodes?

Forum Regular
Are your CPUs configured for hyper threading?
Was your MPI implementation configured with the correct settings for the scheduler that you are using?

Gets Around
Quote:Sean Jul 25th 11:29 am
Are your CPUs configured for hyper threading?
Was your MPI implementation configured with the correct settings for the scheduler that you are using?

Hi,
Yes to both, I think, since I have successfully run some other programs using the same command.
I spoke with the HPC admins, and they want to know whether NWChem can distribute jobs across 2 or more nodes with that command (1 node = 16 threads here). So far, that command only distributes the job to 16 threads.
For 2 nodes I tried the following, but it failed:
 mpirun -np 16 -bynode nwchem abc.nw

Forum Regular
It is my understanding that NWChem is not responsible for distributing the MPI processes among the nodes. This should be handled by the MPI that you are using (and the job scheduler since the job scheduler should be controlling available resources). You should be able to use whatever options you want for mpirun, and NWChem shouldn't care.
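As a rough illustration of letting MPI do the placement, here is a minimal sketch that writes a host list and passes it to mpirun. It assumes Open MPI, whose hostfile format is `hostname slots=N` and which accepts `--hostfile`; the node names `node01` and `node02` and the helper `make_hostfile` are hypothetical. Under a job scheduler the node list normally comes from the scheduler, so this may not be needed at all.

```shell
# Hypothetical helper: print one Open MPI style hostfile line per node.
# Usage: make_hostfile SLOTS_PER_NODE NODE [NODE ...]
make_hostfile() {
    slots="$1"
    shift
    for node in "$@"; do
        echo "$node slots=$slots"
    done
}

# Example (node names are placeholders, not from this thread):
#   make_hostfile 16 node01 node02 > hosts.txt
#   mpirun -np 32 --hostfile hosts.txt nwchem abc.nw
```

Other MPI implementations use different option names for the host list, so check the mpirun man page for the one installed on your cluster.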

This sounds to me like a problem between MPI and the job scheduler/resource manager. However, I could be wrong, and it is curious if you are saying that

mpirun -np 32 someother-program input.file

results in that program running 16 MPI processes each on 2 nodes, whereas

mpirun -np 32 nwchem abc.nw

results in 32 MPI processes running on one node (with the other node idle). I'm afraid I can't be of further help at this point.
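One way to compare the two cases without involving NWChem at all is to launch `hostname` under the same mpirun command and count how many processes land on each node. The sketch below is only that, a sketch; `count_ranks_per_host` is a hypothetical helper name, and it simply tallies the hostnames it reads on standard input.

```shell
# Hypothetical helper: read one hostname per line on stdin and
# print "<hostname> <number of processes>" for each distinct host.
count_ranks_per_host() {
    sort | uniq -c | awk '{print $2, $1}'
}

# Example, using the same launch line as the real job:
#   mpirun -np 32 hostname | count_ranks_per_host
```

If this reports all 32 processes on one host, the placement problem lies with mpirun or the scheduler rather than with NWChem.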

Gets Around
I'm getting the following error when trying to run on multiple nodes:
/bin/sh: module: line 1: syntax error: unexpected end of file
/bin/sh: error importing function definition for `BASH_FUNC_module'
./dimer.db: Permission denied
hdbm_open: open of ./dimer.db failed 0
rtdb_seq_open: hdbm failed to open file ./dimer.db
Last System Error Message from Task 1:: Illegal seek
Last System Error Message from Task 3:: Illegal seek
Last System Error Message from Task 4:: Illegal seek


Is it because of a path issue or something else? Any suggestions would be very helpful.
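The `./dimer.db: Permission denied` line suggests that at least one process could not create the database file in its working directory, so one thing worth checking is whether that directory is writable from every node. Below is a minimal sketch of such a check; `check_dir_writable` is a hypothetical helper, and it only tests the host it runs on, so it would have to be run on each node (for example from within the job script).

```shell
# Hypothetical helper: report whether the current host can create
# a file in the given directory (defaults to the current directory).
check_dir_writable() {
    dir="${1:-.}"
    probe="$dir/.write_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "$(uname -n): $dir is writable"
    else
        echo "$(uname -n): $dir is NOT writable"
        return 1
    fi
}

# Example: check_dir_writable .
```

If the directory turns out not to be writable or not shared across nodes, NWChem's `permanent_dir` and `scratch_dir` input directives can point the run at a location that is.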

