Running NWChem on 2 nodes takes more time than a single node


Is the calculation demanding enough that you would expect it to make efficient use of the assigned hardware? Really small calculations show poor or even negative scaling at 12 cores, even when all cores are on the same motherboard.
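To illustrate why a small job can get slower with more cores, here is a toy model, not anything NWChem-specific. It assumes the wall time is a fixed serial part, a parallel part divided by the core count, and an overhead that grows with each extra core. The function name and all three default numbers are made up for illustration.

```python
def modeled_time(cores, serial_s=1.0, parallel_s=10.0, overhead_s=0.5):
    """Toy wall-time model (all defaults are illustrative assumptions).

    serial_s   : work that cannot be parallelized, in seconds
    parallel_s : work that divides evenly across cores, in seconds
    overhead_s : assumed communication cost added per extra core, in seconds
    """
    return serial_s + parallel_s / cores + overhead_s * (cores - 1)


if __name__ == "__main__":
    # Print the modeled wall time for a few core counts.
    for cores in (1, 2, 4, 6, 12, 24):
        print(f"{cores:2d} cores: {modeled_time(cores):6.2f} s")
```

With these assumed numbers the modeled time bottoms out at a handful of cores and then rises again, which is the "negative scaling" pattern. A larger calculation corresponds to a larger parallel_s, which pushes the turning point out to more cores.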

Try using tcpdump to see how many bytes and packets are transferred between the nodes during your test run. If the traffic is mostly a large number of smallish messages, I think you are fundamentally limited by network latency rather than bandwidth.
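Once you have the packet and byte totals, a rough way to read them is the usual latency-plus-bandwidth cost model: each message pays a fixed latency, and the payload pays bytes divided by bandwidth. The sketch below is only that arithmetic. The function name is hypothetical, and the defaults (50 microseconds per message, 125 MB/s, roughly gigabit Ethernet) are assumptions you should replace with measurements from your own network.

```python
def split_comm_time(n_packets, n_bytes,
                    latency_s=50e-6, bandwidth_bytes_per_s=125e6):
    """Estimate time spent on per-message latency vs. on moving bytes.

    latency_s and bandwidth_bytes_per_s are assumed values, not measurements.
    Returns (latency_part_s, bandwidth_part_s).
    """
    latency_part = n_packets * latency_s
    bandwidth_part = n_bytes / bandwidth_bytes_per_s
    return latency_part, bandwidth_part


if __name__ == "__main__":
    # Hypothetical totals: one million packets carrying 200 MB overall.
    packets, total_bytes = 1_000_000, 200_000_000
    lat, bw = split_comm_time(packets, total_bytes)
    print(f"average packet size: {total_bytes / packets:.0f} bytes")
    print(f"latency part:   {lat:.2f} s")
    print(f"bandwidth part: {bw:.2f} s")
```

If the latency part dominates, a faster link with the same latency will not help much; only fewer, larger messages or a lower-latency interconnect will.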

Apart from your scaling woes, you may wish to install version 6.6 from source. There have been a lot of bug fixes and enhancements since 6.3. The Ubuntu package also won't be linked against a high-performance BLAS.