Running NWChem on 2 nodes takes more time than a single node

I am running Ubuntu 14.04.3 Server with NWChem 6.3 installed from the repositories. Each node has a 6-core CPU, 64 GB RAM, 1 GbE, and an SSD. The wall time for a test simulation is ~50% longer on 2 nodes than on a single node; the CPU times are almost the same.
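To put that figure in terms of scaling, here is a minimal sketch of the usual speedup and parallel-efficiency arithmetic. The ratio of 1.5 is just the ~50% figure above expressed in arbitrary time units, and the function names are only for illustration.

```python
def speedup(t_single, t_parallel):
    """Speedup relative to the single-node run (values below 1 mean a slowdown)."""
    return t_single / t_parallel


def parallel_efficiency(t_single, t_parallel, n_nodes):
    """Speedup divided by the number of nodes used."""
    return speedup(t_single, t_parallel) / n_nodes


# Arbitrary units: the 2-node run takes ~50% more wall time than 1 node.
t_one_node = 1.0
t_two_nodes = 1.5

print(f"speedup on 2 nodes:    {speedup(t_one_node, t_two_nodes):.2f}")
print(f"parallel efficiency:   {parallel_efficiency(t_one_node, t_two_nodes, 2):.0%}")
```

So the second node does not merely fail to help; the run is slower than using one node alone.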

I didn't expect great scaling from a GbE network, but the performance is disappointing even on just 2 nodes. I measured (with nload) a peak rate of 160 Mbit/s, which is well below the 1000 Mbit/s limit of the onboard card, so throughput does not seem to be the problem. I also noticed that some cores are not running at 100% all the time (especially on the second node). Could the limiting factor be the high latency of Ethernet networks? What could I do to run NWChem on a slow Ethernet network?
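To illustrate why I suspect latency, here is a rough sketch using the simple "latency plus size over bandwidth" cost model for a single blocking message. The 50 microsecond latency is an assumed placeholder, not something I measured on my network, and the message sizes are hypothetical; I do not know what sizes NWChem actually sends.

```python
def transfer_time(msg_bytes, latency_s, bandwidth_bps):
    """Time for one blocking message: fixed latency plus serialization time."""
    return latency_s + (msg_bytes * 8) / bandwidth_bps


def effective_rate_mbit(msg_bytes, latency_s, bandwidth_bps):
    """Achieved rate in Mbit/s when messages are sent one after another."""
    return (msg_bytes * 8) / transfer_time(msg_bytes, latency_s, bandwidth_bps) / 1e6


LINK_BPS = 1e9        # 1 GbE line rate
LATENCY_S = 50e-6     # ASSUMED per-message latency, not measured

# Hypothetical message sizes, from small to large.
for size in (1024, 8 * 1024, 64 * 1024, 1024 * 1024):
    rate = effective_rate_mbit(size, LATENCY_S, LINK_BPS)
    print(f"{size:>8d} bytes -> {rate:7.1f} Mbit/s")
```

Under this model, small messages never get close to the line rate because the fixed latency dominates, and the cores would sit idle while waiting. That would match both the low rate seen in nload and the cores dropping below 100%, but it is only a model under the stated assumptions.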

Thanks in advance,