NWChem over Gigabit Ethernet


Forum Vet
There is a lot more communication happening in the gradient evaluation than in the SCF, so the network will be an important factor. I don't know how to reduce latency; I'm not a hardware person or a computer scientist, so you may want to ask a computer scientist in your department. You are on GigE, and there are better networks available. You are also not running the same number of processes on each node. Depending on how the data is laid out in memory (i.e. how it is distributed over the nodes), you may be creating an asymmetric communication pattern, with one node needing to do a lot more than the other.

Bert


Quote: Jul 15th 12:35 pm
Hello and thanks for your reply.

I run 6 processes on node 1 and 4 processes on node 2. Both nodes use DDR3 memory with the same bandwidth. I never observed any swapping. Furthermore, I use the "direct" directive in all runs, so I don't think disk speed can be the limiting factor.

As mentioned, scaling during SCF is very good. I ran one job on node 1 alone and the same job on both nodes. With both nodes the wall clock time decreased by a factor of 1.6. Since both nodes together have a (theoretical) total of 34.66 GHz, which is 1.65 times that of node 1 alone, this is almost perfect. However, during an optimization the total wall clock time decreases only by a factor of 1.3. I assume that this is due to the gradient evaluation and the drop in CPU usage that occurs during it.
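To put numbers on that comparison, here is a minimal sketch that turns the speedups quoted above into a parallel efficiency. The function name and the idea of treating aggregate clock speed as the ideal speedup are illustrative assumptions, not anything NWChem reports.

```python
def parallel_efficiency(observed_speedup, ideal_speedup):
    """Fraction of the ideal speedup that was actually achieved."""
    return observed_speedup / ideal_speedup

# Assumption: the ideal speedup equals the ratio of aggregate clock speeds
# (both nodes vs. node 1 alone), i.e. the 1.65 quoted in the post.
IDEAL_SPEEDUP = 1.65

for label, observed in [("SCF", 1.6), ("optimization", 1.3)]:
    eff = parallel_efficiency(observed, IDEAL_SPEEDUP)
    print(f"{label}: {eff:.0%} of ideal")
# prints:
# SCF: 97% of ideal
# optimization: 79% of ideal
```

By this rough measure the SCF step loses about 3% and the full optimization about 21% relative to ideal scaling.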

The problem also exists if I run only 1 process per node, so maybe latency is the limiting factor. A ping request usually takes about 0.08 ms from one node to the other. Do you have any experience with whether this is too long, or how else I could check the latency? Or do you have any ideas on how to reduce the impact of high latency?
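One possible way to check latency beyond ping is a small TCP ping-pong test, sketched below. This is only an illustration under stated assumptions: the function names are made up for this example, it measures the round trip of 1-byte messages over plain TCP sockets, and it says nothing about the latency NWChem's own communication layer actually sees.

```python
import socket
import threading
import time

def echo_server(listener):
    """Accept one connection and echo every byte back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        # Disable Nagle's algorithm so small messages are sent immediately.
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

def measure_rtt(host, port, rounds=1000):
    """Return (minimum, median) round-trip time in ms for 1-byte messages."""
    samples = []
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            t0 = time.perf_counter()
            s.sendall(b"x")
            if not s.recv(1):
                raise ConnectionError("echo server closed the connection")
            samples.append((time.perf_counter() - t0) * 1e3)
    samples.sort()
    return samples[0], samples[len(samples) // 2]

def loopback_demo(rounds=200):
    """Run server and client on the same machine, just to exercise the code."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()
    result = measure_rtt("127.0.0.1", port, rounds)
    server.join(timeout=5)
    listener.close()
    return result

if __name__ == "__main__":
    lo, med = loopback_demo()
    print(f"loopback RTT: min {lo:.3f} ms, median {med:.3f} ms")
```

For a real measurement, the echo server would run on one node and `measure_rtt` on the other, with that node's address and a chosen port in place of the loopback values. The result can then be compared with the roughly 0.08 ms that ping reports.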

Thanks.