NWChem suddenly started to crash with SIGSEGV.


Clicked A Few Times
Hi nwchem users,

NWChem 6.6, which had been working with no problems until a few hours ago, suddenly started to crash with SIGSEGV.
Input files that worked previously now crash.
I first suspected Open MPI, but other programs work fine, even those that require more memory.
I would appreciate it if someone has a solution.

The error messages are below.


Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Backtrace for this error:
#0  0x7F8389367467
#1  0x7F8389367AAE
#2  0x7F8387E8566F
#3  0x7F838A72C070
#0  0x7F518C359467
#0  0x7F7DB586D467
#1  0x7F518C359AAE
#1  0x7F7DB586DAAE
#2  0x7F518AE7766F
#3  0x7F518D71E070
#2  0x7F7DB438B66F
#0  0x7F831FF84467
#3  0x7F7DB6C32070
#1  0x7F831FF84AAE
#2  0x7F831EAA266F
#3  0x7F8321349070
#0  0x7F920A35E467
#0  0x7F4C81FDF467
#1  0x7F4C81FDFAAE
#1  0x7F920A35EAAE
#2  0x7F4C80AFD66F
#2  0x7F9208E7C66F
#3  0x7F4C833A4070
#3  0x7F920B723070


NWChem is (was) working on CentOS 7.2, compiled with GCC 4.8.5 and openmpi-1.10.0.
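One observation on the backtrace above: the frames from the six processes are interleaved, but they can be compared by their low 12 bits (the offset within a 4 KiB page). The following is a minimal sketch, assuming the usual Linux behaviour that shared objects are loaded at page-aligned base addresses; the function name `page_offsets_by_frame` is made up for this illustration. If every process shows the same offset for a given frame number, that suggests (but does not prove) that all ranks faulted at the same code location, with only the load addresses differing.

```python
from collections import defaultdict

# (frame number, address) pairs copied from the backtrace in this post
FRAMES = [
    (0, 0x7F8389367467), (1, 0x7F8389367AAE), (2, 0x7F8387E8566F), (3, 0x7F838A72C070),
    (0, 0x7F518C359467), (0, 0x7F7DB586D467), (1, 0x7F518C359AAE), (1, 0x7F7DB586DAAE),
    (2, 0x7F518AE7766F), (3, 0x7F518D71E070), (2, 0x7F7DB438B66F), (0, 0x7F831FF84467),
    (3, 0x7F7DB6C32070), (1, 0x7F831FF84AAE), (2, 0x7F831EAA266F), (3, 0x7F8321349070),
    (0, 0x7F920A35E467), (0, 0x7F4C81FDF467), (1, 0x7F4C81FDFAAE), (1, 0x7F920A35EAAE),
    (2, 0x7F4C80AFD66F), (2, 0x7F9208E7C66F), (3, 0x7F4C833A4070), (3, 0x7F920B723070),
]


def page_offsets_by_frame(frames, page_bits=12):
    """Map each frame number to the set of in-page offsets seen for it."""
    mask = (1 << page_bits) - 1
    offsets = defaultdict(set)
    for frame_no, addr in frames:
        offsets[frame_no].add(addr & mask)
    return dict(offsets)


if __name__ == "__main__":
    for frame_no, offs in sorted(page_offsets_by_frame(FRAMES).items()):
        # A single offset per frame means all processes agree on that frame.
        print(frame_no, sorted(hex(o) for o in offs))
```

For this backtrace each frame number maps to exactly one offset, so the six crashes look like the same fault repeated in every process rather than six unrelated failures.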

Clicked A Few Times
Did you ever find a solution for this? We are having a similar issue, and didn't notice it until last week, when the faculty member who uses the software returned from sabbatical.

Thanks,

-Dj

Clicked A Few Times
Follow-up: Apparently NWChem will only run on the same type of CPU as the one it was compiled on. We have a mix of AMD and Intel systems on our HPC grid, and due to current grid usage, the job was always being assigned to an AMD node, while the machine that things are compiled on is an Intel box.

If anyone else runs into a similar issue, this might be something to check.

fyi

Gets Around
Quote:Deej Aug 2nd 3:55 pm
Follow-up: Apparently NWChem will only run on the same type of CPU as the one it was compiled on. We have a mix of AMD and Intel systems on our HPC grid, and due to current grid usage, the job was always being assigned to an AMD node, while the machine that things are compiled on is an Intel box.


NWChem will absolutely run on CPU types other than the one on which it was compiled. You just need to set the compiler flags to generate code that will run on the architecture where NWChem runs. If you have many architectures, you need to either target the least common denominator or generate a multi-architecture binary (aka fat binary, which may not be supported by GCC).

The most obvious instance of NWChem supporting different CPU types is when it is cross-compiled, which is necessary on Blue Gene systems, for example. Blue Gene binaries are generated on 64-bit POWER login nodes, which support a different ISA _and_ ABI than the compute nodes.

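One way to make NWChem non-portable across different x86 processors is to compile with "-march=native", which lets GCC use every instruction the build host supports and also implies "-mtune=native". Note that "-mtune" on its own only affects how the code is tuned, not which instructions may be emitted, so it does not break portability by itself. It is possible to build GCC such that this is the default. It may also be among the default flags in the NWChem build system. If you want a portable binary, consider using "-march=x86-64 -mtune=generic", although this may reduce performance. See https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/x86-Options.html for details. You may be able to find GCC flags that match the set of ISA features common to all of the processors in your cluster, which may perform better than the generic baseline.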
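To find the features that every node in a mixed cluster supports, one option is to collect /proc/cpuinfo from each node type and intersect the "flags" lines. The following is a rough sketch, assuming x86 Linux nodes (where /proc/cpuinfo has a "flags" line); the function names and the node names in the sample data are invented for illustration, and the sample excerpts are heavily abbreviated.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def common_flags(per_node_cpuinfo):
    """Intersect the flag sets of all nodes: the features every node supports."""
    flag_sets = [cpu_flags(text) for text in per_node_cpuinfo.values()]
    return set.intersection(*flag_sets) if flag_sets else set()


if __name__ == "__main__":
    # Hypothetical, abbreviated excerpts; on a real node you would read
    # the text with open("/proc/cpuinfo").read().
    samples = {
        "intel-node": "vendor_id\t: GenuineIntel\nflags\t\t: fpu sse2 sse4_2 avx avx2\n",
        "amd-node": "vendor_id\t: AuthenticAMD\nflags\t\t: fpu sse2 sse4a\n",
    }
    print(sorted(common_flags(samples)))
```

Any instruction-set option passed to GCC (for example an "-mavx"-style flag) should then correspond only to features that appear in the common set; anything outside it can fault on at least one node type.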

Note that the most likely place for processor-specific instructions to exist is the BLAS library. If you dynamically link BLAS, then you should be fine as long as each machine has the appropriate BLAS library in the local LD_LIBRARY_PATH. If you link statically, you need to use a BLAS library that supports all of the processors you need to run on. Without knowing more about your build, it's hard to give better recommendations here.
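To see whether a given binary links BLAS dynamically, and which library it resolves to on a particular node, you can inspect the output of `ldd` on the NWChem executable. Below is a small sketch that filters such output for BLAS-like entries. The function name, the list of name patterns, and the sample output are assumptions made for this illustration, not anything specific to NWChem.

```python
import re


def blas_like_libs(ldd_output):
    """Return {library name: resolved path or None} for BLAS/LAPACK-looking
    entries in the text printed by `ldd`."""
    pattern = re.compile(r"blas|lapack|mkl|atlas|acml", re.IGNORECASE)
    found = {}
    for line in ldd_output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # e.g. the vdso or the dynamic loader line
        name, _, target = line.partition("=>")
        name = name.strip()
        if not pattern.search(name):
            continue
        target = target.strip()
        # ldd prints "not found" when the library cannot be resolved
        found[name] = None if target.startswith("not found") else target.split(" (")[0]
    return found


if __name__ == "__main__":
    # Hypothetical ldd output; real text would come from running
    # `ldd` on the nwchem binary on each node type.
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffd00000000)\n"
        "\tlibopenblas.so.0 => /usr/lib64/libopenblas.so.0 (0x00007f0000000000)\n"
        "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000100000)\n"
    )
    print(blas_like_libs(sample))
```

An empty result suggests BLAS was linked statically (or under a name the pattern does not cover), in which case the BLAS code inside the binary itself has to support every processor type in the cluster.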

Even though you are not trying to use the Intel compilers, because I work for Intel, I feel compelled to link to https://software.intel.com/en-us/articles/optimization-notice.

