Eigenvectors failed to converge in NWChem 6.8


Just Got Here
Dear all,

we are currently trying to deploy the 6.8 release on our cluster using the Intel compilers, MKL, and Intel MPI from 2018 Update 2. When running our examples, e.g. the C240 buckyball example from the wiki, the run terminates with the following error message from the PDSYEVD routine:

[...]
PDSTEDC parameter number 10 had an illegal value
{ 1, 28}: On entry to
PDSTEDC parameter number 10 had an illegal value
[0] Received an Error in Communication: (-10) 0: ga_pdsyevd: eigenvectors failed to converge:
application called MPI_Abort(comm=0x84000004, -10) - process 0
[...]

repeated for all ranks.

Does someone maybe have an idea what the issue could be here?

Greetings
André

Forum Regular
Hi, can you post or send your complete input and output files? Please also send your build settings.
If the files are large, you can email them to me directly. My email is below.

Thanks.

Best,
-Niri
niri.govind@pnnl.gov

Just Got Here
Dear Niri,

thanks a lot for your offer. The input file is from the NWChem Wiki: http://nwchemgit.github.io/images/Input_c240.nw
But it also happens with other input files we have tested.

I have sent you one of the output files.

Thanks and Greetings
André

Just Got Here
Would maybe anyone else using NWChem 6.8 with Intel Compilers, Intel MPI and MKL be so kind as to test the above input file from the Wiki? I'm curious if this is something specific we are seeing.

Forum Vet
Quote:Agemuend Jun 1st 1:30 am
Would maybe anyone else using NWChem 6.8 with Intel Compilers, Intel MPI and MKL be so kind as to test the above input file from the Wiki? I'm curious if this is something specific we are seeing.


Here are the input and output files
https://github.com/nwchemgit/nwchem/wiki/c240_631gs.nw
https://github.com/nwchemgit/nwchem/wiki/c240_631gs.output

Just Got Here
Thanks! So this is indeed a local problem. Do you happen to have the build variables still at hand? Which version of the Intel compilers are you using?

Greetings
André

Forum Vet
export BLAS_SIZE=8
export BLASOPT="-lmkl_intel_ilp64 -lmkl_core -lmkl_sequential -lpthread -lm -ldl"
export SCALAPACK="-lmkl_scalapack_ilp64 -lmkl_intel_ilp64 -lmkl_core -lmkl_intel_thread -lmkl_blacs_intelmpi_ilp64 -lpthread -liomp5 -lm"
export SCALAPACK_SIZE=8

These variables are known to work with Intel 2017 and 2018.
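
Since the settings above pair 8-byte integer sizes with ILP64 library names, a quick check of one's own environment can catch a mismatch, such as BLAS_SIZE=8 combined with an LP64 interface library. The following is only a minimal sketch: the function name check_int_size is hypothetical and not part of NWChem or MKL, it inspects the variable strings only (it does not look at the libraries themselves), and it assumes the MKL naming convention in which "ilp64" denotes 64-bit integers and "lp64" denotes 32-bit integers.

```shell
# Hypothetical helper, not part of NWChem: checks that an integer-size
# setting (4 or 8) agrees with the MKL interface named in a link line.
check_int_size() {
  size="$1"   # e.g. "$BLAS_SIZE" or "$SCALAPACK_SIZE"
  libs="$2"   # e.g. "$BLASOPT" or "$SCALAPACK"
  case "$libs" in
    *_ilp64*_lp64*|*_lp64*_ilp64*)
      # Both interfaces appear in the same link line.
      echo "mixed: link line names both ilp64 and lp64 libraries"
      return 3 ;;
    *_ilp64*) want=8 ;;   # ILP64 interface: 64-bit integers
    *_lp64*)  want=4 ;;   # LP64 interface: 32-bit integers
    *)
      echo "unknown: no ilp64/lp64 library name found"
      return 2 ;;
  esac
  if [ "$size" = "$want" ]; then
    echo "ok: size $size matches link line"
  else
    echo "mismatch: size is $size, link line implies $want"
    return 1
  fi
}
```

With the variables above exported, it could be called as check_int_size "$BLAS_SIZE" "$BLASOPT" and check_int_size "$SCALAPACK_SIZE" "$SCALAPACK". A passing check does not rule out the PDSTEDC error reported in this thread; it only excludes one common source of build inconsistencies.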

Forum Vet
Fix available
We managed to reproduce this error (it shows up only in the latest 2018 MKL versions).

A fix is now available. Please use the updated Global Arrays library:

https://github.com/edoapra/ga/releases/tag/v5.7

