CUDA compiling


I am trying to run NWChem on a computer cluster with GPUs. However, even though I have set OMP_NUM_THREADS=1 and MKL_NUM_THREADS=1, NWChem still spawns several threads on a node. Why is that? Has anyone else seen this, and how can I prevent the extra threads from being created?
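
One way to see exactly which threads a process has is to list them from /proc. The following is only a minimal sketch, assuming a Linux node where /proc is mounted; the helper name thread_names is made up for illustration and is not part of NWChem. The thread names it prints may hint at which library created each thread.

```python
import os
import sys


def thread_names(pid):
    """Return {tid: name} for every thread of process `pid` (Linux /proc only)."""
    task_dir = f"/proc/{pid}/task"
    names = {}
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/comm") as fh:
                names[int(tid)] = fh.read().strip()
        except (FileNotFoundError, ProcessLookupError):
            # The thread exited between listing the directory and reading it.
            continue
    return names


if __name__ == "__main__":
    # Usage: python list_threads.py <pid>; defaults to this script's own PID.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    for tid, name in sorted(thread_names(pid).items()):
        print(tid, name)
```

Running it with the PID of one NWChem process on a compute node prints one line per thread, which makes it easier to report what the extra threads are.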