ECCE 6.3 apps won't start


Gets Around
Hi Mike,

If you can't get the latest 64-bit ECCE binary distribution to work, I think it's finally time to try to build ECCE from the source code distribution on your cluster. I'm pretty certain that will resolve your issue because it will ensure consistency between all the libraries that are used. I'm not at all sure, though, whether it will fix your X-win32 issue. Do let me know how it goes with trying the latest ECCE 64-bit distribution. You can also just install and try the standalone builder distribution since I updated that as well, and the OpenGL problems you are having will show up in the builder alone. If it doesn't work for you I'm going to remove all the Mesa OpenGL libraries I'm currently packaging with the binary distribution because I know they won't work for others either. Finally, don't forget when you install the latest ECCE distribution to edit the $ECCE_HOME/siteconfig/site_runtime file and comment out the ECCE_MESA_EXCEPT setting. Otherwise it will fall back to using your local OpenGL libraries rather than the ones bundled with ECCE.

Thanks,
Gary

Clicked A Few Times
Gary,

The builder application starts with the last updated 64-bit distribution you made. I do get the message:
"Xlib: extension "NV-GLX" missing on display ":11.0".", but the application comes up anyway. I'm having one of the users try it out to see if it works ok for them.

Mike

Clicked A Few Times
Gary,

The user is still having the X-win32 problem. So I guess I'll try the source build. I'm not sure if that will fix it, but if it doesn't, at least I'll know I have to get the X-win32 vendor to fix something on their end.

Mike

Clicked A Few Times
Gary,

After building ECCE from source, I'm back to my original problem with the first 64-bit install: ebuilder seg-faults. The only working version has been the 32-bit version, albeit with the X-win32 issue. I guess I'll have to address that issue and use the 32-bit version.

Mike

Gets Around
Hi Mike,

I also get those "NV-GLX" warnings when I run on a RHEL virtual machine that doesn't have the actual NVidia graphics card. I can easily filter those out like I do for several other innocuous warnings and will push out a new ECCE distribution that does that shortly. I'm happy it does basically work for you though.

As far as X-win32, I see that's a product of StarNet. I agree that what you are seeing is a problem related to that product rather than ECCE. Does it work for the non-visualization ECCE applications, but just not for the Builder/Viewer? Or is it all ECCE applications including the Gateway, Organizer, etc.? Seems like OpenGL comes up as a problem with these X Windows packages. My recommendation if you don't get a timely response from that vendor is to try free Cygwin. We had a few people here use that in years past and it worked fine for them. I can't say for sure it works with the latest ECCE, but I don't see why it wouldn't (of course I'm also not sure why this would be a problem for X-win32 either so take that statement for what it's worth).

Gary

Gets Around
Hi Mike,

I didn't see this last response from you before I posted the one above.

You can also use the ECCE 64-bit binary distribution now that I bundled the libnvidia* shared libraries, correct? Isn't this version now just as functional as the 32-bit binary distribution (ignoring the NV-GLX warnings that I'll filter out for you)?

Apparently not being able to build from source code is indicating that there is something that is just not right with the OpenGL libraries you are using. When I use the original Mesa libraries that can be installed via yum on my RHEL workstation, I'm seeing something similar unfortunately. It's only the NVidia version that seems to work for me. I'm going to try to build ECCE from source code on a RHEL virtual machine (no NVidia libs) to see if that works for me or not. Unfortunately RHEL is proving to be about the most problematic Linux platform (Debian, Ubuntu, etc. seem to work much better).

One last question: Does the glxgears test program work for you? If so, do a "file `which glxgears`" command to make sure it's a 64-bit version that you are running and an "ldd `which glxgears`" to make sure it is using the OpenGL libraries you expect. Finally, if you are only having X-win32 issues with ECCE Builder/Viewer, try glxgears with that as well. That would be a much easier problem to work with StarNet on than ECCE since glxgears is a very basic OpenGL test program.
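
As a rough sketch of that check, the small filter below pulls the resolved libGL path out of "ldd" output so it is easy to compare between programs. The function name gl_lib_from_ldd is made up for illustration and is not part of ECCE; it only assumes the usual "name => path (address)" layout that ldd prints.

```shell
# Hypothetical helper (not part of ECCE): read `ldd` output on stdin and
# print the path that each libGL entry resolves to.
gl_lib_from_ldd() {
    awk '$1 ~ /^libGL\.so/ && $2 == "=>" { print $3 }'
}

# Usage, assuming glxgears is on the PATH:
#   file -L "$(command -v glxgears)"               # look for "ELF 64-bit"
#   ldd "$(command -v glxgears)" | gl_lib_from_ldd
```

Running the same filter on the ldd line inside the ebuilder script should show whether both programs pick up the same OpenGL library.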

Gary

Clicked A Few Times
Gary,

Yes, actually the 64-bit version that you created with the libnvidia libraries does work (with the warnings) except for the X-win32 issue, so I could use that as well and get StarNet to fix their problem. I only have the issue with Builder/Viewer; the gateway and organizer come up ok.

The glxgears program does work for me.
$ file `which glxgears`
/usr/bin/glxgears: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

$ ldd `which glxgears`
linux-vdso.so.1 => (0x00002aaaaaaab000)
libGL.so.1 => /usr/lib64/libGL.so.1 (0x00000037e0e00000)
libc.so.6 => /lib64/libc.so.6 (0x00000037dfe00000)
libX11.so.6 => /usr/lib64/libX11.so.6 (0x00000037e2600000)
libm.so.6 => /lib64/libm.so.6 (0x00000037e0600000)
libXext.so.6 => /usr/lib64/libXext.so.6 (0x00000037e4200000)
libXxf86vm.so.1 => /usr/lib64/libXxf86vm.so.1 (0x00000037e1a00000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000037e0a00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000037e0200000)
libdrm.so.2 => /usr/lib64/libdrm.so.2 (0x00000037e1200000)
/lib64/ld-linux-x86-64.so.2 (0x00000037dfa00000)
libXau.so.6 => /usr/lib64/libXau.so.6 (0x00000037e2200000)
libXdmcp.so.6 => /usr/lib64/libXdmcp.so.6 (0x00000037e2a00000)

It's the same GL libraries. It also works on my older version of X-win32, but users seeing the problem have a newer version. I'll have one of them try glxgears and see if they have the same problem with that.

Mike

Gets Around
I think it will be a good data point then whether glxgears works with a newer version of X-Win32. I'm crossing my fingers it won't work, so that it's clearly a more generic OpenGL issue, but I won't be surprised either way. I'm glad though that newer X-Win32 works for non-OpenGL apps like gateway.

Gary

Gets Around
Mike,

I created a 64-bit RHEL 5.8 virtual machine and then built ECCE from source code. I verified what you saw: the stock Mesa OpenGL does not work with ECCE although it does for glxgears. The only working OpenGL seems to be the NVidia based one that I now distribute with the ECCE 64-bit binary distribution. All Linux platforms other than RHEL that I have seen are able to work with the standard Mesa OpenGL.

Gary

Clicked A Few Times
Gary,

The glxgears does work for the user with the new X-win32, so it seems particular to ECCE. We have seen this kind of thing in the past with X-win32, where it had issues with particular applications, and StarNet had to patch it. So I'll pursue it with them. I'm leaving the 64-bit version with the nvidia libraries in place since that works. I appreciate all your help with this.

Mike

Gets Around
Mike,

Do you still have your source code build of ECCE? If so, I have a code change I'd like you to try. On my 64-bit RHEL 5.8 build, it keeps the Builder from crashing while using the native Mesa OpenGL libraries instead of the NVidia ones. What this change might do is mess up the text labels that are done in the GL viz area such as the "mode" text at the top-left (those may look like gibberish). We make gl calls to do that and some that specifically have to do with initializing the bitmapped font to use seem to be related to the crashing (that would also explain why a simpler glxgears program would work while the ECCE builder wouldn't).

The file to change is $ECCE_HOME/src/viz/atomnodes/SGLattice.C. In that file search for "SO_NODE_IS_FIRST". Before the "if" block that contains that expression, put an "#if 0". Then after the closing brace for that "if" block (right before the closing brace for the whole constructor method), put an "#endif". That is, you want to conditionally compile out that whole if block. Save the file and do a "make" in that directory. Then do a "cd .." and do another "make" to create the shared library.
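
The locate-and-rebuild steps above could be sketched in shell roughly as follows. Both function names are invented for illustration; the paths and the two "make" invocations are the ones described above, and the "#if 0" / "#endif" edit itself is still done by hand in an editor.

```shell
# Hypothetical helpers (names are illustrative, not part of ECCE).
# The optional argument overrides $ECCE_HOME.

# Print "line:text" for each place SO_NODE_IS_FIRST appears in SGLattice.C,
# so the "if" block to wrap in #if 0 / #endif is easy to find.
find_sglattice_block() {
    grep -n 'SO_NODE_IS_FIRST' "${1:-$ECCE_HOME}/src/viz/atomnodes/SGLattice.C"
}

# After editing the file: rebuild in atomnodes, then one directory up to
# recreate the shared library. Runs in a subshell so the caller's working
# directory is left unchanged.
rebuild_viz() {
    ( cd "${1:-$ECCE_HOME}/src/viz/atomnodes" && make && cd .. && make )
}
```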

Then you can actually run "ebuilder" from your build rather than having to create a binary distribution and installing it. That's because ebuilder doesn't require the ECCE server to run so it's a special case. Make sure you are pointed though to the right version (your source code build rather than one of the binary distributions) by doing a "which ebuilder". Also make sure you revert the $ECCE_HOME/siteconfig/site_runtime file in terms of the ECCE_MESA_EXCEPT variable (you'll need to uncomment that if you currently have it commented). I'd change the ebuilder script itself and uncomment the ldd line near the end just to verify you are using the OpenGL libraries under /usr/lib64.
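
One way to make the "which ebuilder" check less error-prone is a small guard like the sketch below. The function name binary_under is hypothetical, and the directory you pass in is whatever your source build tree happens to be.

```shell
# Hypothetical helper (not part of ECCE): succeed only if the program found
# on the PATH lives somewhere under the given directory prefix.
binary_under() {
    prog="$1"
    prefix="$2"
    path=$(command -v "$prog") || { echo "$prog: not found on PATH" >&2; return 1; }
    case "$path" in
        "$prefix"/*) echo "$path" ;;
        *) echo "unexpected location: $path" >&2; return 1 ;;
    esac
}

# Usage, assuming ECCE_HOME points at the source code build:
#   binary_under ebuilder "$ECCE_HOME" && ebuilder
```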

Let me know how that works for you. If it does work then this could also be related to your X-Win32 problem and you should try that as well.

Gary

Gets Around
Hi Mike,

This is looking like a promising fix. If you have time to try the source code change I mentioned and see if that works for you, that would be great. If you don't have time then on Thursday I'm hoping to get out new distributions for you to try. This fix will let you use your local 64-bit OpenGL libraries instead of the ones distributed with ECCE. Also, I'm updating the ones shipped with ECCE to be standard Mesa OpenGL rather than the NVidia specific ones that just happened to work. Since these will be the same version of OpenGL libraries used by glxgears, at least that will make it a bit more consistent if the StarNet developers end up needing to make a fix. If I really get lucky then this source code fix in ECCE will allow me to build OpenGL from source code as is done for a 32-bit build of ECCE, but not currently for a 64-bit build.

Gary

Gets Around
Mike,

There are new ECCE distributions for download with the hopeful fix for the 64-bit OpenGL issue. Both binary distributions and the source code distribution have been updated. Download and install the latest binary distribution and I'm confident you'll be able to run the Builder now using either:

1. The OpenGL libraries shipped with ECCE (no longer NVidia based, but the generic RHEL ones obtained via the yum package manager)
2. Your local OpenGL libraries (you'll need to edit the $ECCE_HOME/siteconfig/site_runtime file and comment out the ECCE_MESA_OPENGL line to use your own local libraries--remember you can edit the $ECCE_HOME/scripts/ebuilder script and temporarily uncomment the ldd line to verify which you are using)
3. OpenGL libraries compiled from the Mesa 6.5.3 distribution bundled with ECCE (you'd need to build the ECCE source code distribution in order to compile these libraries)
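
For option 2, the edit to site_runtime could be scripted along these lines. This is only a sketch: the function name is made up, and it assumes the setting sits on a line of its own whose first word is the variable name (either "NAME value" or "NAME=value") and that "#" starts a comment in that file. Check the file by eye before relying on it.

```shell
# Hypothetical helper (not part of ECCE): comment out a setting in a config
# file by prefixing its line with "#". A copy of the original is kept as
# <file>.bak. Lines that are already commented are left alone.
comment_out_setting() {
    file="$1"
    name="$2"
    cp "$file" "$file.bak" || return 1
    awk -v name="$name" '
        $1 == name || index($1, name "=") == 1 { sub(/[^ \t]/, "#&") }
        { print }
    ' "$file.bak" > "$file"
}

# Usage:
#   comment_out_setting "$ECCE_HOME/siteconfig/site_runtime" ECCE_MESA_OPENGL
```

To switch back to the bundled libraries, restore the .bak copy or remove the leading "#" by hand.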

This resolves the differences between how 32-bit OpenGL and 64-bit OpenGL are handled in ECCE, all of which traced back to that fix in the SGLattice.C code. All associated documentation for building ECCE and the build_ecce script have been updated. The prerequisite check for OpenGL has also been removed: it's no longer mandatory to have OpenGL pre-installed for ECCE, since working libraries are now bundled with the distribution and used by default.

The one thing I'm not confident about is whether this will resolve your X-Win32 issue displaying OpenGL output on Windows PCs. I have a feeling it won't fix that because you also have this issue with the 32-bit ECCE 6.3 binary distribution and all that was done was rationalizing the 64-bit distribution to be equivalent to the 32-bit one. But it's worth another try, and at this point if it doesn't work I can't think of anything else to try on this end. My guess is that ECCE just uses more/different OpenGL features than the glxgears test application does and that is what is causing issues for X-Win32. One more advanced thing we do is make calls to glReadPixels and glPixelStorei, which have to do with reading and writing to the render area and which other GL apps don't typically do. Again, I'd spend a bit of time seeing if Cygwin works better for you if you think a solution from StarNet is going to take longer than you want.

Let me know how this latest version works for you. Are you using ECCE for students in classes or research efforts? Do you have time to get this straightened out or did you need it working a week ago?

Gary

Clicked A Few Times
Gary,

Sorry I haven't been able to look at this the last couple of days. I read your last few posts, and it looks like I should try the latest binary distribution you've made first. I'll do that and let you know. This is not being used for classes; just some researchers want to use it, so I'm not under heavy pressure to get it working right away.

Mike

Gets Around
Just as well you waited anyway Mike. Try the latest binary distribution and let me know how it works for you. At the very least it should be a better starting point for the StarNet folks because I'd feel strange telling them to debug an issue when the OpenGL libraries being used were for an NVidia hardware card when that card isn't actually present on the host running ECCE. Now it's generic Mesa OpenGL and it's probably just specific features that ECCE uses that X-Win32 isn't working with.

Gary

