ECCE 6.3 apps won't start


Clicked A Few Times
Gary,

Thanks, I think I will try that. We're seeing a problem with the 32-bit version when it is run from PC clients that log on to the cluster using X-win32 for X Windows support. An older version of X-win32 works, but newer versions don't. We think they may not be handling the 32-bit version. It works fine from Linux workstations.

So I'll try the new 64-bit version and let you know.

Thanks,
Mike

Gets Around
Hmmm, that seems odd. Worth trying the 64-bit version, but I'm not familiar enough with these X Windows emulation packages to offer any guidance.

Gary

Clicked A Few Times
Gary,

I tried the new 64-bit version, however when I try to run ebuilder I get:
builder: error while loading shared libraries: libnvidia-tls.so.290.10: cannot open shared object file: No such file or directory

I don't find this file anywhere in the ecce installation.
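
A loader error like the one above can usually be narrowed down with ldd, which lists every shared library a binary needs and marks the unresolved ones as "not found". A minimal sketch follows. The function name missing_libs is made up for illustration, and it reads ldd output on stdin so it can also be tried on saved output.

```shell
# Print the names of shared libraries that ldd reports as unresolved.
# Reads ldd output on stdin; missing_libs is an illustrative name only.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

For example: "ldd /path/to/builder | missing_libs", where /path/to/builder is a placeholder for the real binary (the error message suggests ebuilder launches a binary named builder). If the launcher script adjusts LD_LIBRARY_PATH, run ldd with the same setting, otherwise the result may differ from what the application sees.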

Mike

Gets Around
Mike,

I unknowingly packaged up the NVidia specific version of libGL.so that I run on my 64-bit RHEL 5.8 workstation. I see two possible fixes. One is that I go back to using software-only OpenGL on my workstation and the other is packaging the libnvidia libraries in a new 64-bit ECCE distribution and seeing if that works for you.

I'm going to try the second approach initially because I prefer to keep using the NVidia drivers if I can. I just updated the distributions that can be downloaded. Download the latest 64-bit one and try again. I'll cross my fingers, but I think there's a good chance this won't work without having the required graphics card. It's worth a try though. Thanks for hanging in there.

Gary

Gets Around
Mike,

I just tried building ECCE with the "software-only" OpenGL libs available via yum (no NVidia linkages). That included recompiling all of the ECCE viz-related code using these versions of the libraries and include files. Unfortunately I'm getting a segmentation fault. That means that packaging up the NVidia shared libraries as I have done is the only solution I'll have for you.

Let me know if that version works. If it doesn't then you'll need to go back to the 32-bit distribution of ECCE and try to figure out the issue with the X-win32 application. I'll also remove those OpenGL libraries completely from the 64-bit ECCE binary distribution because they wouldn't work for anyone.

Gary

Gets Around
Hi Mike,

If you can't get the latest 64-bit ECCE binary distribution to work, I think it's finally time to try to build ECCE from the source code distribution on your cluster. I'm pretty certain that will resolve your issue because it will ensure consistency between all the libraries that are used. I'm not at all sure, though, whether it will fix your X-win32 issue. Do let me know how it goes with the latest ECCE 64-bit distribution. You can also just install and try the standalone builder distribution, since I updated that as well and the OpenGL problems you are having will show up in the builder alone. If it doesn't work for you, I'm going to remove all the Mesa OpenGL libraries I'm currently packaging with the binary distribution because I know they won't work for others either. Finally, don't forget when you install the latest ECCE distribution to edit the $ECCE_HOME/siteconfig/site_runtime file and comment out the ECCE_MESA_EXCEPT setting. Otherwise it will fall back to using your local OpenGL libraries rather than the ones bundled with ECCE.
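
The site_runtime edit described above could be scripted roughly as follows. This is only a sketch under two assumptions that the thread does not confirm: that "#" starts a comment in site_runtime, and that the setting sits on a single line containing ECCE_MESA_EXCEPT. The function name is made up for illustration.

```shell
# Comment out any active line mentioning ECCE_MESA_EXCEPT in the given file.
# Assumes '#' is the comment character (not confirmed for site_runtime).
# A copy of the original is kept next to it with a .bak suffix.
comment_out_mesa_except() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Skip lines that are already comments; prefix the rest that match with '#'.
    sed '/^[[:space:]]*#/!s/^\(.*ECCE_MESA_EXCEPT\)/#\1/' "$file.bak" > "$file"
}
```

It would be called as: comment_out_mesa_except "$ECCE_HOME/siteconfig/site_runtime". Checking the result with a diff against the .bak copy before running ECCE is a cheap safeguard.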

Thanks,
Gary

Clicked A Few Times
Gary,

The builder application starts with the last updated 64-bit distribution you made. I do get the message:
"Xlib: extension "NV-GLX" missing on display ":11.0".", but the application comes up anyway. I'm having one of the users try it out to see if it works ok for them.

Mike

Clicked A Few Times
Gary,

The user is still having the X-win32 problem. So I guess I'll try the source build. I'm not sure if that will fix it, but if it doesn't, at least I'll know I have to get the X-win32 vendor to fix something on their end.

Mike

Clicked A Few Times
Gary,

After building ecce from source, I'm back to my original problem with the first 64-bit install, ebuilder seg-faults. The only working version has been the 32-bit version, albeit with the X-win32 issue. I guess I'll have to address that issue and use the 32-bit version.

Mike

Gets Around
Hi Mike,

I also get those "NV-GLX" warnings when I run on a RHEL virtual machine that doesn't have the actual NVidia graphics card. I can easily filter those out like I do for several other innocuous warnings and will push out a new ECCE distribution that does that shortly. I'm happy it does basically work for you though.

As far as X-win32, I see that's a product of StarNet. I agree that what you are seeing is a problem related to that product rather than ECCE. Does it work for the non-visualization ECCE applications, but just not for the Builder/Viewer? Or is it all ECCE applications including the Gateway, Organizer, etc.? Seems like OpenGL comes up as a problem with these X Windows packages. My recommendation if you don't get a timely response from that vendor is to try free Cygwin. We had a few people here use that in years past and it worked fine for them. I can't say for sure it works with the latest ECCE, but I don't see why it wouldn't (of course I'm also not sure why this would be a problem for X-win32 either so take that statement for what it's worth).

Gary

Gets Around
Hi Mike,

I didn't see this last response from you before I posted the one above.

You can also use the ECCE 64-bit binary distribution now that I bundled the libnvidia* shared libraries, correct? Isn't this version now just as functional as the 32-bit binary distribution (ignoring the NV-GLX warnings that I'll filter out for you)?

Not being able to run a build from source code apparently indicates that something is just not right with the OpenGL libraries you are using. When I use the original Mesa libraries that can be installed via yum on my RHEL workstation, I'm seeing something similar unfortunately. It's only the NVidia version that seems to work for me. I'm going to try to build ECCE from source code on a RHEL virtual machine (no NVidia libs) to see if that works for me or not. Unfortunately RHEL is proving to be about the most problematic Linux platform (Debian, Ubuntu, etc. seem to work much better).

One last question: Does the glxgears test program work for you? If so, do a "file `which glxgears`" command to make sure it's a 64-bit version that you are running and an "ldd `which glxgears`" to make sure it is using the OpenGL libraries you expect. Finally, if you are only having X-win32 issues with ECCE Builder/Viewer, try glxgears with that as well. That would be a much easier problem to work on with StarNet than ECCE since glxgears is a very basic OpenGL test program.
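
The two checks above can be wrapped into one small helper. This is a sketch only: the function names are invented for illustration, and it assumes the usual Linux "file" and "ldd" tools are installed.

```shell
# Extract the resolved libGL path from ldd output read on stdin.
libgl_path() {
    awk '$1 ~ /^libGL\.so/ && $2 == "=>" { print $3 }'
}

# Report whether a command on PATH is a 64-bit binary and which libGL it uses.
check_gl_binary() {
    bin=$(command -v "$1") || { echo "$1 not found on PATH" >&2; return 1; }
    # -L follows symlinks so the real executable is inspected.
    if file -L "$bin" | grep -q '64-bit'; then
        echo "64-bit: yes"
    else
        echo "64-bit: no"
    fi
    echo "libGL: $(ldd "$bin" | libgl_path)"
}
```

Running "check_gl_binary glxgears" prints the two facts of interest on separate lines, which makes it easy to compare a working client session against a failing one.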

Gary

Clicked A Few Times
Gary,

Yes, actually the 64-bit version that you created with the libnvidia libraries does work (with the warnings) except for the X-win32 issue, so I could use that as well and get StarNet to fix their problem. I only have the issue with Builder/Viewer; the gateway and organizer come up ok.

The glxgears program does work for me.
$ file `which glxgears`
/usr/bin/glxgears: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

$ ldd `which glxgears`
linux-vdso.so.1 => (0x00002aaaaaaab000)
libGL.so.1 => /usr/lib64/libGL.so.1 (0x00000037e0e00000)
libc.so.6 => /lib64/libc.so.6 (0x00000037dfe00000)
libX11.so.6 => /usr/lib64/libX11.so.6 (0x00000037e2600000)
libm.so.6 => /lib64/libm.so.6 (0x00000037e0600000)
libXext.so.6 => /usr/lib64/libXext.so.6 (0x00000037e4200000)
libXxf86vm.so.1 => /usr/lib64/libXxf86vm.so.1 (0x00000037e1a00000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000037e0a00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000037e0200000)
libdrm.so.2 => /usr/lib64/libdrm.so.2 (0x00000037e1200000)
/lib64/ld-linux-x86-64.so.2 (0x00000037dfa00000)
libXau.so.6 => /usr/lib64/libXau.so.6 (0x00000037e2200000)
libXdmcp.so.6 => /usr/lib64/libXdmcp.so.6 (0x00000037e2a00000)

It's the same GL libraries. It also works on my older version of X-win32, but users seeing the problem have a newer version. I'll have one of them try glxgears and see if they have the same problem with that.

Mike

Gets Around
I think it will be a good data point then whether glxgears works with a newer version of X-Win32. I'm crossing my fingers it won't work, so that it's clearly a more generic OpenGL issue, but I won't be surprised either way. I'm glad though that newer X-Win32 works for non-OpenGL apps like gateway.

Gary

Gets Around
Mike,

I created a 64-bit RHEL 5.8 virtual machine and then built ECCE from source code. I verified what you saw--the stock Mesa OpenGL does not work with ECCE although it does for glxgears. The only working OpenGL seems to be the NVidia based one that I now distribute with the ECCE 64-bit binary distribution. All other Linux platforms other than RHEL I have seen are able to work with the standard Mesa OpenGL.

Gary

Clicked A Few Times
Gary,

The glxgears does work for the user with the new X-win32, so it seems particular to ECCE. We have seen this kind of thing in the past with X-win32, where it had issues with particular applications, and StarNet had to patch it. So I'll pursue it with them. I'm leaving the 64-bit version with the nvidia libraries in place since that works. I appreciate all your help with this.

Mike

Gets Around
Mike,

Do you still have your source code build of ECCE? If so, I have a code change I'd like you to try. On my 64-bit RHEL 5.8 build, it keeps the Builder from crashing while using the native Mesa OpenGL libraries instead of the NVidia ones. What this change might do is mess up the text labels that are done in the GL viz area such as the "mode" text at the top-left (those may look like gibberish). We make gl calls to do that and some that specifically have to do with initializing the bitmapped font to use seem to be related to the crashing (that would also explain why a simpler glxgears program would work while the ECCE builder wouldn't).

The file to change is $ECCE_HOME/src/viz/atomnodes/SGLattice.C. In that file search for "SO_NODE_IS_FIRST". Before the "if" block that contains that expression, put an "#if 0". Then after the closing brace for that "if" block (right before the closing brace for the whole constructor method), put an "#endif". That is, you want to conditionally compile out that whole if block. Save the file and do a "make" in that directory. Then do a "cd .." and do another "make" to create the shared library.
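
Once the file has been edited by hand, the two "make" steps above could be run as one function. This is a rough sketch: the function name is invented, and the MAKE override is an addition for convenience, not something the ECCE build is known to require.

```shell
# Rebuild the atomnodes objects, then the enclosing viz shared library.
# $1 is the root of the ECCE source build (defaults to $ECCE_HOME).
# MAKE may be set to override the make command; this is an assumption
# added for flexibility, not part of the ECCE build instructions.
rebuild_viz() {
    ecce_home=${1:-$ECCE_HOME}
    make_cmd=${MAKE:-make}
    ( cd "$ecce_home/src/viz/atomnodes" && $make_cmd ) || return 1
    ( cd "$ecce_home/src/viz" && $make_cmd ) || return 1
}
```

Each step runs in a subshell, so the caller's working directory is unchanged, and the second make is skipped if the first one fails.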

Then you can actually run "ebuilder" from your build rather than having to create a binary distribution and installing it. That's because ebuilder doesn't require the ECCE server to run so it's a special case. Make sure you are pointed though to the right version (your source code build rather than one of the binary distributions) by doing a "which ebuilder". Also make sure you revert the $ECCE_HOME/siteconfig/site_runtime file in terms of the ECCE_MESA_EXCEPT variable (you'll need to uncomment that if you currently have it commented). I'd change the ebuilder script itself and uncomment the ldd line near the end just to verify you are using the OpenGL libraries under /usr/lib64.
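
The "which ebuilder" check can be made explicit with a small helper that fails when the command resolves outside the expected tree. This is a sketch only: the function name is invented, and it assumes the source build's ebuilder lives somewhere under the directory you pass in.

```shell
# Succeed only if the named command resolves to a path under the given prefix.
resolves_under() {
    cmd=$1
    prefix=$2
    path=$(command -v "$cmd") || return 1
    case "$path" in
        "$prefix"/*) return 0 ;;
        *) echo "$cmd resolves to $path, not under $prefix" >&2; return 1 ;;
    esac
}
```

For example: resolves_under ebuilder "$ECCE_HOME" && ebuilder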

Let me know how that works for you. If it does work then this could also be related to your X-Win32 problem and you should try that as well.

Gary

