I've spent about 16 hours upgrading my Linux workstation over the last few days. My workstation consists of a base Debian install that I've added a bunch of stuff on top of. My custom stuff usually isn't debianized because I'm too lazy to bother. I build my own kernel, winex, a large number of games, some custom tools, and other randomness. Most of /etc is hand-done and modified outside of debconf, since I know what I want better than some random package maintainer does.

The Debian part starts out as stable, but I upgrade portions of it to testing, unstable, and now experimental as the situation warrants. Sometimes this requires intentionally breaking Debian dependencies, either because of badly designed packages or because I need a specific version of something on the box. This makes upgrading the machine a challenge.

Since this box is also used for games like EverQuest under winex, I need a functional sound subsystem along with hardware OpenGL. Historically I've installed the alsa packages to satisfy sound dependencies, marked them as 'hold', and then built and installed my own version of alsa over the top. For OpenGL I used to install Mesa, then install the nvidia GL modules over the top of it and hand-hack it into functionality. When XFree86 4.1 came out I created a meta package to satisfy libGL dependencies so that I didn't need to worry about the hand-hacking. I did the same for the two versions of XFree86 4.2 that I've run on the box.
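For anyone wondering what the meta package trick amounts to, here is a minimal sketch in Python that renders the kind of control stanza such a package would carry. Everything named here is an assumption for illustration: the package name local-libgl-dummy, the maintainer address, and the idea that libGL dependencies are satisfied through a virtual package called libgl1 are not taken from my actual package.

```python
def control_stanza(name, version, provides, conflicts, description):
    """Render a minimal Debian binary-package control stanza as text."""
    fields = [
        ("Package", name),
        ("Version", version),
        ("Architecture", "all"),
        # Hypothetical maintainer, for illustration only.
        ("Maintainer", "Local Admin <root@localhost>"),
        ("Provides", ", ".join(provides)),
        ("Conflicts", ", ".join(conflicts)),
        ("Description", description),
    ]
    # Skip empty fields so the stanza stays well-formed when a list is empty.
    return "\n".join(f"{key}: {value}" for key, value in fields if value) + "\n"


# Assumed names: "local-libgl-dummy" and the virtual package "libgl1".
stanza = control_stanza(
    name="local-libgl-dummy",
    version="1.0",
    provides=["libgl1"],
    conflicts=["xlibmesa-gl"],
    description="Empty package that satisfies libGL dependencies",
)
print(stanza)
```

The package ships no files at all. Its only job is to make the dependency resolver believe libGL is present, so the hand-installed libraries can sit underneath without dpkg complaining.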

Earlier this week I decided to bite the bullet and do a substantial upgrade on the machine. I have been very impressed with the 2.6 kernel, having run it on both of my laptops for a while now, so I figured it was past time to get the main machine on 2.6. My nvidia driver was over a year old, possibly two. The newer versions I'd tried out over that period had various issues, so I was overdue for a new one. At the same time I wanted to put a new version of X on the box for no good reason.

The first thing I did was install module-init-tools and updated versions of e2fsprogs, mount, alsa-libs, alsa-utils, and a couple of other things needed for a 2.6 kernel. Then I built the kernel. I actually did all of this remotely from work, so it was a crapshoot whether it would actually boot up and be functional when I rebooted it. Luckily it worked. After checking dmesg I made a couple of tweaks to the config and had my kernel. Then I installed XFree86 4.3 from the experimental tree. The X maintainer has finally broken X up into separate libGL, libGLU, and dri packages. This convinced me to finally package the nvidia driver as a real replacement for xlibmesa-gl and xlibmesa-gl-dev rather than just creating a meta package.

I spent all of Thursday evening doing that. By midnight I had xlibnvidia-gl and xlibnvidia-gl-dev packages to replace the xlibmesa-gl(-dev) packages. Friday morning I verified that the packages installed correctly. I am still toying with having a package that dynamically builds the kernel module and dri module but for now I'm doing that out of band. Saturday morning I was finally sitting in front of the workstation so I could test my changes. Unfortunately I was only building the kernel module at first and forgot about the dri module. That cost me a couple of hours figuring out why X wouldn't start. Sheer genius that was.

Once X was up and running I had to do a little tweaking, including putting a new asound.state in place so that audio started up right, and various other things. Then it was time to get winex running again. I'd previously had problems with its thread usage and libc6 2.3.2, so I figured I'd try the latest release of winex, 3.2.1, and see if the problem had been resolved. (Adding module-init-tools had upgraded libc6 to 2.3.3.) Of course this was a non-trivial task. I got a weird error, so I tried building the 3-1 branch and the trunk. All of them failed in a somewhat similar fashion. I assumed it was a problem with gcc 3.3 because I haven't used it much compared to 2.95. Unfortunately that wasn't it. Nor was it an error in the C code. Nor was it any of a hundred other things I thought it might be. Turns out nvidia ships an old version of glext.h. Once I replaced the out-of-date nvidia headers with the ones from xlibmesa-gl-dev, the code from the trunk built and ran without a problem. Then I discovered that the threading problem still existed. Luckily I found a solution: setting an environment variable that tricks libc6 into working.
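In hindsight, a quick check of the header versions would have saved me a lot of time. glext.h carries a GL_GLEXT_VERSION define, so two copies can be compared directly. Below is a minimal sketch in Python. The two sample header snippets and their version numbers are made up for illustration, and in practice you would read the real files from wherever your driver and xlibmesa-gl-dev put them.

```python
import re


def glext_version(header_text):
    """Return the integer from '#define GL_GLEXT_VERSION <n>', or None if absent."""
    match = re.search(
        r"^\s*#\s*define\s+GL_GLEXT_VERSION\s+(\d+)",
        header_text,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


# Hypothetical header contents; the version numbers are invented.
vendor_header = "#ifndef __glext_h_\n#define GL_GLEXT_VERSION 6\n#endif\n"
mesa_header = "#ifndef __glext_h_\n#define GL_GLEXT_VERSION 21\n#endif\n"

vendor = glext_version(vendor_header)
mesa = glext_version(mesa_header)

if vendor is not None and mesa is not None and vendor < mesa:
    print(f"vendor glext.h (version {vendor}) is older than the Mesa copy (version {mesa})")
```

A lower number only tells you the header is older. It does not tell you which missing definitions are breaking a given build, but it is a cheap first thing to rule out.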

After launching EverQuest I was quickly reminded of a couple of irritating unimplemented features in winex that spew many errors. I went back to my 3.1 tree and figured out what I had done to quiet it down a bit. I implemented a similar but uglier fix in the current code I was using and rebuilt it. I also tried using 3.2.1 but unlike the trunk code, that branch still didn't build. I fixed the errors in the C code enough to get it to build but introduced a bug in the winsock code. I gave up on that and went back to the functional trunk code. EverQuest seemed to be running smoothly on that.

After that I decided to make sure my other games still worked. Soldier of Fortune and Tribes 2 were smooth as silk. Quake 3 and Heretic 2 wouldn't run, though. I eventually figured out that I'd hardcoded which libGL file to use in my configuration for those games, and that path was now wrong. Quake 3 loaded fine after that (except for no sound, but then I haven't had sound in that game since alsa 0.4something).
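The hardcoded path is the kind of thing a launcher script could guard against. As a rough sketch, assuming a hypothetical helper called first_existing and made-up file locations, a wrapper could walk a list of candidate libGL paths and pick the first one that is really on disk instead of trusting a single stale entry:

```python
import os
import tempfile


def first_existing(candidates):
    """Return the first path in candidates that exists on disk, or None."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


# Demonstration with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    current = os.path.join(tmp, "libGL.so.1")
    open(current, "w").close()  # stands in for the library that is installed now
    stale = os.path.join(tmp, "old", "libGL.so.1")  # path left over from an old setup

    chosen = first_existing([stale, current])
    print("using", chosen)
```

This is not what I did. I just fixed the paths by hand, but a fallback list like this would have survived the upgrade.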

All in all this was a major pain in the ass and took a lot of effort. I frankly don't know how the average person ever manages to get stuff like this working. I do OS design for a living. I design and implement Debian packages from scratch just about every week. I build kernels and work on parts of the kernel with about the same frequency. I've been using the nvidia Linux driver since I first picked up a TNT card. I've been hacking on winex to make it work on my system since 2.0. It still took a substantial amount of effort for me to get all of this crap working. About the only thing I can say for it is that with enough effort it is at least possible to get it working. This is in direct contrast to my Windows box, where I had to just give up on the sound card after a few hours because it wouldn't initialize. I had to go back to using the sound built into the motherboard. When all you've got is binary code, you are pretty much screwed if it doesn't work.

At this point the only part of what I've done that is even useful to anyone else is a patched version of the nvidia package that does a smarter job of not blowing away existing GL and glx libraries on a box, available here. The xlibmesa-gl replacements are kind of worthless at the moment because I still need to make a dri package replacement with the nvidia module or a 'compile at install time' module. Hopefully I'll get that done this weekend.