Transterrestrial Musings  





Biting Commentary about Infinity, and Beyond!


In Linux Hell

My Red Hat server has not had X running on it for months. Whenever I would upgrade, it would be unable to start the X server, and would give indications of a hardware problem. I bought a new video card for it, with no joy.

I recently decided to upgrade to Fedora. It stopped as it was trying to load the X upgrades, with a fatal error, telling me that it was either bad media (the CD passed a media check), inadequate disk space (it's an almost-empty eighty gig drive), or a hardware problem. I replaced the motherboard with a different kind, thinking that the problem might be in the AGP section. Same result.

The worst thing is that since it only did a partial install, I can't boot it any more, except in rescue mode from the CD. All of my data is still there when I mount the partition, so I'm trying to figure out how to back it up and just do a clean install, in hopes that this will finally get me around whatever the problem is. Does anyone have any thoughts as to other options, or just what the issue might be?

[Update a few minutes later]

Is there some way to get it to bypass the X installation, so I can at least complete the Fedora upgrade, and then try to fix X separately? For instance, if I do an install instead of an upgrade, is there some way I could deselect those packages, but still preserve the data in /home?

Posted by Rand Simberg at December 14, 2004 09:23 AM

Listed below are links to weblogs that reference this post from Transterrestrial Musings.
Linux Hell
Excerpt: Rand Simberg is having serious Linux issues, originating, it seems, with some hardware problem or another. At this point, the main problem is retrieving data from a half-done upgrade to Fedora. Ouch.
Weblog: XTremeBlog
Tracked: December 14, 2004 09:26 PM
Comments

There are actually a large number of bootable Linux rescue CDs available, and Knoppix also nicely doubles for this task. With these CDs you should be able to connect to the network and access the data on your partition, which would allow you to back the data up to another machine.

Could you give us more details about the X hardware problem?

Posted by Neil Halelamien at December 14, 2004 09:36 AM

Two possible solutions:

1. Try to install an earlier version of Fedora on that partition.

2. Use Partition Magic or something of that sort (I really love the semi-free Ranish Partition Manager) to create another partition and install another version of Fedora or Red Hat or Slackware in the second partition; then mount the first partition and save your data.

Try 'em in that order, I suggest.

As for what specifically has gone wrong with your installation, I can't guess. I've loaded Fedora 1, 2, and 3 on my system without problems. Sorry.

Posted by mike shupp at December 14, 2004 09:40 AM

As I said, I can rescue with the Fedora CD, and mount the partition. The main reason I can't do a backup is that I don't have any other Linux machine running (I've got a different set of problems with my firewall, which I've been unable to revive since the CPU fan died and killed the processor). I've got an older Debian box that just needs a power supply, so I guess I'll fire that up and hope it has enough drive to do a backup via SCP.

Posted by Rand Simberg at December 14, 2004 09:49 AM

Boot with the Fedora CD in rescue, mount the partition, then edit /etc/inittab:

change:
id:5:initdefault:

to:
id:3:initdefault:

Then CTRL+D and reboot into multiuser console mode. If multiuser doesn't work try '1' which is single user mode.
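
The inittab change above can also be made non-interactively from the rescue shell. This is only a minimal sketch: set_default_runlevel is a hypothetical helper name, and the /mnt/sysimage path in the usage comment is an assumption (use whatever mount point rescue mode reports).

```shell
# Hypothetical helper: set the default runlevel in an inittab file.
# Usage (the path is an assumption; use the mount point rescue mode reports):
#   set_default_runlevel /mnt/sysimage/etc/inittab 3
set_default_runlevel() {
    inittab="$1"
    level="$2"
    # Keep a copy of the original so the change can be undone.
    cp -p "$inittab" "$inittab.bak" || return 1
    # Rewrite only the initdefault line; all other lines pass through unchanged.
    sed "s/^id:[0-9]*:initdefault:/id:${level}:initdefault:/" "$inittab.bak" > "$inittab"
}
```

The original file is kept as inittab.bak, so copying it back restores the graphical default.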

It is possible to check the CD-ROMs to try to catch errors before an install. There are utilities to do this from Windows or Linux. Alternatively you can check them before doing the install as well.

To know how much disk space you have left, use:
df -h

Posted by Gojira at December 14, 2004 11:30 AM

Thank you, Gojira. I'll be trying all that.

I did check all the media before I did the install, though, and I rechecked Disk 2 (the one on which it hangs) after the failure.

By the way, there's an old hackers' trick to log in to a machine on which you've forgotten the root password. I forget what the boot init is for it, but I'm trying to resurrect one that I haven't used in a couple years, and can't get in. If you, or anyone knows, I'd appreciate it.

Posted by Rand Simberg at December 14, 2004 11:47 AM

Changing the default initlevel doesn't help. When it boots with "3" I get a message "Kernel panic: No init found. Try passing init= option to kernel."

Posted by Rand Simberg at December 14, 2004 12:30 PM

"Kernel panic: No init found. Try passing init= option to kernel. Or insert MS Windows XP setup disc and press any key to continue."

Posted by Josh "Hefty" Reiter at December 14, 2004 12:45 PM

If you can at least get the machine up and attached to the network, you can always FTP the files from the linux machine to any other machine in the network. There are at least a couple of free ftp servers for windows, or, if you're running win2k or better, you can use the built-in one. While kludgy, this will at least get your data off the machine so you can start from scratch.

Posted by Wesley at December 14, 2004 01:24 PM

Windows 2000 has an FTP server? How do you enable it?

It's been so long since I've used FTP that I didn't think about it, but I guess it would be OK for an internal file transfer.

Posted by Rand Simberg at December 14, 2004 02:03 PM

Rand,

Windows 2000 does have a built-in FTP server. You can find instructions for how to set it up at http://techrepublic.com.com/5100-6268-1034519.html. Beyond that, if you have any questions, feel free to drop me an email. I have experience with both linux and windows, and would be glad to help.

Posted by Wesley at December 14, 2004 02:07 PM

The hack is a kernel option. At lilo or Grub startup add 's' or '-s' to the command line for the kernel. Can't reboot now to check... but I think lilo redhat gives you an option to type ctrl-x to edit the command line options to the linux kernel for grub... and tab for lilo... though lilo hasn't been there since... what 7.1?

If I were you I'd install Fedora from scratch. You can do the partition dance but you don't have to. Just boot into emergency and write down what partitions are what. Then on the install part use those partitions for that and do not format the home partition. Should get you at least back to a bootable machine with an intact home partition.

Were you not able to remove the X packages with RPM... that way you would have been able to skip them in the upgrade. Usually I look for the X packages with 'rpm -q -a | grep X' or just rpm -e XFree86 and when it complains about dependencies just add those to the command line then you can get them all.
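
A hedged sketch of the package search described above. filter_x_packages is a hypothetical helper that reads a package list (such as the output of rpm -qa) on standard input and keeps only the names that begin with XFree86; that prefix is an assumption about how the X packages are named on this particular system.

```shell
# Hypothetical helper: keep only package names that start with "XFree86".
# Reads a package list on stdin, e.g. the output of `rpm -qa`.
filter_x_packages() {
    grep '^XFree86'
}

# Typical use on the affected system (not run here):
#   rpm -qa | filter_x_packages            # review the list first
#   rpm -e $(rpm -qa | filter_x_packages)  # then remove those packages
```

Reviewing the list before removing anything matters here, since rpm will still refuse the removal and name further dependent packages, as the comment says.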

I did the upgrade dance once after installing the gnome helix stuff... really bad idea. I ended up doing the first above option... redhat 7.3 - 8 I think.

Posted by at December 14, 2004 02:50 PM

For the initrd issue try this. Sounds right even for RedHat.

http://lists.debian.org/debian-user/2000/04/msg00239.html

Posted by at December 14, 2004 02:54 PM

Were you not able to remove the X packages with RPM... that way you would have been able to skip them in the upgrade. Usually I look for the X packages with 'rpm -q -a | grep X' or just rpm -e XFree86 and when it complains about dependencies just add those to the command line then you can get them all.

Rand's issues with his Linux install seem more serious than that since he gets the kernel panic complaining of a missing init even when booting in console mode. X should only be an issue much later on the boot process.

I agree with your suggestion about the initrd issues.

I also agree that a reinstall is a good idea, failing that. If you have a separate /home partition this is indeed easy. Just format the / partition and do a clean install.

If you do not have a separate partition for /home, I suppose the easiest solution is to connect a spare hard disk to your computer and make a mirror tarball of /home in the spare hard disk. Then format /, do a clean install and restore /home from the spare hard disk mirror.
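
The tarball approach above could look roughly like this. It is only a sketch: backup_home and restore_home are hypothetical helper names, and the mount points in the usage comments are assumptions to be replaced with wherever the old root and the spare disk are actually mounted.

```shell
# Hypothetical helper: archive the home directory of a mounted installation.
#   $1 = root of the old installation (e.g. /mnt/sysimage, an assumption)
#   $2 = archive file to create on the spare disk
backup_home() {
    root="$1"
    archive="$2"
    # -C changes into the old root so paths in the archive start with "home/";
    # -p preserves permissions and -z compresses with gzip.
    tar -C "$root" -czpf "$archive" home
}

# Hypothetical helper: unpack that archive under a freshly installed root.
#   $1 = archive file, $2 = root to restore into (e.g. /)
restore_home() {
    archive="$1"
    root="$2"
    tar -C "$root" -xzpf "$archive"
}

# Example (paths are assumptions, not run here):
#   backup_home /mnt/sysimage /mnt/spare/home.tar.gz
#   restore_home /mnt/spare/home.tar.gz /
```

Listing the archive with tar -tzf before formatting anything is a cheap way to confirm the backup is readable.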

Posted by Gojira at December 14, 2004 03:48 PM

To change the root password, try booting with your RescueCD and then use chroot /mnt/sysimage (or whatever the mountpoint is called). You should now be able to just use 'passwd' to change the root password.

NOTE: Knoppix (3.6) probably isn't your best option to recover if you use LVM volumes.. I had a problem myself when upgrading from FC2 to FC3 (using apt-get).. Knoppix has the /dev directory read-only on the CD, and LVM needs to be able to create the device nodes in /dev/volumegroup/logicaldrive, which doesn't work on read-only drives :-)

Posted by at December 15, 2004 02:09 AM

To change the root password, try booting with your RescueCD...

That would be on the machine that happens not to have a CD drive, right? ;-)

See what I mean about Linux hell? It's always something...

I guess I should go out and pick up a couple old used drives and throw them in so all of my machines are CDable.

Posted by Rand Simberg at December 15, 2004 05:00 AM

Oh, I should add that, unfortunately, /home doesn't have its own partition. The only separate partitions (besides swap) are /tmp, /boot and /var.

Posted by Rand Simberg at December 15, 2004 06:27 AM

Or even a couple of *new* CD drives. They're getting so inexpensive they could practically be used as prizes in boxes of Cap'n Crunch. IIRC the shipping for the last one I bought was more than the price of the drive... there's a huge glut because everyone is switching to DVD.

As for the rest of it: I'm truly sorry the Linux distributors are doing such a thoroughly lousy job. There's no reason anybody should have to go through any of that nonsense; that's just as bad as what Micro$hit is doing with their OS.

Posted by 42nd SSD at December 15, 2004 09:22 AM

Rand - as far as the X stuff goes, I can only wish you luck. with regard to your other query, you need to get in to single user mode, aka runlevel 1.

if the machine in question is running lilo, with no lilo password, you'll want to hit tab as soon as the lilo prompt comes up. tab should list the kernel images that are available. so, to boot the machine in single user mode, you'll put in the image name followed by single. for a default redhat install that would be "linux single" - not sure what convention debian uses for its lilo stanzas, but you should be able to extrapolate.

if you're using grub, it's a little more complicated. when the grub prompt comes up, press a key to interrupt the boot process. it should give you a list of all the kernels. highlight the image you want to boot, and press e. this should bring up a new page. the second line should be a list of all the boot time parameters grub is passing to the kernel. highlight it and press e again. append the word single to the end of the line. press enter. press b.

either of these should get you logged in as root, at which point you should set the password to something you can remember.

Posted by Mat Fulghum at December 15, 2004 09:23 AM

I actually had a very similar experience a couple of years ago. At that point I had actually installed X from source, and XFree86 had some ideas about where files should be that didn't mesh well with Red Hat's ideas (which were founded in the FHS standard). Somewhere along the way a directory got replaced by a symlink, and that made Anaconda die a horrible death. Is the old X version installed from a Red Hat RPM or did it come from somewhere else? If it came from another RPM, maybe it can still be removed using RPM. Otherwise, if you built from source or used binary tarballs from XFree86, perhaps manually deleting the entire /usr/X11 file hierarchy and removing everything XFree86 from the RPM database (maybe something like 'rpm -e --allmatches --justdb XFree86*') would do the trick. I sure can't promise that this will work, but it might be worth a shot since your other option as of yet seems to be a full reinstall.

Good luck!

Posted by Per at December 15, 2004 10:59 AM

aww crud... just realized I started using lvm and don't have a separate partition for home either... ah well. The best recovery method if you are really sol is another drive with another installation on it that you can boot to or do a fresh install on to. Otherwise I'd say scrap the current install: boot the cd, scp your data to somewhere and just reinstall. It's always good to have multiple computers at home...

Posted by at December 15, 2004 12:26 PM

Anaconda gets very angry if your RPM database isn't squeaky-clean. If you have any dependency errors or the like it goes downhill in a hurry. That said, if you keep things clean it can do minor miracles (I just upgraded my laptop directly from RH 7.3 to Fedora Core 3 and it worked great. Just had to upgrade the NVidia driver to get OpenGL back afterwards).

For the problem described in particular, to recover your data: plug the HDD into a working Windows machine. Download a Windows program called "ExploreExt2FS" - it'll open a normal working Explorer window showing any Ext2/3 partitions and you can drag your files on to the Windows machine's drive no problem. Then reformat the drive and do a fresh install of Fedora Core.

Posted by Ian S. at December 15, 2004 12:40 PM

I'm guessing, based on some of the comments here, that if I don't have a hardware problem, the issue is a tarball install of X many moons ago that isn't playing nicely with the packages.

That Windows program for reading ext file systems sounds pretty slick. That may solve the problem (though I still want to get that other Debian machine up).

Posted by Rand Simberg at December 15, 2004 12:57 PM

Now you know why I've always run Debian and always had a separate /home partition on my linux boxen.

Posted by Phil Fraering at December 15, 2004 02:25 PM

The error on the second disk was most likely an intermittent read error on the CD. I've had this happen occasionally, but I've done a lot of installs. Either try again with the same disks, or burn a second copy and use that.

If you made it through the initial menus during the installation, you don't have an X problem - the Redhat and Fedora install processes both start an X server early in the installation process.

The reason you don't have an initrd is that this is built as the last step of the installation. initrd is a compressed filesystem that contains drivers needed for the kernel to boot that are not built into the kernel (typically, SCSI disk drivers). It's not hard to build an initrd file, but doing so is beyond a short explanation. The way Redhat/Fedora are set up, you can't boot without an initrd.

I would try a few simple things to get the machine up. If you don't have any data stored on the root partition, tell Fedora to format the root partition during the install. This will remove any potential conflicts with previous installations. If you have it, formatting the usr partition is also a good idea. However, make sure you *don't* format your data partition. If your data is on the root partition, consider booting the rescue disk and manually deleting all the system directories on your hard disk (/etc, /usr, /bin, /sbin, /var, /lib, but *not* /home).

You can do a simple install by selecting a custom installation and doing a minimum install. For Fedora Core 1, this only uses the first CD. However, it installs almost nothing - no X server, no compilers, very little more than a bootable system.

As for the system where you have forgotten the root password, you need to say what the boot manager is. Is it LILO, or Grub?

Posted by Jim at December 16, 2004 08:02 AM

As for the system where you have forgotten the root password, you need to say what the boot manager is. Is it LILO, or Grub?

It's LILO, but I solved that one finally by booting with "init=/bin/sh" which gave me a shell as root from which I could then change the password.

I do have data in /home, and it's not its own partition, but I may try deleting the system directories as you suggest (though I'd also want to back up stuff from /etc, like my Samba configuration file).

I've tried repeated installations from the media, and checked it repeatedly. It always tells me that it's fine, and it always hangs as it's installing X, at the same point.

Posted by Rand Simberg at December 16, 2004 08:09 AM

This may be a very simplistic solution, but if all else fails I would move the working power supply to the Debian box, as well as setting the HD with your data as a slave and attaching it to the Debian box also. Boot it up, make appropriate changes in /etc/fstab and make a directory for it in /mnt/, and then try to mount the HD and suck all the data off of it.

I've had endless problems whenever I've tried to "upgrade" Linux systems. For me, I've always found it easier to save my data to another source, do a "clean" install, and then move my old data back.

Good luck.

Posted by tcobb at December 16, 2004 10:14 AM

If you are sure it is hanging while installing X, you can tell it not to install X. You are probably doing one of the normal installations (Workstation?). Do a custom installation instead. You will then get to a page that lets you select broad categories of things to install. I think one of these categories is the X server. Unselect that and continue the installation.

If you are adventurous, you can try to figure out what error is occurring during the install. Hitting Ctl-Alt-F1, -F2, -F3, and (maybe -F4, it's been a while) switches to various consoles that display some of the installation messages. Ctl-Alt-F7 should get you back to the X windows installation screen. I can't recall ever actually solving a problem this way, however.

Your best bet is still a clean install by deleting the system directories. You can save your configuration files by renaming /etc to, e.g., /etc.old rather than deleting it. If you have the disk space, you can rename other system directories as well, but there usually isn't much to save in them, with the possible exceptions of /usr/local and /boot.

Posted by Jim at December 16, 2004 06:59 PM

I didn't want to do an install, custom or otherwise. I was doing an upgrade, because I wanted to preserve my data in /home. If it's possible to do that with an "install," I'm all ears.

Posted by Rand Simberg at December 16, 2004 07:30 PM

It's possible to do an installation without overwriting the disk contents. That's what you need to do if you delete the system directories.

Start by booting the Fedora installation as a rescue disk. Look at your fstab file (in something like /mnt/sysimage/etc/fstab) and write down the name, device, and filesystem for each partition (columns 2, 1, and 3, respectively). Be sure to include the swap partition. Once you have done this, you can reboot into the installation process.

If you are upgrading from a Redhat installation, the first column of /etc/fstab will have weird looking entries like LABEL=root, and you can't get the device name from the file. Instead, type df. You will see a list of mounted partitions. The ones under /mnt/sysimage are those for your old installation. The device names are in the first column.
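
The note-taking step above can be done with a one-line awk filter instead of by hand. A minimal sketch, with list_fstab_entries as a hypothetical helper name and the /mnt/sysimage path taken from the description above as an assumption about where rescue mode mounted the old system:

```shell
# Hypothetical helper: print "mountpoint device fstype" for each fstab entry,
# i.e. columns 2, 1 and 3 of the file, in that order.
list_fstab_entries() {
    # Skip comment lines and lines with fewer than three fields (e.g. blanks).
    awk '$1 !~ /^#/ && NF >= 3 { print $2, $1, $3 }' "$1"
}

# Example (path is an assumption, not run here):
#   list_fstab_entries /mnt/sysimage/etc/fstab
```

Where the device column shows a LABEL= entry, the real device name still has to come from df, as described above.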

This time, do a clean install rather than an upgrade. Early in the installation, you will be asked about partitioning the disk. You are given two options, doing it automatically, or doing it with disk druid (I think that's what they call it). Choose the latter option. You will see a graphical presentation of the disks and partitions in your system. In turn, select each of the partitions that you wrote down above (using the device name from column 1 of /etc/fstab) and click the edit button. Enter the appropriate mount point (from column 2 of /etc/fstab) and make sure the correct filesystem is selected (it should be, and you can't change it without reformatting), and that you don't check the box to format the partition (if it is checked, uncheck it). Click OK, and do the remaining partitions, if any. When you are done, each disk partition should have a directory name next to it. I believe there is also a column that indicates which filesystems are going to be formatted. Only the swap partition should be marked for formatting.

Now when you install, it should install from scratch. The problem with this is that it will leave random files and programs from the previous installation, which is why you should delete or rename the system directories first. It will also overwrite any configuration files, so it would be a good idea to save the /etc tree before doing this.

If you have a CD-R in the computer, you might consider making a backup of /etc (and /home, if you can break it into small enough pieces) before doing the install. To do this, you would need to boot something like a Knoppix CD.

I assume you are updating a fairly old version of the OS, so you probably have ext2 file systems. You should consider switching to ext3 file systems after a successful installation. All you need to do is run, e.g., tune2fs -j /dev/sda1 on each file system, change the ext2 in /etc/fstab to ext3, and reboot.
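
The fstab half of that ext2-to-ext3 switch can be scripted. A hedged sketch: fstab_ext2_to_ext3 is a hypothetical helper that prints a converted copy rather than editing in place, and the commands in the trailing comments simply restate the steps above using the same example device name.

```shell
# Hypothetical helper: print a copy of an fstab with ext2 entries changed to ext3.
fstab_ext2_to_ext3() {
    # Only the third field (filesystem type) of non-comment lines is changed.
    # Note: awk rebuilds each modified line with single spaces between fields.
    awk '$1 !~ /^#/ && $3 == "ext2" { $3 = "ext3" } { print }' "$1"
}

# Typical sequence as root (not run here):
#   tune2fs -j /dev/sda1                        # add a journal; repeat per file system
#   cp -p /etc/fstab /etc/fstab.bak             # keep the original
#   fstab_ext2_to_ext3 /etc/fstab.bak > /etc/fstab
```

Writing the result from the backup copy avoids reading and truncating the same file in one command.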

This is all from memory. If you have trouble following it, or if you have questions about the process, let me know. I should have an install disk somewhere I can boot up and walk through the process.

Posted by at December 16, 2004 08:17 PM

