Category Archives: Personal

…As The Future Repeats Today

– Good morning, how may I ruin your life today?
– How about messing with my ATA-drives like you did a month ago?
– Hang on to your suspenders, here we go!

hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdb: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdb: dma_intr: error=0x84 { DriveStatusError BadCRC }
ide: failed opcode was: unknown
ide0: reset: success
EXT3-fs error (device hdb1): ext3_free_blocks_sb: bit already cleared for block 2045422
Aborting journal on device hdb1.
ext3_abort called.
EXT3-fs error (device hdb1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device hdb1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device hdb1) in ext3_free_blocks_sb: Journal has aborted
EXT3-fs error (device hdb1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device hdb1) in ext3_truncate: Journal has aborted
EXT3-fs error (device hdb1) in ext3_reserve_inode_write: Journal has aborted
EXT3-fs error (device hdb1) in ext3_orphan_del: Journal has aborted
EXT3-fs error (device hdb1) in ext3_reserve_inode_write: Journal has aborted
__journal_remove_journal_head: freeing b_committed_data
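
For anyone wondering what the hex codes in the log mean, here is a minimal sketch that decodes the ATA status and error registers into the names the kernel prints. The bit tables are an assumption based on the ATA register layout and the naming used by the old Linux IDE driver; the function names `decode_status` and `decode_error` are hypothetical and exist only for illustration.

```python
# Sketch: decode ATA status/error register values like those in the log above.
# The bit-to-name tables are assumptions modelled on the old Linux IDE driver's
# output; they are not taken from this post.

STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ABORT_BIT = 0x04  # command aborted, printed as "DriveStatusError"
CRC_BIT = 0x80    # interface CRC error when seen together with the abort bit


def decode_status(value):
    """Return the names of all bits set in the status register."""
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_error(value):
    """Return the names of all bits set in the error register."""
    names = []
    if value & ABORT_BIT:
        names.append("DriveStatusError")
    if value & CRC_BIT:
        # Assumed convention: 0x80 means BadCRC only if the abort bit is set.
        names.append("BadCRC" if value & ABORT_BIT else "BadSector")
    if value & 0x40:
        names.append("UncorrectableError")
    if value & 0x10:
        names.append("SectorIdNotFound")
    if value & 0x02:
        names.append("TrackZeroNotFound")
    if value & 0x01:
        names.append("AddrMarkNotFound")
    return names


if __name__ == "__main__":
    print("status=0x51", decode_status(0x51))
    print("error=0x84", decode_error(0x84))
```

Under these assumptions, 0x51 and 0x84 decode to exactly the flags shown in the log. A BadCRC flag usually means data was garbled in transit between drive and controller, not that the disk surface is bad, which would fit a suspect cable or controller.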

It seems I simply cannot catch a break with ATA drives lately. This happened to my brand new Maxtor drive today, although I strongly suspect the IDE controller on the motherboard is the culprit in this case.
The partitions on the drive were suddenly remounted read-only by the kernel. After I rebooted my system, parted wouldn’t even read the partition table and gpart appeared to hang while scanning the drive. I decided to upgrade my 2.6.11 kernel to 2.6.12 just in case it was a driver problem, although I very much doubted it, as the motherboard has an Intel 801 controller, which is quite a mature chipset. After the reboot I was presented with the kernel output shown above. Strangely, parted would now read the partition table. Thanks to this I was able to run fsck on both partitions on that drive (/home and /public). It did find a lot of problems, but most of them were probably caused by files that had only been partially written when the read-only remount took place.
So far I haven’t found any corrupted files on the drive and I consider myself lucky. I’m going to spend the rest of the day burning DVDs. Luckily it’s raining outside so it’s not like my spirits were high to begin with.

Hard drives suck and then you die

So I ordered a new hard drive for my server about a week ago. It was about time to upgrade. Actually, I should have done it long ago, but hard drives make a large dent in your wallet. It arrived on Friday, which made me happy since I would have the entire weekend to install it. Little did I expect that a 10-minute job would eat up most of Friday night and the entire Saturday.
The idea was to replace my Western Digital Caviar 80GB drive with my brand new Maxtor 200GB drive, then retire the old 30GB IBM drive and replace it with the Caviar. A perfect plan. I’ve had the Caviar for 2-3 years, and the main reason I went with that model was that it came highly recommended by several sources I trusted. It served my home directory on the server as well as the Subversion and CVS repositories. The IBM drive hosted my root partition, and it was getting really crammed since I also kept the web root and some other things on it.
By now I think most of you have figured out what happened to me, and if you haven’t, I’m about to get to the point, so don’t worry. Before we get down to the gory details I would like to point out that if there are two things I trust in my server, it’s the motherboard, a trusty Tyan Tiger S1834 which has yet to cause me a single problem, and the WD Caviar which, like I said, came highly recommended.
I powered down the server, like any sane admin would before replacing an IDE drive, and went on to mount the new Maxtor drive. My plan was to migrate all data on the Caviar to the Maxtor, repartition the Caviar and then migrate my server from the IBM to the Caviar. Now, unlike an actual system administrator working for some company with lots of money, I cannot afford a lot of redundancy. In fact, all the redundancy I could afford was to run daily backups of my CVS and Subversion repositories as well as my personal mail. As I pointed out earlier, these were stored on the Caviar, so naturally the backups were made to the IBM drive. With a measly 30GB drive that was already holding a lot of other stuff, and considering that I didn’t own a DVD burner until recently, you can understand why I chose to limit the backups to just these three things.
Back to the story. After mounting the new drive, I went on to boot my server. I pressed the power button and all of a sudden I heard a weird clicking noise coming from one of my hard drives. Naturally, I blamed the new drive. I quickly turned the power off, made sure that all cables were connected properly and turned the power back on again. The clicking noise did not go away. The BIOS screen popped up (yes, my server is a PC, in case you didn’t realize that when I mentioned the Tyan motherboard) and to my shock only the IBM and Maxtor drives were detected! The clicking went on a couple more times, and then I could hear the drive spin down and the clicking stopped.
I rebooted once more, but the process repeated itself exactly like the first time around. I was starting to panic, but I thought/hoped that it could be a problem with the IDE controller on the motherboard. I can always replace a motherboard, but replacing a hard drive with data on it is a whole different ballpark. Again, no matter which channel I used, it was the same story.
I feel that this story is already long enough, so I’m going to cut to the conclusion right now. Besides, it’s getting late and I do want to get some sleep after this terrible day.
The Caviar is dead, no question about it. With the exception of my CVS and Subversion repositories and personal mail (thank god!), I have lost a lot of things. I’m still trying to recall what I was actually keeping on the drive that I might miss. One thing I do remember, and that I have now lost (unless by some miracle the hard drive comes back to life), is all of the pictures I took at Smart Networks Developer Forum Europe.

So what have I learned from this horrible experience?
Making backups is great, and I love hdup. Without it I would have lost everything; now I have just lost some important files, but far from all of them. I would especially like to thank Mikael Karlsson for introducing me to hdup in the first place. From now on I’m going to make backups on DVD-RW at more regular intervals and restructure my file system a bit so that I have better control over what is actually getting backed up. Never again am I going to blindly trust a hard drive, no matter how good someone says it is. Hard drives suck.
In the end, things could have been a lot worse. I have accepted the fact that I lost some things that cannot be replaced and others that can. Even though it’s been a bad day, a new one starts tomorrow, and then another and another, until eventually I will never have to worry about something as silly as data stored on a hard drive ever again.

Peace out.

Third’s a charm

As I hope most of you know, the old site has been down 90% of the time for the last couple of months. This started soon after I added my blog, which more or less killed my blogging spirit. After having clicked the OK button of the “The mail server cannot be reached” dialog a couple of hundred times, I got fed up and decided to move it all to my own server, and in the process I also registered a couple of new domain names 🙂
This “new and improved” site will be based on WordPress, unlike the old site, which was a merger between my own code and pivot log. I will add some of the original content to the new site, but since I hadn’t finished the original site anyway, most of it will be brand new. This of course all depends on how flexible WordPress is, but so far I’m pretty impressed with its capabilities.
Right now availability will probably be a bit flaky because I’m in the process of upgrading my server, but I expect it to be back to 99% uptime by next week.