dholm.com
I need this baby in a month send me nine women!

03 Dec 07 High availability network solutions

A while back I was tasked with prototyping a system for transferring large amounts of data across the Internet to a wide array of nodes without making any assumptions about how they were connected. This took some research on my part, as I hadn’t really designed anything network-wise that had to hold up under extreme load or service a huge number of simultaneous connections. During the investigative period I came across a couple of particularly well-written articles which I would now like to share with you.

The first one is “High-Performance Server Architecture” by Jeff Darcy. This is a good introduction to the subject and mainly covers how to manage resources. It will help you avoid the most common mistakes.

After that we have “The C10K problem” by Dan Kegel. This article digs a little deeper and offers many recommendations on how to handle ten thousand or more simultaneous client connections by leveraging facilities already present in many of the major *NIX operating systems. This is a typical don’t-reinvent-the-wheel scenario: the OS already has several solutions canned and ready for you, as long as you know where to look.

Finally I consulted CiteSeer and found a couple of really good articles at a somewhat more scientific level which handed me the last pieces of the puzzle. As I can’t divulge too much about our system in particular, I’m going to leave the more specific articles out of this blog post.

To top it all off I want to share this excellent but unrelated link to “Capturing that Special Moment”.

28 Nov 07 OpenBSD woes

Encouraged by the responses to my previous post, I installed OpenBSD on the 250 again, and this time I compiled a multiprocessor-enabled kernel from -current and it worked! So now I’m back on OpenBSD and it feels great. :)

I also found that the AR5212 WiFi chipset is one of the chipsets supported by OpenBSD, and as it happens I bought a D-Link DWL-G520 a couple of years ago that has never been put to any good use, so I decided to install it in the 250. A huge Sun machine with a small WiFi antenna on the back looks kind of cool in my opinion. Sadly the ath driver is not as stable as I had hoped, so it will have to be left disabled for the time being. So no replacing the Linksys just yet.
There is of course also the possibility that the instability is caused by a problem in -current; I’ll just have to wait and see.

MySQL seems to require a significant amount of processing power, as it is a constant bottleneck when serving pages from WordPress. There is a very noticeable latency whenever I load anything dynamic that requires data from the database, whereas other pages come up instantly. I guess I’ll have to dig through the MySQL documentation to find out how to optimize it, especially considering that at the moment it is very conservative with memory, much more so than it needs to be.

27 Nov 07 Operating system evaluation on Sun Enterprise 250

I installed OpenBSD 4.2 on the Sun Enterprise 250 two days ago but after having fiddled around with it a bit I realized that it didn’t come with SMP support for SPARC64. That is a huge shame because I really like OpenBSD and it felt like the perfect fit for this machine but I can’t have one CPU sitting there unutilized.

So then I went on to install Solaris 10, which turned out not to work at all, probably because the Permedia Raptor (GFX-8P) is not supported. I downloaded the Solaris 9 distribution instead; thank god it hasn’t reached EOL yet. Solaris 9 worked better, but after installing a bare system I realized that it pretty much expects you to do a full install in order to get the management console and whatnot. Why do I need X11, CDE and a whole bunch of other crap just to run a web server?

After this slight disappointment I decided to give Linux a quick spin. I really don’t want to run Linux on this machine, as I already have plenty of Linux boxes around, but at least it comes with SMP support. Same story as with Solaris 10: Linux did not agree with the Raptor and all I got was a black screen with little green men running around on it.

As a final course of action I tried both FreeBSD and NetBSD. It turns out NetBSD doesn’t have SMP support either, and apparently it doesn’t support keyboards, as it didn’t respond to mine at all. FreeBSD suffered from the evil Permedia curse. Now I’m back to installing Solaris 9 and longing for the day when OpenBSD supports SMP on SPARC64. Maybe that would be an interesting future project to take on... *evil grin*

04 Sep 06 ATI XPRESS 200M and the HP mess

Everyone who knows me knows about the troubles I’ve had with my HP Pavilion zv6148EA laptop and its built-in ATI XPRESS 200M GPU. The chipset has 128MB of dedicated RAM, but you can configure it to use up to 128MB of system RAM as well, for a total of 256MB. The first issue with this scheme is that there is a problem with either the GPU, the BIOS or the video BIOS, because the device always reports that it has 256MB of RAM. The HP-branded ATI drivers that came with the Windows install seem to handle this just fine, but when you use ATI’s drivers on Linux (you can’t use the official ATI drivers on Windows as they refuse to install) the machine will deadlock unless you assign the GPU 128MB of system RAM so that it actually totals 256MB. My assumption is that HP modified ATI’s drivers to detect the actual amount of RAM rather than fixing this damn bug properly. I have no use whatsoever for 256MB of video RAM, so throwing away 128MB of valuable system RAM sucks big time.
I recently found this blog, created by someone with the exact same problem as me, which not only confirms some of my fears but also asserts that this is not an ATI driver problem but rather a video BIOS bug.

20 Jul 06 Update on Linux scheduling

If you want to know more about the Linux O(1) scheduler I can highly recommend reading Inside the Linux scheduler by Tim Jones. It is an excellent introduction to the world of scheduling in the Linux kernel without unnecessary filler clogging your brain.

19 Jul 06 Firefox meets Exposé

If you are like me and think that Apple’s Exposé is the most important desktop feature since multiple desktops then you have got to install Reveal in your favorite Mozilla-based browser today! Trust me, it’s worth the effort.

Reveal

29 Jun 06 Benchmark Galore

I needed a simple benchmark for testing transfer speeds when reading multiple files in parallel from a single block device, but none of the available ones quite did what I wanted, so of course I had to write one myself.

You will need a half-decent C++ compiler and the invaluable Boost C++ Libraries in order to compile the benchmark. Pass the files to be read as arguments to the application; each file will be assigned its own thread of execution.
Download the benchmark source code here. Constructive criticism, ranging from the validity of the benchmark to coding style, is always welcome!
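
The original source is not reproduced here. What follows is only a minimal sketch of the one-thread-per-file idea, assuming std::thread from the standard library in place of Boost's threads; the struct, the function names and the 64 KiB buffer size are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Result for one file: how many bytes were read and how long it took.
struct ReadResult {
    std::size_t bytes = 0;
    double seconds = 0.0;
};

// Reads one file sequentially to the end, discarding the data.
// The 64 KiB chunk size is an arbitrary assumption.
ReadResult read_whole_file(const std::string& path) {
    std::vector<char> buffer(64 * 1024);
    ReadResult result;
    const auto start = std::chrono::steady_clock::now();
    std::ifstream in(path, std::ios::binary);
    while (in) {  // false if the file could not be opened or after the last chunk
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        result.bytes += static_cast<std::size_t>(in.gcount());
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    result.seconds = elapsed.count();
    return result;
}

// Starts one thread per file so that all files are read concurrently,
// then waits for every thread to finish.
std::vector<ReadResult> read_in_parallel(const std::vector<std::string>& paths) {
    std::vector<ReadResult> results(paths.size());
    std::vector<std::thread> workers;
    workers.reserve(paths.size());
    for (std::size_t i = 0; i < paths.size(); ++i) {
        // Each thread writes only to its own slot, so no locking is needed.
        workers.emplace_back([&results, &paths, i] {
            results[i] = read_whole_file(paths[i]);
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return results;
}
```

Dividing bytes by seconds for each entry gives a per-file throughput. Bear in mind that the operating system's page cache will inflate the numbers for files that were read recently.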

29 Jun 06 A look at the latest in free real-time scheduling in the Linux kernel

Background

Real-time capability in the Linux kernel is not in itself a new concept. Projects such as RTAI and FSMLabs’ RTLinux have existed for quite some time and they perform very well. The problem has been that the Linux kernel itself could not offer the deterministic latencies that most real-time systems need, so most implementations have been based on a layer of abstraction. By letting interrupts be trapped by a nanokernel instead of by Linux, it has been possible to add deterministic real-time threads without having to perform major surgery on Linux itself. In the world of RTAI, ADEOS is used for this very purpose. The drawback of these approaches, at least in the free real-time implementations, was that real-time threads had to be implemented in kernel space, which made them difficult to work with since it wasn’t always possible to reuse existing code written for user space.

Things started changing when MontaVista released the preempt patch for the 2.4-series kernel. The preemption patches for 2.4 allowed kernel threads to voluntarily yield execution, and forced preemption on sleep or on interrupts. This was of course not a perfect implementation: true preemption should be possible at any time, except while executing sensitive regions of code which must be allowed to complete in order not to leave the system in an invalid or partially invalid state. The 2.4 kernel, however, was not quite ready for this yet.
Two other important changes were Ingo Molnár’s O(1) scheduler and low-latency patches. The O(1) scheduler is a very fast scheduler based on table lookups. It is important to have a fast scheduling algorithm when using preemption, especially when running with a high timer frequency such as 1000Hz, since the scheduler is going to be called a lot and must not become a bottleneck. The low-latency patch modifies large protected loops and similar bottlenecks in the kernel so that the scheduler can intervene at a safe place if the loop runs for a long time: a sort of voluntary yield of execution when necessary.
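
To make "table lookups" a little more concrete: the idea is to keep one queue per priority level plus a bitmap recording which queues are non-empty, so that picking the next task is a find-first-set-bit operation followed by a dequeue. The class below is a toy model of that idea only, not kernel code. The name RunQueue, the 64 priority levels and the use of the GCC/Clang builtin __builtin_ctzll are all assumptions made to keep the sketch short.

```cpp
#include <array>
#include <cstdint>
#include <deque>

// Simplified run queue with 64 priority levels (0 is the highest).
// 64 levels is an assumption that lets the bitmap fit in one word.
class RunQueue {
public:
    // Appends a task to the queue for its priority and marks that
    // priority as non-empty in the bitmap. `priority` must be below 64.
    void enqueue(int task_id, unsigned priority) {
        queues_[priority].push_back(task_id);
        bitmap_ |= (std::uint64_t{1} << priority);
    }

    // Removes and returns the next task of the highest non-empty priority,
    // or -1 if no task is runnable. The cost does not depend on how many
    // tasks are queued.
    int pick_next() {
        if (bitmap_ == 0) {
            return -1;
        }
        // Index of the lowest set bit = highest runnable priority.
        const unsigned priority = static_cast<unsigned>(__builtin_ctzll(bitmap_));
        std::deque<int>& queue = queues_[priority];
        const int task_id = queue.front();
        queue.pop_front();
        if (queue.empty()) {
            bitmap_ &= ~(std::uint64_t{1} << priority);  // level is now empty
        }
        return task_id;
    }

private:
    std::array<std::deque<int>, 64> queues_;
    std::uint64_t bitmap_ = 0;
};
```

Tasks at the same priority come out in the order they were queued, and a task at a numerically lower priority always comes out first, regardless of how many tasks are waiting.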

All these concepts were merged in the 2.5 development series of Linux, and preemption was developed further in order to implement true preemption instead of a semi-voluntary one. The next big hurdle to overcome was lock breaking. Critical sections in the kernel are protected by locks so that in a multiprocessor system two or more processors cannot access, and possibly corrupt, the same resource at once. Similar rules apply to preemption: it is not always appropriate to preempt the current kernel thread, and therefore locks are also used to prevent preemption. The problem with this scheme is that if a lock is held for too long it will delay the scheduler from running and consequently increase latency. The lock breaking done by the low-latency patch was therefore very valuable, since that work had already tracked down many of the bottlenecks caused by holding locks for too long and found points where the lock holder could safely be preempted.
All these changes meant that Linux was suddenly capable of worst-case response times of under 10ms for real-time threads, which is good, but not good enough.

Ingo Molnár to the rescue

When the rest of us had given up on deterministic real-time scheduling in user space, Ingo Molnár returned to save the day yet again. Ingo’s patch uses mutexes instead of spinlocks to protect critical sections. As a result, critical sections can now be preempted in much the same way as a user-space thread can be preempted at any time. He has also introduced priority inheritance to avoid the priority inversion problems that could otherwise occur when a critical section is preempted. Another really nice feature is that IRQs are presented as schedulable entities in the system, and you can modify their priority using chrt from Robert Love’s schedutils.

Results

These measurements were made on two similar 2.4GHz Intel Pentium 4 systems using Mark Hounschell’s rt-exec (1.0.3) test for finding the “deterministic real-time capabilities of a computer”. The realtime-preempt tests were run at both 250Hz and 1000Hz.

2.6.16 stock kernel with Gentoo patch set at 250Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:10:36:237  NonHR-clk   14:40:22  Work:200   CPU:04 Avg:04 Max:18 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:250Hz      Exec Revision:1.0.3-1      │
├──────────────────────────────────────────────────────────────────────────────┤
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    159237      0    3767       3     3      15      12│
│ task1     dth  29 F  1     39810      0    3989      15    14      82      68│
│ task2     dth  28 F  1     39809      0    3988      15    14      82      68│
│ task3     sth  25 F  1     79619      0    1987      13    13      82      69│
│ task4     sth  24 F  1     79618      0    1987      13    13      78      65│
│ task5     sem  23 F  1    159237      0     988       6     6      61      55│
│ task6     sem  22 F  1     79619      0    1988       5     5      34      29│
│ task7     sig  19 F  1     79619      0    1988       6     5      35      30│
│ task8     sig  18 F  1     79618      0    1987       6     6      38      32│
│ task9     hrt  17 R  1    159308      0    1988    1977  1566    2151     585│
│ task10    hrt  17 R  1    159301      0    1987    1976  1586    2150     564│
│ task11    hrn  14 R  1    159295      0     988    2981  2450    3167     717│
│ task12    hrn  14 R  1    159287      0     988    2981  2560    3157     597│
│ task13    hru  11 R  1    159280      0     988    2981  2338    3186     848│
│ task14    hru  11 R  1    159272      0     987    2981  2227    3187     960│
│ task15    bth   7 R  1    159237      0     988      12    11      75      64│
│ task16    bth   7 R  1    159237      0     988      36    35     203     168│
└──────────────────────────────────────────────────────────────────────────────┘

2.6.17 kernel with Gentoo patch set and realtime-preempt at 250Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:12:56:243  Posix-hrt   15:34:51  Work:200   CPU:05 Avg:06 Max:07 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:250Hz      Exec Revision:1.0.3-1      │
├──────────────────────────────────────────────────────────────────────────────┤
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    194243      0    3837       1     0     174     174│
│ task1     dth  29 F  1     48561      0   15993      40    18     450     432│
│ task2     dth  28 F  1     48561      0   15991      35    17     522     505│
│ task3     sth  25 F  1     97122      0    7993      24    14     300     286│
│ task4     sth  24 F  1     97121      0    7992      19    14     311     297│
│ task5     sem  23 F  1    194243      0    3994       9     3     294     291│
│ task6     sem  22 F  1     97122      0    7993       7     3     193     190│
│ task7     sig  19 F  1     97122      0    7992      14     7     227     220│
│ task8     sig  18 F  1     97121      0    7993      14     7     227     220│
│ task9     hrt  17 R  1     96035      0    7982      29    14     261     247│
│ task10    hrt  17 R  1     96032      0    7992      43    16     330     314│
│ task11    hrn  14 R  1    191591      0    3994      31     9     340     331│
│ task12    hrn  14 R  1    191582      0    3984      21    10     350     340│
│ task13    hru  11 R  1    191570      0    3993      29     9     308     299│
│ task14    hru  11 R  1    191562      0    3983      20     9     343     334│
│ task15    bth   7 R  1    194243      0    3992      31    15     386     371│
│ task16    bth   7 R  1    194243      0    3991      18    16     355     339│
└──────────────────────────────────────────────────────────────────────────────┘

2.6.17 kernel with Gentoo patch set and realtime-preempt at 1000Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:07:43:731  Posix-hrt   14:37:05  Work:200   CPU:24 Avg:24 Max:26 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:1000Hz      Exec Revision:1.0.3-1     │
├──────────────────────────────────────────────────────────────────────────────┤
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    463732      0     831       2     0     263     263│
│ task1     dth  29 F  1    115933      0    3992      62    19     381     362│
│ task2     dth  28 F  1    115933      0    3993      50    18     417     399│
│ task3     sth  25 F  1    231866      0    1992      28    14     330     316│
│ task4     sth  24 F  1    231866      0    1992      34    14     307     293│
│ task5     sem  23 F  1    463732      0     993      15     3     311     308│
│ task6     sem  22 F  1    231866      0    1992      10     3     243     240│
│ task7     sig  19 F  1    231866      0    1990      15     7     268     261│
│ task8     sig  18 F  1    231866      0    1990      17     7     271     264│
│ task9     hrt  17 R  1    224848      0    1989      52    13     222     209│
│ task10    hrt  17 R  1    224581      0    1991      34    15     208     193│
│ task11    hrn  14 R  1    445313      0     992      20    10     217     207│
│ task12    hrn  14 R  1    444804      0     991      23    10     306     296│
│ task13    hru  11 R  1    443863      0     993      39    11     329     318│
│ task14    hru  11 R  1    443400      0     980      23    10     252     242│
│ task15    bth   7 R  1    463732      0     992      36    16     357     341│
│ task16    bth   7 R  1    463732      0     991      22    15     309     294│
└──────────────────────────────────────────────────────────────────────────────┘

Notice how scheduling latencies under the FIFO scheduler (the rows marked F in the P column) have been sacrificed in order to improve latencies under the real-time scheduler (the rows marked R) and make them deterministic. Without realtime-preempt the ratio of deterministic latency (the Determ column) to worst-case latency in the real-time scheduler is about 1:3, whereas with realtime-preempt it’s about 1:1. Even though FIFO latencies are worse, deterministic scheduling latencies overall are what make this patch so important for a real-time system, since it is now possible to predict behavior precisely. This in turn means that you can work out whether Linux is an appropriate tool for your real-time application without having to use unnecessarily overpowered hardware in order to guarantee deadlines.
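
Reading the tables, the Determ column appears to be simply Worst minus Best for every row (for example, 2151 - 1566 = 585 for task9 on the stock kernel). Assuming that reading is right, the ratios quoted above can be checked with a few lines of C++. The struct and function names are invented for this sketch.

```cpp
// Best and worst interrupt latency of one table row, in microseconds.
struct LatencyRow {
    int best;
    int worst;
};

// The spread between the slowest and fastest observed response,
// which matches the Determ column in the tables above.
int determinism(const LatencyRow& row) {
    return row.worst - row.best;
}

// How many times larger the worst case is than the spread.
double worst_to_determinism(const LatencyRow& row) {
    return static_cast<double>(row.worst) / determinism(row);
}
```

For task9 this gives roughly 3.7 on the stock kernel (2151 / 585) and roughly 1.06 with realtime-preempt at 250Hz (261 / 247).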
Things are starting to look good for the future. The timing couldn’t be better, considering that several mobile phone manufacturers are evaluating Linux as a next-gen platform for their devices. Another field where Linux is seeing growth is multimedia: everything from portable media players to set-top boxes seems to be Linux-powered these days, and deterministic latencies are really important in these applications.

You can get Ingo Molnár’s realtime-preempt patch from his website at Red Hat, http://people.redhat.com/mingo/realtime-preempt/. He updates it quite often so check in regularly.

Paul E. McKenney has written an excellent summary of most of the different approaches to real-time adaptations of the Linux kernel. You can find it at Kerneltrap under the title Linux: Realtime Approaches.

06 Jul 05 ++genkernel

genkernel is a nice little Gentoo tool designed to make your everyday life with Gentoo as pleasurable as possible, especially if compiling the kernel is something that sends shivers down your spine. It works by taking the kernel source found in /usr/src/linux, combining it with a specified kernel configuration file and compiling a fully working kernel, without the user having to go through the tedious process of configuring Linux manually. Kernel configurations known to work are supplied with genkernel, but you can also specify your own configuration file if you wish to build a custom kernel.
So why am I blogging about genkernel when I’m not even a genkernel developer? For the simple reason that the imminent release of version 3.2.0 adds a long-awaited feature: support for the Pegasos PowerPC platform!
If you are an eager beaver and want to try it out right away, I suggest you edit /etc/portage/package.unmask (create it if it doesn’t already exist) and add ">sys-kernel/genkernel-3.1.9" to unmask the 3.2.0 prereleases. Then run emerge --ask --verbose '>=sys-kernel/genkernel-3.2.0_pre18' in order to install the latest Pegasos-compatible genkernel. To build a kernel simply execute:

genkernel --genzimage --kernel-config=/usr/share/genkernel/ppc/Pegasos all

Whatever kernel /usr/src/linux points to will be compiled and installed into /boot as kernelz-<kernel version>; for instance, gentoo-sources-2.6.12-r4 will be called kernelz-2.6.12-gentoo-r4. In order to boot this kernel from SmartFirmware on the Pegasos, assuming /boot is on /dev/hda1 and your Gentoo root is on /dev/hda2, issue the following command:

boot hd:0 kernelz-2.6.12-gentoo-r4 root=/dev/ram0 init=/linuxrc real_root=/dev/hda2

If you have a Radeon you might want to append something like "video=radeonfb:800x600-16" as well. Please note that genkernel will also install a yaboot-compatible kernel called kernel-genkernel-<kernel version> and an initramfs file called initramfs-genkernel-ppc-<kernel version>. You can safely remove these on the Pegasos since it does not support yaboot at this time.
I have tried genkernel with pegasos-sources-2.6.11-r5, gentoo-sources-2.6.12 and gentoo-sources-2.6.12-r4 and it produced working kernels for all of them. If however you run into any problems please report them to me.

A special thank you goes out to Tim Yamin (aka plasmaroo) for taking my genkernel-3.1.x patch and updating it to support 3.2.0 without me even asking him.

05 Jul 05 Power tools for power programmers

Considering what I do for a living it’s high time that I blog about something appropriate for my “programming” category. Today I came upon a suitable subject for just that purpose: a set of power tools no C++ programmer should live without.

Today I read this article, linked from OSnews, which explains how to use the serialization feature of the Boost C++ libraries. For those of you who didn’t already know, serialization is the process of converting data into a sequence of bytes for storage on a hard drive or transmission over a network. In the world of distributed systems the process is frequently referred to as “marshalling”. This is one of the things CORBA will do for you in order to make your life easier, if you ever choose to use it.
I have not yet had a chance to use Boost’s serialization in one of my own projects. In fact, I didn’t know about it until I read the article mentioned above, so I recently wrote my own serialization routines, which were far from as clean as these. I have however used Spirit, which is an “object-oriented recursive-descent parser generator framework”. Spirit integrates beautifully with C++ and lets you write grammars in a notation closely resembling Extended Backus-Naur Form using nothing but standard ANSI C++ code. You have to see it to believe it.
So what are the advantages of using Boost’s serialization over writing your own routines? There are several. For instance, it has seamless support for serializing STL containers. It is also very easy to use, especially if you are a C++ novice. But the biggest advantage of using Boost, at least in my opinion, is that it really increases the readability of your code. The Boost classes that I have used take extremely good advantage of the powers of C++ in order to blend into your code in ways you didn’t think possible.
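
For readers who have not met serialization before, here is roughly what a hand-written routine looks like, as a point of comparison. This is deliberately not Boost's API. It is a bare-bones sketch using only the standard library, with made-up function names, that flattens a std::vector of 32-bit integers into bytes and back. It glosses over everything Boost handles for you, such as versioning, pointers and arbitrary types.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Appends a 32-bit value in little-endian byte order, so the result
// does not depend on the byte order of the machine that wrote it.
void put_u32(std::vector<std::uint8_t>& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Reads a 32-bit little-endian value starting at `pos` and advances `pos`.
std::uint32_t get_u32(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    if (in.size() < 4 || pos > in.size() - 4) {
        throw std::runtime_error("truncated input");
    }
    std::uint32_t value = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        value |= static_cast<std::uint32_t>(in[pos++]) << shift;
    }
    return value;
}

// Layout: element count, followed by each element.
std::vector<std::uint8_t> serialize(const std::vector<std::uint32_t>& values) {
    std::vector<std::uint8_t> out;
    put_u32(out, static_cast<std::uint32_t>(values.size()));
    for (std::uint32_t v : values) {
        put_u32(out, v);
    }
    return out;
}

// Inverse of serialize(); throws if the input ends too early.
std::vector<std::uint32_t> deserialize(const std::vector<std::uint8_t>& bytes) {
    std::size_t pos = 0;
    const std::uint32_t count = get_u32(bytes, pos);
    std::vector<std::uint32_t> values;
    for (std::uint32_t i = 0; i < count; ++i) {
        values.push_back(get_u32(bytes, pos));
    }
    return values;
}
```

Even for a single container type this needs a length prefix, a fixed byte order and bounds checking, and every new type means more of the same, which is exactly the boilerplate a library takes off your hands.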

My advice is that if you haven’t tried Boost already, give it a go. It contains powerful tools which will make your life much easier, it is peer-reviewed and comes with plenty of unit tests to maintain high quality, and it integrates with your C++ code in a way you didn’t think possible (unless you are really skilled with templates, in which case I salute you).