dholm.com
msgbartop
I need this baby in a month send me nine women!
msgbarbottom

29 Jun 06 A look at the latest in free real-time scheduling in the Linux kernel

Background

Real-time capabilities of the Linux kernel is not in itself a new concept. Projects such as RTAI and FSMLabs’ RTLinux have existed for quite some time and they perform very well. The problem with previous methods have been that the Linux kernel has not been able to offer the deterministic latencies that most real-time systems need so most implementations have been based on a concept of abstraction. By letting interrupts be trapped by a nano kernel, instead of Linux itself, it has been possible to add the notion of deterministic real-time threads without having to perform major surgery on Linux itself. In the world of RTAI ADEOS is used for this very purpose. The problem with these approaches, at least in the free real-time implementations, was that real-time threads had to be implemented in kernel space, making it difficult to work with since it wasn’t always possible to use existing code written for user-space deployment.

Things started changing when MontaVista released the preempt patch for the 2.4-series kernel. The preemption patches for 2.4 allowed kernel threads to voluntarily yield execution or force preemption on sleep or interrupts. This was of course not the perfect implementation because true preemption should be allowed at any time except when executing sensitive regions of code which must be allowed to complete in order not to leave the system in an invalid or partially invalid state, however the 2.4 kernel was not quite ready for this yet.
Two other important changes were Ingó Molnár’s O(1)-scheduler and low-latency patches. The O(1)-scheduler is a very fast scheduler based on table lookups. It is important to have a fast scheduling algorithm when using preemption, especially when running at a high frequency like 1000Hz, since the scheduler is going to be called a lot and must not present a bottleneck. The low-latency patch modifies large, protected, loops and similar bottlenecks in the kernel so that the scheduler can intervene at a safe place if the loop runs for a long time, a sort of voluntary yield of execution if necessary.

All these concepts were merged in the 2.5 development series of Linux and the concept of preemption was further developed in order to implement true preemption instead of a semi-voluntary one. The next big hurdle to overcome was lock breaking. Critical sections in the kernel are protected by locks so that in a multiprocessor system two, or more, processors cannot access, and possibly corrupt, the same resource at once. Similar rules apply to preemption, it is not always appropriate to preempt the current kernel thread and therefore locks are also used to prevent preemption. The problem with this scheme is that if a lock is held for too long it will delay the scheduler from being executed and consequently increasing latency. Therefore the lock-breaking done by the low-latency patch was very valuable since the work done there had already resulted in tracking down many of the bottlenecks caused by holding locks for too long and finding points where these locks could safely be preempted.
All these changes meant that Linux was suddenly capable of worst-case response times of under 10ms for real-time threads, which is good, but not good enough.

Ingó Molnár to the rescue

When the rest of us had given up on deterministic real-time scheduling in user space Ingó Molnár returned to save the day yet another time. Ingó’s patch uses mutexes instead of spinlocks to protect critical sections. Because of that critical sections can now be preempted much like the same way a user-space thread can be preempted at any time. He has also introduced a priority inheritance scheduling algorithm to avoid potential priority inversion problems that could occur when a critical section is preempted. Another really nice feature is that IRQs are presented as schedulable entities in the system and you can modify their priority by using chrt from Robert Love’s schedutils.

Results

These measurements were made on two similar 2.4GHz Intel Pentium 4 systems using Mark Hounschell’s rt-exec (1.0.3) test for finding the “deterministic real-time capabilities of a computer“. The realtime-preempt tests were run at both 250 and 1000Hz.

2.6.16 stock kernel with Gentoo patch set at 250Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:10:36:237  NonHR-clk   14:40:22  Work:200   CPU:04 Avg:04 Max:18 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:250Hz      Exec Revision:1.0.3-1      │
┌──────────────────────────────────────────────────────────────────────────────┐
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    159237      0    3767       3     3      15      12│
│ task1     dth  29 F  1     39810      0    3989      15    14      82      68│
│ task2     dth  28 F  1     39809      0    3988      15    14      82      68│
│ task3     sth  25 F  1     79619      0    1987      13    13      82      69│
│ task4     sth  24 F  1     79618      0    1987      13    13      78      65│
│ task5     sem  23 F  1    159237      0     988       6     6      61      55│
│ task6     sem  22 F  1     79619      0    1988       5     5      34      29│
│ task7     sig  19 F  1     79619      0    1988       6     5      35      30│
│ task8     sig  18 F  1     79618      0    1987       6     6      38      32│
│ task9     hrt  17 R  1    159308      0    1988    1977  1566    2151     585│
│ task10    hrt  17 R  1    159301      0    1987    1976  1586    2150     564│
│ task11    hrn  14 R  1    159295      0     988    2981  2450    3167     717│
│ task12    hrn  14 R  1    159287      0     988    2981  2560    3157     597│
│ task13    hru  11 R  1    159280      0     988    2981  2338    3186     848│
│ task14    hru  11 R  1    159272      0     987    2981  2227    3187     960│
│ task15    bth   7 R  1    159237      0     988      12    11      75      64│
│ task16    bth   7 R  1    159237      0     988      36    35     203     168│
└──────────────────────────────────────────────────────────────────────────────┘

2.6.17 kernel with Gentoo patch set and realtime-preempt at 250Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:12:56:243  Posix-hrt   15:34:51  Work:200   CPU:05 Avg:06 Max:07 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:250Hz      Exec Revision:1.0.3-1      │
┌──────────────────────────────────────────────────────────────────────────────┐
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    194243      0    3837       1     0     174     174│
│ task1     dth  29 F  1     48561      0   15993      40    18     450     432│
│ task2     dth  28 F  1     48561      0   15991      35    17     522     505│
│ task3     sth  25 F  1     97122      0    7993      24    14     300     286│
│ task4     sth  24 F  1     97121      0    7992      19    14     311     297│
│ task5     sem  23 F  1    194243      0    3994       9     3     294     291│
│ task6     sem  22 F  1     97122      0    7993       7     3     193     190│
│ task7     sig  19 F  1     97122      0    7992      14     7     227     220│
│ task8     sig  18 F  1     97121      0    7993      14     7     227     220│
│ task9     hrt  17 R  1     96035      0    7982      29    14     261     247│
│ task10    hrt  17 R  1     96032      0    7992      43    16     330     314│
│ task11    hrn  14 R  1    191591      0    3994      31     9     340     331│
│ task12    hrn  14 R  1    191582      0    3984      21    10     350     340│
│ task13    hru  11 R  1    191570      0    3993      29     9     308     299│
│ task14    hru  11 R  1    191562      0    3983      20     9     343     334│
│ task15    bth   7 R  1    194243      0    3992      31    15     386     371│
│ task16    bth   7 R  1    194243      0    3991      18    16     355     339│
└──────────────────────────────────────────────────────────────────────────────┘

2.6.17 kernel with Gentoo patch set and realtime-preempt at 1000Hz

┌──────────────────────────────────────────────────────────────────────────────┐
│ Run 00:07:43:731  Posix-hrt   14:37:05  Work:200   CPU:24 Avg:24 Max:26 Pg:0 │
│ DataPool:SHM      Exec Heart Beat Rate:1000Hz      Exec Revision:1.0.3-1     │
┌──────────────────────────────────────────────────────────────────────────────┐
│          Task Sched Cpu     Intr   Late         -Interrupt Latencies (usec)- │
│ Taskname Type  Pr P Mask     Cnt    Cnt   Spare Current  Best   Worst  Determ│
│ exec      hrt   5 F  1    463732      0     831       2     0     263     263│
│ task1     dth  29 F  1    115933      0    3992      62    19     381     362│
│ task2     dth  28 F  1    115933      0    3993      50    18     417     399│
│ task3     sth  25 F  1    231866      0    1992      28    14     330     316│
│ task4     sth  24 F  1    231866      0    1992      34    14     307     293│
│ task5     sem  23 F  1    463732      0     993      15     3     311     308│
│ task6     sem  22 F  1    231866      0    1992      10     3     243     240│
│ task7     sig  19 F  1    231866      0    1990      15     7     268     261│
│ task8     sig  18 F  1    231866      0    1990      17     7     271     264│
│ task9     hrt  17 R  1    224848      0    1989      52    13     222     209│
│ task10    hrt  17 R  1    224581      0    1991      34    15     208     193│
│ task11    hrn  14 R  1    445313      0     992      20    10     217     207│
│ task12    hrn  14 R  1    444804      0     991      23    10     306     296│
│ task13    hru  11 R  1    443863      0     993      39    11     329     318│
│ task14    hru  11 R  1    443400      0     980      23    10     252     242│
│ task15    bth   7 R  1    463732      0     992      36    16     357     341│
│ task16    bth   7 R  1    463732      0     991      22    15     309     294│
└──────────────────────────────────────────────────────────────────────────────┘

Notice how scheduling latencies in the FIFO scheduler have been sacrificed in order to improve latencies in the real-time scheduler and make them deterministic. Without realtime-preempt the difference between worst case latencies and deterministic latencies in the real-time scheduler is about 1:3 whereas with realtime-preempt it’s 1:1. Even though FIFO latencies are worse deterministic scheduling latencies overall are what makes this patch so important for a real-time system since it is now possible to predict precise behavior. This in turn means that you can deduce whether Linux is an appropriate tool for your real-time application without having to use unnecessarily overpowered hardware in order to guarantee deadlines.
Things are starting to look good in the future. The timing couldn’t be better considering several mobile phone manufacturers are evaluating Linux as a next-gen platform for their devices. Another field where Linux is seeing growth is in multimedia applications, anything ranging from portable media players to set top boxes seem to be Linux-powered these days and deterministic latencies are really important in these applications.

You can get Ingó Molnár’s realtime-preempt patch from his website at RedHat, http://people.redhat.com/mingo/realtime-preempt/. He updates it quite often so check in regularly.

Paul E. McKenney has written an excellent summary on most of the different approaches to real-time adaptations of the Linux kernel. You can find it at Kerneltrap under the title Linux: Realtime Approaches.

Similar Posts:



Leave a Comment