"After posting some benchmarks involving cfs, I got some feedback, so I decided to do a follow-up that'll hopefully fill in the gaps many people wanted to see filled," Rob Hussey began. He added, "this time around I've done the benchmarks against 2.6.21, 2.6.22-ck1, and 2.6.23-rc6-cfs-devel (latest git as of 12 hours ago)." Rob briefly summarized, "the only analysis I'll offer is that both sd and cfs are improvements, and I'm glad that there is a lot of work being done in this area of linux development. Much respect to Con Kolivas, Ingo Molnar, and Roman Zippel, as well all the others who have contributed."
Referring to a chart in which the blue line represented the CFS process scheduler, and the green line represented the SD "staircase" process scheduler, Ingo Molnar noted, "heh - am i the only one impressed by the consistency of the blue line in this graph? :-) [ and the green line looks a bit like a .. staircase? ]" He acknowledged some slowdown in CFS compared to SD in one of the benchmarks, "-ck1 is 0.8% faster in this particular test." Ingo then explained, "many things happened between 2.6.22-ck1 and 2.6.23-cfs-devel that could affect performance of this test. My initial guess would be sched_clock() overhead." In further testing he applied a low-res-sched-clock that resulted in better performance for CFS leading him to conclude, "the performance difference between -ck and -cfs-devel seems to be mostly down to the more precise (but slower) sched_clock() introduced in v2.6.23 and to the startup penalty of freshly created tasks." When asked if the low-res-sched-clock was likely to be merged, Ingo replied:
"I don't think so - we want precise/accurate scheduling before performance. (otherwise tasks working off the timer tick could steal away cycles without being accounted for them fairly, and could starve out all other tasks.) Unless the difference was really huge in real life - but it isn't."
"In the patch you really remove _a_lot_ of stuff," commented Roman Zippel in his reaction to Ingo Molnar's latest updates to the Completely Fair Scheduler. Roman has been consistently critical of Ingo's efforts, asking questions and criticizing Ingo's feedback. He continued, "you also removed a lot of things I tried to get you to explain them to me. On the one hand I could be happy that these things are gone, as they were the major road block to splitting up my own patch. On the other hand it still leaves me somewhat unsatisfied, as I still don't know what that stuff was good for."
Ingo replied to Roman's technical concerns, and pointed out that he'd been traveling for the recent kernel summit, adding, "I bent backwards trying to somehow get you to cooperate with us (and I still haven't given up on that!) - instead of you disparaging CFS and me frequently :-(". Willy Tarreau took a more critical stance, calling into question Roman's motives. He noted that he had been impressed by Roman's original review of the scheduler, but disappointed as the discussion seemed to degenerate, "it's the way you're trying to prove Ingo is a bastard and that you're a victim. But if we just re-read a few pick-ups of your mails since Aug 1st, its getting pretty obvious that you completely made up this situation." Kyle Moffett added, "I get the impression that Ingo re-implemented some ideas that you had because you refused to do so in a way that was acceptable for the upstream kernel. How exactly is this a bad thing?"
Having recently returned from the Linux kernel summit, Ingo Molnar and Peter Zijlstra sent out some performance updates to the Completely Fair Scheduler:
"Our main focus has been on simplifications and performance - and as part of that we've also picked up some ideas from Roman Zippel's 'Really Fair Scheduler' patch as well and integrated them into CFS. We'd like to ask people go give these patches a good workout, especially with an eye on any interactivity regressions."
He noted that some of the changes included removing features that had proved unecessary. "while keeping the things that worked out fine, like sleeper fairness." Ingo posted some results from the lmbench benchmark noting around a 16% speedup on both the 32-bit and 64-bit x86 architectures. He added, "we are now a bit faster than the O(1) scheduler was under v2.6.22 - even on 32-bit. The main speedup comes from the avoidance of divisions (or shifts) in the wakeup and context-switch fastpaths."
In an effort to fully understand the math proposed by Roman Zippel in his Really Fair Scheduler, Ingo Molnar implemented a simplified version of the logic on top of his Completely Fair Scheduler code which he then humorously labeled the Really Simple Really Fair Scheduler, "could you please confirm whether the math algorithm you are suggesting is implemented by this patch roughly correctly?" Ingo explained:
"As an addendum to my review, please find below a prototype patch I've just written that implements RSRFS (Really Simple Really Fair Scheduler) on top of CFS. It is intended to demonstrate the essence of the math you have presented via your patch. (it has no nice levels support yet, to make the math really apparent to everyone interested)"
Roman pointed out that things were oversimplified, "it simplifies the math too much, the nice level weighting is an essential part of the math and without it one can't really understand the problem I'm trying to solve." Ingo acknowledged this and explained, "in the first level review I'm mainly interested in your patch's behavior [with regards to] simple nice-0 tasks, how they run and how they sleep and wake up. Nothing else. Why? That's what 99% of the Linux tasks do after all." In another thread he explained that he would continue to review Roman's proposal, however also noting that he would be offline this week for the kernel summit and thus temporarily unresponsive to email.
Ingo Molnar reviewed Roman Zippel's Really Fair Scheduler code, suggesting that much of the work was similar to that which was being done by Peter Zijlstra, "all in one, we don't disagree, this is an incremental improvement we are thinking about for 2.6.24. We do disagree with this being positioned as something fundamentally different though - it's just the same thing mathematically, expressed without a "/weight" divisor, resulting in no change in scheduling behavior. (except for a small shift of CPU utilization for a synthetic corner-case)"
Roman was not impressed with Ingo's review, asking, "did you even try to understand what I wrote?" He continued, "while Peter's patches are interesting, they are only a small step to what I'm trying to achieve." Regarding performance and code-quality concerns with his patch he added, "I explicitly said that my patch is only a prototype, so I haven't done any cleanups and tuning in this direction yet, so making any conclusions based on code size comparisons is quite ridiculous at this point. The whole point of this patch was to demonstrate the algorithmic changes, not to demonstrate a final and perfectly tuned scheduler."
During the many threads discussing Ingo Molnar's recently merged Completely Fair Scheduler, Roman Zippel has repeatedly questioned the complexity of the new process scheduler. In a recent posting to the Linux Kernel mailing list he offered a simpler scheduler named the 'Really Fair Scheduler' saying, "as I already tried to explain previously CFS has a considerable algorithmic and computational complexity. This patch should now make it clearer, why I could so easily skip over Ingo's long explanation of all the tricks CFS uses to keep the computational overhead low - I simply don't need them." He offered a mathematical overview of how his new scheduler works, included some benchmarks, and reflected back to earlier discussions on the lkml asking, "Ingo, from this point now I need your help, you have to explain to me, what is missing now relative to CFS. I tried to ask questions, but that wasn't very successful..." He also noted:
"The basic idea of this scheduler is somewhat different than CFS. Where the old scheduler maintains fixed time slices, CFS still maintains a dynamic per task time slice. This model does away with it completely, instead it puts the task on a virtual (normalized) time line, where only the relative distance between any two task is relevant."