Arnd Bergmann announced that he is working on removing the Big Kernel Lock (BKL) from the Linux kernel: "I've spent some time continuing the work of the people on Cc and many others to remove the big kernel lock from Linux and I now have [a] bkl-removal branch in my git tree". He went on to explain that the branch works and lets him run the kernel "on [a] quad-core machine with the only users of the BKL being mostly obscure device driver modules." Arnd pointed out that this effort has a long history: "the oldest patch in this series is roughly eight years old and is Willy's patch to remove the BKL from fs/locks.c, and I took a series of patches from Jan that removes it from most of the VFS."
Arnd explained that his patch adds a global mutex to the TTY layer, which he called the 'Big TTY Mutex' (BTM): "the basic idea here is to make recursive locking and the release-on-sleep explicit, so every mutex_lock, wait_event, workqueue_flush and schedule in the TTY layer now explicitly releases the BTM before blocking." Alan Cox suggested that this portion of the patch was best dropped for now: "it would be nice to get the other bits in first removing BKL from most of the kernel and building kernels which are non BKL except for the tty layer. That (after Ingo's box from hell has run it a bit) would reasonably test the assertion that the tty layer has no BKL requirements that are driven by [code] external to tty layer code." Andrew Morton suggested that the patches be sent to the relevant maintainers for an additional sanity check: "Seems that there might be a few tricksy bits in here. Please do push at least the non-obvious parts out to the relevant people."
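To illustrate what "explicitly releases the BTM before blocking" means in practice, here is a minimal userspace sketch in Python. It is not kernel code and is not taken from Arnd's patch: the names big_tty_mutex, wait_for_data_explicit, reader, writer and demo are all hypothetical, and an ordinary threading.Lock stands in for the Big TTY Mutex.

```python
import threading

# Hypothetical stand-in for the Big TTY Mutex; not a kernel API.
big_tty_mutex = threading.Lock()


def wait_for_data_explicit(data_ready, reader_blocked, timeout=5.0):
    """Caller must hold big_tty_mutex.

    The lock is dropped by hand around the blocking wait and retaken
    before returning, so the release is visible at the call site.
    """
    big_tty_mutex.release()           # explicit release before blocking
    reader_blocked.set()              # demo only: tell the writer we let go
    try:
        return data_ready.wait(timeout)
    finally:
        big_tty_mutex.acquire()       # retake before returning to the caller


def demo():
    data_ready = threading.Event()
    reader_blocked = threading.Event()
    log = []

    def reader():
        with big_tty_mutex:
            log.append("reader: waiting")
            got = wait_for_data_explicit(data_ready, reader_blocked)
            log.append("reader: woke, got=%s" % got)

    def writer():
        # This can only make progress because the reader dropped the lock.
        with big_tty_mutex:
            log.append("writer: producing")
            data_ready.set()

    t = threading.Thread(target=reader)
    t.start()
    reader_blocked.wait()             # wait until the reader has released the lock
    writer()
    t.join()
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

If the reader kept holding the lock while it waited, the writer would block until the reader's wait timed out, so the reader would never see the data. Making the release explicit at each blocking call is what lets such dependencies be found and reviewed one by one.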
"It's two weeks (and one day), and the merge window is over," began Linus Torvalds, announcing the 2.6.27-rc1 kernel. He continued, "finally. I don't know why, but this one really did feel pretty dang busy. And the size of the -rc1 patch bears that out - at 12MB, it's about 50% bigger than 26-rc1 (but not that much bigger than 24/25-rc1, so it's not like it's anything unheard of)." He reflected, "the pure size of the -rc's _is_ making me a bit nervous, though. Sure, it means that we are good at merging it all, but I have to say that I sometimes wonder if we don't merge too much in one go, and even our current (fairly short) release cycle is actually too big." As for the actual changes, Linus explained:
"Much of -rc1 was in linux-next, but certainly not everything. We'll see how that whole thing ends up evolving - it certainly didn't solve all problems, and there was some bickering about things that weren't there (and some things that mostly were ;), but maybe it helped. There's a ton of new stuff in there, but at least personally the interesting things are the BKL pushdown and perhaps the introduction of the lockless get_user_pages_fast(). The build system also got updated to allow moving the architecture include files ('include/asm-xyz') into the architecture subdirectories ('arch/xyz/include/asm'), and sparc seems to have taken advantage of that already."
Other changes Linus highlighted included the merging of UBIFS (the UBI filesystem), as well as "tracing, firmware loading, continued x86 arch merging, and moving more code to generic support (unified generic IPI handling, coherent dma memory allocation, show_mem etc). Bootmem rewrites. [And] some support for further scalability (ie 4k cpu cores)."
"As some of the latency junkies on lkml already know, commit 8e3e076 in v2.6.26-rc2 removed the preemptible BKL feature and made the Big Kernel Lock a spinlock and thus turned it into non-preemptible code again. This commit returned the BKL code to the 2.6.7 state of affairs in essence," began Ingo Molnar. He noted that this had a very negative effect on the real time kernel efforts, adding that Linux creator Linus Torvalds indicated the only acceptable way forward was to completely remove the BKL. Ingo explained:
"This task is not easy at all. 12 years after Linux has been converted to an SMP OS we still have 1300+ legacy BKL using sites. There are 400+ lock_kernel() critical sections and 800+ ioctls. They are spread out across rather difficult areas of often legacy code that few people understand and few people dare to touch. It takes top people like Alan Cox to map the semantics and to remove BKL code, and even for Alan (who is doing this for the TTY code) it is a long and difficult task."
Ingo went on to describe how the BKL works, how it differs from other locking mechanisms, and why this complicates removing it permanently from the kernel. He noted that the various dependencies of the lock are lost in the haze of 15 years of code changes: "all this has built up to a kind of Fear, Uncertainty and Doubt about the BKL: nobody really knows it, nobody really dares to touch it and code can break silently and subtly if BKL locking is wrong." He then suggested "changing the rules of the game", creating a "kill-the-BKL" branch which "turns the BKL into an ordinary albeit somewhat big mutex, with a quirky lock/unlock interface called 'lock_kernel()' and 'unlock_kernel()'."
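The two properties that make the BKL unlike an ordinary lock are that the holder may take it again recursively, and that it is dropped automatically when the holder sleeps and retaken when it runs again. The following toy model in Python sketches those semantics on top of a plain mutex. It is an illustration only, not Ingo's implementation: the BigLock class, its schedule() helper and the depth and held_by_me() accessors are hypothetical, and only the names lock_kernel() and unlock_kernel() come from the text above.

```python
import threading


class BigLock:
    """Toy userspace model of BKL-style semantics over an ordinary mutex."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owner = None    # thread id of the current holder, if any
        self._depth = 0       # nesting count for the current holder

    def held_by_me(self):
        return self._owner == threading.get_ident()

    @property
    def depth(self):
        return self._depth if self.held_by_me() else 0

    def lock_kernel(self):
        if self.held_by_me():         # recursive acquisition: just count it
            self._depth += 1
            return
        self._mutex.acquire()
        self._owner = threading.get_ident()
        self._depth = 1

    def unlock_kernel(self):
        if not self.held_by_me():
            raise RuntimeError("unlock_kernel() without holding the lock")
        self._depth -= 1
        if self._depth == 0:          # outermost unlock really releases
            self._owner = None
            self._mutex.release()

    def schedule(self, blocking_call):
        """Run blocking_call with the lock dropped, then restore the nesting."""
        saved = self.depth
        if saved:
            self._depth = 0
            self._owner = None
            self._mutex.release()     # implicit release-on-sleep
        try:
            return blocking_call()
        finally:
            if saved:
                self._mutex.acquire()
                self._owner = threading.get_ident()
                self._depth = saved   # same nesting level as before sleeping


if __name__ == "__main__":
    bkl = BigLock()
    bkl.lock_kernel()
    bkl.lock_kernel()                 # nested call does not deadlock
    free_while_asleep = bkl.schedule(lambda: not bkl._mutex.locked())
    print(bkl.depth, free_while_asleep)
    bkl.unlock_kernel()
    bkl.unlock_kernel()
```

Because the release inside schedule() is invisible to the caller, code that holds such a lock can lose its protection at any blocking call without anything in the source saying so, which is one way locking mistakes of this kind can stay silent.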
"About 45% architecture updates (counting the include files too), about 30% drivers, and about 25% odds-and-ends. The odds-and-ends are mainly Documentation, filesystems (mostly cifs) and core kernel (scheduler updates etc)," said Linux creator Linus Torvalds, announcing the 2.6.26-rc2 kernel. He added, "if you read the shortlog and get the feeling that most of it is pretty boring small details, you'd be right. There is little exciting there." He continued:
"A fairly small part of it, but quite possibly the most noticeable one, is how the semaphore changes impacted the BKL (the old 'big kernel lock' that is still used for some legacy code, for you non-core people out there), which in the past had different versions ('regular', 'preemptable'). A few months ago we dropped the regular BKL version, but in 2.6.25-rc1 we then had performance (and then correctness) issues with the interaction between the semaphore implementation and the preemptable BKL, so we're back to the old regular version for now."