kdb

x86 Architecture Merges in 2.6.25

Submitted by Jeremy on February 1, 2008 - 5:36pm.
Linux news

Ingo Molnar summarized his pull request for changes to the x86 architecture bound for mainline inclusion in 2.6.25 noting, "it's not a small merge, it consists of 908 commits from 96 individual arch/x86 developers (!)". He continued, "a number of core files are changed as well: most notably percpu, debugging details, timers, the firewire remote debugging patch and ... the KGDB remote debugging stub in kernel/kgdb.c." He went on to detail the extent of the testing this tree has received, "in the past few weeks tens of thousands of random x86.git bzImages were successfully built and booted on a number of (commodity) 32-bit and 64-bit testsystems - and there has been a fair amount of test exposure on -mm as well." Regarding the remote kernel debugger, Ingo explained:

"We tested KGDB to be merge-worthy within the x86 architecture (the only supported architecture for now) and it's better to have kernel/kgdb.c than arch/x86/kernel/kgdb.c. The code is reasonably clean and the user-space exposure is small - the only real exposure is the decades-old remote GDB protocol. We are happy to fix up any further cleanliness comments that people might have - but we really wanted to start somewhere and get this thing moving. As an added bonus: finally a kernel debugger that can be read without puking too much ;-) [anyone remember KDB?]"

RAS Infrastructure

Submitted by Jeremy on September 18, 2007 - 9:12pm.
Linux news

"There is a tension here between generality of support infrastructure, maintainability of the infrastructure, simplicity of the infrastructure and reliability of the infrastructure," began Eric Biederman, discussing the need for a common RAS infrastructure for dealing with kernel crashes and what would be involved in getting such tools merged into the mainline kernel. He continued, "the historical linux perspective is that anything that compromises the maintainability or the reliability of the kernel without the tools is unacceptable. There is also a historical perspective that using the single stepping mode of a debugger to diagnose problems frequently leads to symptoms being fixed and not the actual problems being fixed."

Eric compared the kexec on panic code and the kdb code, "on the kexec on panic path the philosophy is that the kernel is broken and as little as possible should be relied upon." He contrasted this to kdb, "from what I can tell the philosophy of the kdb code is that the kernel is mostly ok except for one or two little bugs so it is reasonable to rely on lots of kernel infrastructure." He then suggested that it was because of this difference and reduced maintenance overhead that kexec on panic was merged into the mainline kernel, "I will note that in some sense it is a harder approach to implement as it emphasizes the challenge of drivers that work starting from a random hardware state, and because it draws a clear line between the broken kernel and the recover kernel. But those things are exactly what encourage things to work well." As for what is the next step forward in RAS development, Eric noted, "if someone who is suggesting an implementation can absorb and understand the requirements of the different groups and come up with solutions that meet the requirements of the different projects I think progress can be made. That as far as I know takes talent."

Linux: Reliability, Availability, and Serviceability

Submitted by Jeremy on August 3, 2007 - 2:49pm.
Linux news

A recent patch posted to the LKML aimed to make it possible to use both kdb and kdump at the same time, but instead led to an interesting discussion about RAS (Reliability, Availability, and Serviceability) tools. Vivek Goyal compared the two main philosophies, "so basically there are two kind of users. One who believes that despite the kernel [having] crashed something meaningful can be done," versus, "exec on panic, which thinks that once [the] kernel is crashed nothing meaningful can be done". When the discussion focused on kdb, Keith Owens noted:

"The problem above applies to all the RAS tools, not just kdb. My stance is that _all_ the RAS tools (kdb, kgdb, nlkd, netdump, lkcd, crash, kdump etc.) should be using a common interface that safely puts the entire system in a stopped state and saves the state of each cpu. Then each tool can do what it likes, instead of every RAS tool doing its own thing and they all conflict with each other, which is why this thread started."

Andrew Morton summarized the current state of affairs, "lots of different groups, little commonality in their desired functionality, little interest in sharing infrastructure or concepts." In response to an earlier patch Keith posted to a lesser-trafficked mailing list, Andrew suggested it be resubmitted in a working form for a full review, "much of the onus is upon the various RAS tool developers to demonstrate why it is unsuitable for their use and, hopefully, to explain how it can be fixed for them."

Linux: kdb vs. kgdb

Submitted by Jeremy on March 30, 2002 - 2:47pm.
Linux news

Jeremy Jackson asked "which kernel debugger is 'best'?" on the Linux Kernel Mailing List.
