"If you're wondering why I'm taking a long time to respond to your patches,", began Theodore Ts'o on the linux-ext4 mailing list, in a thread that offered much insight into how and why to properly submit and test patches. "Patches that are accepted into mainline should do one and only one thing," Ted continued, "so if someone suggests that you make changes to your submitted patch, ideally what you should do is to resubmit the patch with the fixes --- and not submit a patch which is a delta to the previous one." He also noted that patch submitters often greatly outnumber maintainers dictating a higher standard of quality, "consider that for some maintainers, there may be 10 or 20 or 30 or more patch submitters in their subsystem. With that kind of submitter-to-maintainer ratio, the patch submitter simply has to do much more of the work, since otherwise the subsystem maintainer simply can't keep up."
Ted went on to acknowledge, "I happen to believe that we need to encourage newcomers to the kernel developer community, and so I spend more time mentoring people who are new to the process." He noted, however, that his time was finite, and that patches are accepted more quickly when they are easy to review and integrate. Regarding the filesystem for which the patches had been submitted, he added, "Ext4 is actually quite stable at this point. Very large numbers of people are using it, and most users are quite happy." For this reason, he pointed out that it is even more critical that the patches merged be of high quality. That said, he continued, "there is no such thing as code which is not buggy. For any non-trivial program, it's almost certain there are bugs. [...] Ext4 is not exempt from these fundamental laws of software engineering. 'Code is always buggy until the last user of the program dies'." He tied this back to the importance of testing patches before submitting them, "keep in mind that the maxim that there is no such thing as code which is not buggy also applies to your patches."
"I do think 'next' as it is has a few issues that either need to be fixed (unlikely - it's not the point of next) or just need to be aired as issues and understood," noted Linus Torvalds about the linux-next development tree, originally designed as a way to get subsystem maintainers more involved in managing merge conflicts. Linus continued, "I don't think anybody wants it to go away. The question in my mind is more along the way of how/whether it should be changed. There was some bickering about patches that weren't there, and some about how _partial_ series were there but then the finishing touches broke things."
He listed his two primary concerns as, "I don't think it does 'quality control', and I think that's pretty fundamental," and, "I don't think the 'next' thing works as well for the occasional developer that just has a few patches pending as it works for subsystem maintainers that are used to it." Linus continued, "I don't think either of the above issues is a 'problem' - I just think they should be acknowledged. I think 'next' is a good way for the big subsystem developers to be able to see problems early, but I really hope that nobody will _ever_ see next as a 'that's the way into Linus' tree', because for the above two reasons I do not think it can really work that way." Andrew Morton noted, "a lot of the bugs which hit your tree would have been quickly found in linux-next too," then added, "but it's all shuffling deckchairs, really. Are we actually merging better code as a result of all of this? Are we being more careful and reviewing better and testing better? Don't think so."
For many years, each Linux kernel release was assigned a series of three numbers, X.Y.Z, with an even Y indicating a "stable" release, and an odd Y indicating an "unstable" development release. Z was incremented for each individual kernel release. The "stable" 1.0.0 Linux kernel was released in March of 1994. New development was then continued in the "unstable" 1.1.z branch, until the "stable" 1.2.0 Linux kernel was released in March of 1995. Major improvements in the kernel led to X being incremented to 2, and a "stable" 2.0 kernel was released in June of 1996. Active development then continued in the "unstable" 2.1 tree. This process continued with "stable" 2.2, 2.4 and 2.6 kernel trees, and each stable tree gained an official maintainer while Linux creator Linus Torvalds focused on newer features in the next "unstable" tree. Development in these "unstable" trees could go on for periods of multiple years before a "stable" tree was branched.
This long-standing odd/even development model was officially scrapped in 2004 thanks to the success that Linus and Andrew Morton were having working together, and significant "unstable" development began happening between each 2.6.Z release. In a recent thread it was asked what it would take for an "unstable" 2.7 development tree to be created, to which Linus replied:
"Nothing. I'm not going back to the old model. The new model is so much better that it's not even worth entertaining as a theory to go back. That said, I _am_ considering changing just the numbering. Not to go back to the old model, but because a constantly increasing minor number leads to big numbers. I'm not all that thrilled with '26' as a number: it's hard to remember. [..] I think the time-based releases (ie the '2 weeks of merge window until -rc1, followed by roughly two months of stabilization') has been so successful that I'd prefer to skip the version numbering model too. We don't do releases based on 'features' any more, so why should we do version _numbering_ based on 'features'?"
"Oh great, not yet-another-kernel-tree, just what the world needs..." began Greg KH, continuing, "yes, this is an announcement of a new kernel tree, linux-staging." He explained:
"In a long and meandering thread with some of the other kernel developers a week or so ago, it came up that there is no single place for companies and developers to put their code for testing while it gets cleaned up for submission into the kernel tree. All of the different subsystems have trees, but they generally only want code that is about to go into this release, or the next one. For stuff that is farther off, there is no place to go. So, here's the tree for it."
In a readme created for the new tree, Greg adds, "the linux-staging tree was created to hold drivers and filesystems and other semi-major additions to the Linux kernel that are not ready to be merged at this point in time." He also requested that the new tree be included in linux-next, leading Theodore Ts'o to ask, "does this mean that the nature of linux-next is changing? I thought the whole point of linux-next was only to have what would be pushed to Linus in the near future, so we could check for patch compatibility issues." Greg explained that he was hoping for an exception for his new -staging tree as it only includes whole new drivers and filesystems, not changes to existing features, "there is stuff that users can use to get hardware to work that currently is not supported on kernel.org kernels at all." As an example, he noted, "there are 2 big network drivers in there that support a wide range of devices that some people would like to see working :)"
Tony Luck offered some statistics on developers who contribute to the Linux kernel only once, "I skimmed through looking for drive-by contributors (defined as someone who contributes to just one release and is then never heard from again)." Starting with the 2.6.11 kernel, he suggested the following numbers: "63 [developers contributed patches] in version 2.6.11 [and then were] never seen again, 148 in version 2.6.12, 128 in version 2.6.13, 92 in version 2.6.14, 96 in version 2.6.15, 122 in version 2.6.16, 137 in version 2.6.17, 140 in version 2.6.18, 135 in version 2.6.19, 95 in version 2.6.20, 136 in version 2.6.21, 153 in version 2.6.22, 179 in version 2.6.23, 179 in version 2.6.24, and 304 in version 2.6.25".
It was pointed out that the statistics were artificially inflated due to name misspellings and other variations, and that many of the listed people are actually still working with the Linux kernel. Greg KH added, "well, you do know that the distribution of all of our users are: 50% only contributed 1 patch; 25% contributed 2; 12% contributed 3; 6% contributed 4 and so on?" He went on to point out:
"Our curve is leveling out much better now though. For the whole 2.5 release, the top 30 people did over 80% of the work. Now, the top 30 people are doing 30% of the work. So it is getting much better, as long as we still continue to keep our massive rate of change that we have going, and huge number of developers, we should be fine. So this list doesn't necessarily mean anything is wrong, only that 50% are one-time contributors. And I think that shows we are easy to get a change into our tree from just about anyone, not that we are driving people away."
"In the early days, the project was conceived as a way of getting fresh blood into kernel development by giving them fairly simple but generally useful tasks and hoping they'd move more into the mainstream," began James Bottomley starting a thread titled Fixing the Kernel Janitors project. He continued, "if we wind forwards to 2008, there's considerable and rising friction being generated by janitorial patches,", references a recent thread complaining about worthless patches hitting the lkml. Later in the thread he added:
"That's why I think we have to change the process. If we keep the Janitors project, then the bar has to be raised so that it becomes more participatory and thought oriented (i.e. eliminate from the outset anyone who is not going to graduate from mechanical changes to more useful ones). That's why I think bug finding and reporting is a better thing to do. There are mechanical aspects to finding bugs but it is a useful service. Bug fixing is participatory because we usually do quite a lot of back and forth between the reporter and the fixer and at the end of the day quite a few people get curious about how the bug was triggered in the first place and actually go off and read the code."
"Is there a write up of what you consider the 'proper' git workflow?" Theodore Ts'o asked Linux creator Linus Torvalds, "why do you consider rebasing topic branches a bad thing?" Linus replied, "rebasing branches is absolutely not a bad thing for individual developers. But it *is* a bad thing for a subsystem maintainer." He went on to differentiate between 'grunts' who write the code and 'managers' who primarily collect other people's code, "a grunt should use 'git rebase' to keep his own work in line. A technical manager, while he hopefully does some useful work on his own, should strive to make _others_ do as much work as possible, and then 'git rebase' is the wrong thing, because it will always make it harder for the people around you to track your tree and to help you update your tree." Linus compared his own patch management style and productivity from over six years ago before he started using BK and git, to his current style using git:
"You can either try to drink from the firehose and inevitably be bitched about because you're holding something up or not giving something the attention it deserves, or you can try to make sure that you can let others help you. And you'd better select the 'let other people help you', because otherwise you _will_ burn out. It's not a matter of 'if', but of 'when'. [...] And when you're in that kind of ballpark, you should at least think of yourself as being where I was six+ years ago before BK. You should really seriously try to make sure that you are *not* the single point of failure, and you should plan on doing git merges. [...] I think a lot of people are a lot happier with how I can take their work these days than they were six+ years ago."
"This is starting to get beyond frustrating for me," complained David Miller of the latest merge window, launching what turned into a very lengthy and ongoing discussion about the Linux kernel development process. The concept of a regular "merge window" was first discussed in July of 2005 with the release of the 2.6.14-rc4 kernel, following the 2005 Developers' Summit. From 2.6.14 on, the release of each official 2.6.y kernel has been followed by a two week period during which major changes are merged into the kernel, followed by a 2.6.y-rc1 release. David complained that this particular merge window has been more painful than others, "the tree breaks every day, and it's becoming an extremely non-fun environment to work in. We need to slow down the merging, we need to review things more, we need people to test their [...] changes!"
During the lengthy discussion, Linux creator Linus Torvalds explained:
"The notion that we should even _try_ to aim to slow things down, that one I find unlikely to be true, and I don't even understand why anybody would find it a logical goal? Of course, you will have fewer new bugs if you have fewer changes. But that's not a goal, that's a tautology and totally uninteresting. A small program is likely to have fewer bugs, but that doesn't make something small 'better' than something large that does more. Similarly, a stagnant development community will introduce new bugs more seldom. But does that make a stagnant one better than a vibrant one? Hell no. So what I'm arguing against here is not that we should aim for worse quality, but I'm arguing against the false dichotomy of believing that quality is incompatible with lots of change."
"We've got ourselves a developing bureaucracy. As in 'more and more ways of generating activity without doing anything even remotely useful'. Complete with tendency to operate in the ways that make sense only to bureaucracy in question and an ever-growing set of bylaws..."
Discussing the latest breakage of the linux-next tree, Stephen Rothwell noted that the problem went unnoticed due to the arm tree not currently being included, "this is why I would have liked you to participate in the linux-next tree ...". Arm maintainer Russell King questioned the usefulness, saying, "linux-next will not give me anything which -mm isn't giving me. As I said in the discussion, linux-next value is _very_ small for me. Sorry but true." Several stepped in to offer some reasons that the linux-next tree is useful.
Andrew Morton noted, "putting arm into linux-next means that Stephen (and git) handle the merges rather than having me (and not-git) do it. Which helps me. I expect that linux-next will get a lot more cross-compilation testing than -mm. Which helps you." Greg KH added, "getting your stuff into linux-next would provide a public place for others to base off of, making it easier for them to send patches to you ensuring that they apply properly. Which in the end, will help others be able to contribute easier, and help you by getting patches you do not need to rebase yourself." Stephen Rothwell summarized the advantages for a maintainer:
"5 times a week your tree gets merged with lots of other code destined for Linus' next release. From this you get to find out about things in other trees that clash with yours. This tree gets built on several architectures for several configs (including arm). So you find out if other trees will break yours. I am happy to build more (basically all) the arm configs as I have offered before."
A thread on the Linux Kernel mailing list discussed the process in place for reporting, bisecting and fixing bugs. In response to a suggestion that some of the issues could be solved by introducing new procedures, Al Viro retorted, "we've got ourselves a developing bureaucracy. As in 'more and more ways of generating activity without doing anything even remotely useful'. Complete with tendency to operate in the ways that make sense only to bureaucracy in question and an ever-growing set of bylaws..." Later in the thread, David Miller agreed and noted that, "the resulting 'bureaucracy' or whatever you want to call it is perceived to undercut the very thing that makes the Linux kernel fun to work on. It's still largely free form, loose, and flexible. And that's a notable accomplishment considering how much things have changed. That feeling is why I got involved in the first place, and I know it's what gets other new people in and addicted too."
Andrew Morton tried to return the discussion to its original topic, "the problem we're discussing here is the apparently-large number of bugs which are in the kernel, the apparently-large number of new bugs which we're adding to the kernel, and our apparent tardiness in addressing them." Al noted that some of the problem is that git is so efficient at merging code, "the patches going in during a merge (especially for a tree that collects from secondaries) are not visible enough. And it's too late at that point, since one has to do something monumentally ugly to get Linus to revert a large merge. On the scale of Great IDE Mess in 2.5..." Another suggestion was made to replace bugzilla with something better, to which Andrew replied, "swapping out bugzilla for something else wouldn't help. We'd end up with lots of people ignoring a good bug tracking system just like they were ignoring a bad one."
"We've always had some pending/unresolved issues, and I think that as our tracking gets better, there's likely to be more of them. A number of bug-reports are either hard to reproduce (often including from the reporter) or end up without updates etc.
"It really helps if submitters tell us that a patch fixes another pending patch, and which one that is. Usually I have to ask if I can't work it out. But if a) we weren't told that and b) I have no reason to think it's not a mainline problem and c) the patch applies to mainline and d) the patch affects an architecture which I'm not cross-compiling for, it's going to sneak through.
"No cute April 1st shenanigans, just a regular -rc release that happened to come up today because I was waiting for the input layer oops-fixes to be ready and tested," began Linus Torvalds, announcing the 2.6.25-rc8 kernel on April 1st. He continued, "the bulk of the fixes are the usual random one-liners. [...] A lot of the one-liners are some sparse cleanups, which is probably unnecessary noise at this point, but when Al sends me a series I just tend to apply it because his patches tend to be rather careful and basically always correct." Linus added:
"The big thing that is actually *noticeable* to most people is that this should fix the two top regressions: we've had some suspend-resume regressions due to the stupid ACPI _PTS ordering issues, and while the cleanups were left, the ordering changes were reverted. So that should fix issues for some people (of course, the people who had it fixed are unhappy, but regressions are worse). The other thing that bit a number of people and is now fixed (and that also probably often showed up as a suspend/resume regression) was some 'struct device' lifetime changes that broke the input layer. Thanks to people who debugged that one."
"Now that we are (presumably) approaching the next merge window, can I ask what use (if any) will you be making of the linux-next tree? Alternatively, is there any information you want from it?" Stephen Rothwell asked regarding the tree he started maintaining last month for tracking upcoming stable merges.
Andrew Morton replied, "afaict it's already working. The level of merge and build errors in the subsystem trees this time around is a tiny fraction of what it was at the same stage in 2.6.24-rcX." He went on to note, "there are 60 or 80 'subsystem' trees hosted in -mm at present," adding:
"I need to find a way to a) get matureish parts of those trees into linux-next and to b) base the rest of -mm off linux-next. I haven't started thinking about that yet. There seem to be some trees which aren't yet in linux-next, some of them significant."