His renowned Kernel Report has been presented to audiences worldwide, and this year will mark his fourth appearance at linux.conf.au, in Melbourne, Australia.
Here, Corbet offers Computerworld readers a sneak peek at the major themes behind this year's Kernel Report.
What is the main theme of your talk at Melbourne's Linux Conference?
The real purpose of my talk is to bring attendees up to date on what is happening in the kernel development community. It is a fast-moving project which is very hard to keep up with - the linux-kernel mailing list alone can run to 500 highly technical messages per day. I do follow this community, though, and have gotten reasonably good at summarising what is happening there - and making hand-waving predictions about what will be happening in the near future.
How has the kernel development process gone over the past year? What were the major happenings?
The process is running quite smoothly, with four major kernel releases (2.6.20 through .23) being made, and 2.6.24 being brought close to release. I anticipate it will come out just before linux.conf.au begins.
Major events over the last year include the incorporation of several virtualisation implementations (KVM, Lguest, Xen), the dynamic tick patches, a completely new wireless networking stack, the CFS process scheduler, and much, much more.
Is the flow of patches continuing at a high rate?
It is, in fact, increasing, with the 2.6.24 kernel having the highest patch rate yet. Almost 10,000 separate changesets were merged in this development cycle, resulting in the addition of about 300,000 lines of code - and the modification of many more. One big change (the merger of the i386 and x86-64 architecture code) accounted for a lot of patches, but it was still a tiny part of the whole.
Has the number of developers contributing increased, and has the breakdown of who they work for changed much since last year's Kernel report?
The number of developers is approximately the same - over the course of one year, about 2,000 individual developers will contribute at least one patch to the kernel. I have an unquantified sense, though, that many of these developers are becoming more active.
Once upon a time, the top 20 developers were responsible for a large majority of the code going into any given kernel release. Now they account for barely 20%. When the kernel summit program committee tried to identify the 70 or so most important developers, we had a very hard time narrowing down the list. The development community is quite broad, and is becoming more so.
Are there any new big companies paying for development?
The list of companies has stayed relatively static, though it does get shuffled a bit from one release to the next, depending on what gets merged into the kernel in that cycle. One recent addition is Movial, which has hired a very active kernel developer and immediately found its way onto the top-20 list.
Which companies would you like to see participating or participating more?
I'm not going to name specific companies. But the embedded Linux area as a whole continues to suffer from a relatively low level of involvement in the development process. There are a number of reasons for that, and it is not that hard to understand why managers in those companies might conclude that community participation makes little business sense. But the long-term result is that the interests and needs of those companies have relatively little influence in how the kernel is developed.
So many of us have been trying to carry a message to embedded Linux companies for a while now: if you do not want kernel development to favour large, enterprise deployments over your own needs, you need to join the party and help ensure that next year's kernel will be suitable for next year's products. In the process, you'll get better code and the many benefits that result from letting the community help you make your products better.
There are signs that some companies are beginning to hear that message, but we have some ground to cover yet.

Your notes on the previous Kernel Report read: "Some fear that kernel quality is declining: bugs not getting fixed. Too many features added too quickly. Too little stabilisation time. Kernel developers tend not to agree, but everybody agrees fewer bugs would be better."
How has this situation changed this time around?
In a sense, the situation has not changed at all. There have been no major development process adjustments aimed at reducing bug counts, and the flow of features into the kernel continues unabated. Some developers still worry that the kernel is slowly deteriorating, and that we will wake up one day and find that things have gone too far, that our kernel is unreliable, and that our reputation for quality is gone.
Behind the scenes, though, quite a bit is happening. There is an increasingly fierce focus on preventing regressions and fixing them when they do happen, even if that means backing out features that others want. The reasoning goes like this: if everything which worked before continues to work in the future, it is hard to argue that the quality of the kernel is declining. But if things are allowed to break, then nobody knows for sure.
In support of this effort we have people tracking regressions, working on tools to find (or prevent) bugs, writing test suites, and more. There is also an increased focus on ensuring that patches are properly reviewed before being merged into the mainline.
A lot is happening to make the kernel better. I am confident that, five years from now, we will say that we were able to accept unprecedented amounts of new code at a sustained rate for years while improving the quality of the final product.
Do you still think that Linux distributions should be deciding on when a feature is stable enough to be included into one of their release kernels?
Of course - why would a distributor ship code which it is not confident of being able to support? Perhaps your question is: is there a problem with features going into the kernel which are not sufficiently stable for distributors to enable?
My belief is that this is not happening. By the time a feature makes it into the mainline, it has already seen a significant amount of testing - enough to be usable by the more adventurous distributors. By the time the enterprise distributions get around to shipping a given kernel, the final problems will generally have been worked out of it.
To my knowledge, very few features have been merged into the mainline only to be disabled by distributors shipping a given release. This is especially true of core features - device drivers will always be a little more variable in their readiness. What happens instead is that distributors will ship code which has not yet made it into the mainline - the realtime patch sets are a classic example here. So I do not think that distributors are being asked to pick and choose between questionable features in the mainline kernel.
What are some of the most important works going on now in the kernel?
After a long period of relative quiet, a lot is happening in the file systems area. The ever-increasing size of storage devices is putting some real stresses on current file systems, and the lead time for new file systems can be quite long. File system developers tend to be very conservative folks. The consequences of file system bugs tend to be particularly unpleasant. So the file systems we'll be using five years from now need to be under development now. The good news is that there are some very interesting projects in this area, a number of which will be represented at linux.conf.au.
The realtime patch set is another interesting area. Realtime performance is useful in a number of surprising places. Banks need it, for example, to be able to guarantee response times to trading requests. When getting an order in a few milliseconds late results in the loss of real money, people pay a lot of attention to response times. Much of the realtime work has found its way into the mainline over the last couple of years, but there is still a lot waiting to be merged. Distributors are shipping that code now, though, and I expect we'll see a lot of it heading toward Linus over the coming year.
Finally, improving hardware support is always an important area of work. Over the course of the next year, though, we will see free drivers for most wireless networking chips and most video adapters, which are traditionally the areas with the most problems. This is happening as a result of extensive reverse-engineering effort and a change of mindset at certain vendors. By the end of 2008, I think, most of the hardware hassles will be behind us.
Do you think the current scheduler is doing OK, or is there still room for improvement?
There's always room for improvement. But I have to say that the complete replacement of the process scheduler appears to have gone quite smoothly. There have been very few complaints so far. That may change as the CFS scheduler makes its way out to more users (most non-developers won't be running it yet), but the fundamental structure appears to be quite sound.
Can you see a tighter integration with ZFS happening now that the flame wars have settled down a little?
No. The licensing for ZFS is not compatible with GPLv2, so that code can never make it into the mainline kernel. There are also software patents involved, and Sun does not appear to have any desire to license those patents for the Linux kernel. So there will not be ZFS in Linux in the foreseeable future.
What distribution do you use on your own machines, and do you use the distribution-supplied kernel on them or not?
I run a different distribution on every machine I have, just as a way of seeing what everybody is up to. I also try not to endorse specific distributions. I will say, though, that the desktop that I'm typing on at the moment is running Rawhide - the Fedora development repository.
The laptop I'll take to linux.conf.au has Ubuntu Gutsy, though I may jump onto the Hardy development repository before I travel. Development distributions keep me on the front line of the development process, which is a fun (and sometimes terrifying) place to be.
Once upon a time, I built custom kernels for every machine I deployed; back in the early 1.x days it was almost mandatory. Now I stick with stock kernels on most machines. My desktop is always running the current development kernel, though, so that I can be part of the testing community and make changes of my own. I have a strong belief that, while any testing is good, the best testing is using the software to get real work done. There are a lot of problems which just don't come up in any other situation.
Do people want more DTrace, or have they accepted that SystemTap is the way to go?
What people want is a rock-solid tracing facility that is usable by operations staff. DTrace has a lot of the needed features, so people ask for it. I believe that, in many ways, SystemTap is an even more powerful mechanism than DTrace, but it is currently rough around the edges and difficult to use. So, in a very real sense, SystemTap does not, at this time, satisfy the needs being expressed by a wide range of users.
The good news is that work is being done to make SystemTap better. As we used to say in engineering school: the first 90% is done, now the developers just have to do the other 90% of making it usable. To that end, features like static markers have recently gone into the mainline kernel, making it possible to create a standard set of tracepoints that anybody can use without having to actually know the kernel code.
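For readers curious what a static marker looks like: in the kernel's markers API (as merged for 2.6.24) a developer drops a named, printf-style marker into the code, roughly as sketched below. The event name and variables here are invented for illustration, not taken from any real subsystem:

    #include <linux/marker.h>

    /* Fires the (hypothetical) "subsys_event" marker,     */
    /* attaching two values; it costs almost nothing until */
    /* a tracing tool actually connects a probe to it.     */
    trace_mark(subsys_event, "count %d name %s", count, name);

A tracing tool such as SystemTap can then attach a probe to "subsys_event" by name, without the person writing the trace script having to know where in the kernel source that marker lives.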
A year from now, SystemTap (and the other tracing packages currently under development) should be more than good enough. I hope. But we're not there now.