Better Scalability Central to New Linux Kernel

Improved I/O, scalability in SMP environments, due this summer

By all accounts, the upcoming version 2.6 release of the Linux kernel could do much to enhance the scalability of the open source operating system in large symmetric multiprocessing (SMP) environments.

In addition, the Linux 2.6 kernel will pack a variety of I/O enhancements that should make Linux more suitable for transaction-oriented and other I/O-intensive applications.

The upshot, says Jonathan Eunice, principal and IT advisor with consultancy Illuminata, is that version 2.6 of the Linux kernel compares favorably with the maturity of other major Unix variants at this stage of development. “The latest 2.6 compares nicely with later operating systems that I’ve seen. It just beats the hell out of the original Unix, just in terms of how well commented and how structured [it is]. [For example,] they’re using structured mechanisms for dealing with locks and for concurrency control. It’s very clean code.”

For scalability in large SMP systems, the development branch of the Linux kernel (version 2.5.x) currently includes an order-one, or O(1), scheduler that enhances Linux’s scalability and overall performance by improving throughput. Kernel series with an odd minor version number (e.g., v2.5.x) are developer-only releases that are typically unsupported by Linux vendors. Series with an even minor version number (e.g., v2.6.x) are stable versions that are supported by Linux vendors.

According to Nick Bowen, vice president of pSeries and xSeries software development with the IBM Systems Group (ISG), the O(1) scheduler will officially debut in the Linux 2.6 kernel and will allow for multiple priority queues for each CPU, load-balancing across CPUs and NUMA-aware load-balancing, including intranode and internode balancing.
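To make the idea of per-CPU priority queues concrete, here is a minimal sketch in Python. It is an illustration only, not the kernel's code: the class and function names are hypothetical, the figure of 140 priority levels is an assumption that does not come from this article, and the NUMA-aware balancing Bowen mentions is left out. Each CPU would own one RunQueue; a bitmap records which priority levels hold runnable tasks, so the cost of picking the next task depends on the fixed number of levels rather than on how many tasks are waiting.

```python
from collections import deque

NUM_PRIORITIES = 140  # assumed number of levels; lower number = higher priority


class RunQueue:
    """Toy per-CPU run queue: one FIFO per priority plus a bitmap of non-empty levels."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p is set while queues[p] holds at least one task
        self.count = 0

    def enqueue(self, task, priority):
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority
        self.count += 1

    def pick_next(self):
        """Return the oldest task at the best non-empty priority, or None if idle."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; its position is the best priority level.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << priority)
        self.count -= 1
        return task


def balance(busiest, idlest):
    """Move low-priority tasks between two run queues until they differ by at most one."""
    moved = 0
    while busiest.count - idlest.count > 1:
        priority = busiest.bitmap.bit_length() - 1  # worst non-empty level
        task = busiest.queues[priority].pop()
        if not busiest.queues[priority]:
            busiest.bitmap &= ~(1 << priority)
        busiest.count -= 1
        idlest.enqueue(task, priority)
        moved += 1
    return moved
```

In this sketch, a scheduler loop on each CPU would call pick_next() on its own queue, and a periodic balance() between the most and least loaded queues stands in for load-balancing across CPUs.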

In addition to the O(1) scheduler, Bowen says, the Linux 2.6 kernel will include enhancements that enable larger virtual memory support—up to 32 GB—and facilitate more scalable page-handling support, among other features. “Typically, most operating systems support four KB pages, which tends to be one of the scalability inhibitors, because as you use more and more CPUs, you drive the requirement for more and more memory, and as you drive the usage of more and more memory, [you drive the requirement for] things like page fault handling,” Bowen explains. “When you do that on the granularity of 4 KB, you do that a lot, but if you have 2 MB page support, that will dramatically decrease the times that the kernel has to get involved to do it.”
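Bowen's point about page granularity comes down to simple arithmetic, sketched below in Python. The 32 GB figure is taken from the article; treating it as the amount of memory to be mapped, and counting one page per fault, are simplifying assumptions made for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def pages_needed(memory_bytes, page_bytes):
    """Number of pages required to cover memory_bytes, rounding up."""
    return -(-memory_bytes // page_bytes)


memory = 32 * GB
small_pages = pages_needed(memory, 4 * KB)
large_pages = pages_needed(memory, 2 * MB)

print(small_pages)                 # 8388608 pages at 4 KB
print(large_pages)                 # 16384 pages at 2 MB
print(small_pages // large_pages)  # 512 times fewer pages to manage
```

Under these assumptions, covering the same memory with 2 MB pages takes 512 times fewer pages, which is the reduction in kernel involvement that Bowen describes.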

Jason Pettit, a product line manager with Silicon Graphics Inc. (SGI), says another new enhancement expected in the Linux 2.6 kernel—but present now in version 2.5—is kernel lock support. “That really helps with the ability to scale a system, to be able to take advantage of all of the processors that it has by not allowing one processor to take over a single processor.”

The 2.6 release of the Linux kernel could include support for a new threading model, dubbed the Native POSIX Thread Library (NPTL) for Linux, which supplants the old LinuxThreads library. NPTL is expected to deliver a genuine performance boost in SMP environments, and has been accepted into the version 2.5 development branch of the Linux kernel. Moreover, IBM, which had previously announced plans to work on another POSIX-based threading model, has announced support for NPTL.

There’s still some debate about the native scalability of Linux on SMP systems. IBM’s Bowen, for example, says Linux scales very well across two-, four- and eight-way hardware, and that with the Linux 2.6 kernel, Big Blue expects to achieve good linear scalability on 16-way xSeries Intel-based and pSeries Power-based hardware. “We have a very specific project that we’re working on with the goal of in the 16-way space, 75 percent scalability.”

To do that, Bowen explains, IBM is “doing what OS people do”—tuning code and measuring performance to ferret out bottlenecks in the code.
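One way to read the 75 percent goal, offered here as an assumption rather than as IBM's stated definition, is as scaling efficiency: the speedup over a single processor divided by the number of processors. The short Python sketch below, with hypothetical function names, shows what that reading implies for a 16-way system.

```python
def scaling_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup actually achieved (assumed definition)."""
    return speedup / cpus


def speedup_for(efficiency, cpus):
    """Speedup over one CPU implied by a given efficiency on a given CPU count."""
    return efficiency * cpus


print(speedup_for(0.75, 16))  # 12.0
```

Under that reading, a 16-way server would need to deliver roughly the throughput of 12 single-processor systems to meet the goal.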

As far as SGI’s Pettit is concerned, version 2.4.x of the Linux kernel—with certain enhancements—is good enough for his company’s 64-processor Itanium-based Altix 3000 systems. Pettit says SGI currently supports a version of Linux—pegged to binary compatibility with code from Linux specialist Red Hat Inc.—that is based on version 2.4.19 of the Linux kernel. To address some of the 2.4.x kernel’s shortcomings, Pettit says that SGI backported development code from 2.5.x to support an O(1) scheduler. “The old story that Linux didn’t scale was much more closely linked to the [32-bit Intel-based] platforms that it had been running on. We run our current 2.4.19 kernel on a 64 processor system, and we’ve had very nice system utilization and processor scaling with that.”

SGI’s Altix system is intended primarily for high-performance computing (HPC) or scientific computing roles. The requirements of applications in this space, which are typically easy to distribute among nodes or logical partitions in a cluster, differ from those of many business environments, however. Illuminata’s Eunice says that this could help to account for the differing perspectives of Big Blue and SGI.

More importantly, adds Eunice, arguments of this kind are beside the point. The sweet spot for the current version of the Linux kernel—2.4.x—is in the one-, two- and four-way spaces. That’s where most customers are deploying Linux-based systems, even on large, partitioned SMP systems. “I feel totally comfortable with the 2.4 release going on a two-way or four-way server. I think that 2.6 will go on eight-way and above, but I think that the majority of users are still going to be using one-way or two-way servers, which are still going to be the most cost efficient way to run a lot of different workloads.”

SMP scalability aside, version 2.6 of the Linux kernel should include a number of significant I/O enhancements as well, particularly in the realm of SCSI performance. Linux 2.6 will implement major changes to both the block I/O layer and the SCSI layer to improve performance, scalability, and error recovery. Enhancements include the removal of the global io_request_lock and its replacement with a per-host instance lock; the reduction of bounce buffers for high-memory systems; per-CPU SCSI I/O completions; and changes to SCSI resource allocations to allow for dynamic additions and removals.
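The difference between one shared lock and a lock per host adapter can be sketched in a few lines of Python. This is a user-space analogy, not kernel code: the class names, the host names scsi0 and scsi1, and the helper function are all hypothetical, and Python's threading.Lock stands in for a kernel spinlock.

```python
import threading


class GlobalLockLayer:
    """Every host adapter shares one lock, as with a single global request lock."""

    def __init__(self, hosts):
        self._lock = threading.Lock()

    def lock_for(self, host):
        return self._lock


class PerHostLockLayer:
    """Each host adapter gets its own lock, so unrelated hosts do not contend."""

    def __init__(self, hosts):
        self._locks = {host: threading.Lock() for host in hosts}

    def lock_for(self, host):
        return self._locks[host]


def can_run_concurrently(layer, host_a, host_b):
    """Return True if host_b's lock can be taken while host_a's lock is held."""
    with layer.lock_for(host_a):
        other = layer.lock_for(host_b)
        acquired = other.acquire(blocking=False)
        if acquired:
            other.release()
        return acquired


hosts = ["scsi0", "scsi1"]
print(can_run_concurrently(GlobalLockLayer(hosts), "scsi0", "scsi1"))   # False
print(can_run_concurrently(PerHostLockLayer(hosts), "scsi0", "scsi1"))  # True
```

With a single lock, work on one adapter blocks work on every other adapter; with a lock per host, processors servicing different adapters can proceed in parallel, which is why finer-grained locking matters more as the processor count grows.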

The result, suggests SGI’s Pettit, is that the Linux 2.6 kernel should boast drastically improved SCSI performance—especially for I/O intensive applications. “The current 2.4 kernel SCSI layer does not perform at a level that is really appropriate for high-performance computing, and the work that has been done in the community to improve that SCSI layer in 2.6 looks exciting.”

Exciting, no doubt, but when will the 2.6 kernel officially be released? All signs point to sometime in June, largely because Linux founder and project leader Linus Torvalds disclosed in November 2002 that the 2.6 release of the Linux kernel would appear by the end of Q2 2003. This doesn’t mean that vendors such as IBM and SGI will begin supporting it in their systems at that time, however. Says Pettit: “We’re looking forward to 2.6, but we are going to track what the Linux distributors do, and that’s very hard to predict. Typically, there’s a good bit of stabilization that goes on [after a major new release of the Linux kernel], and we will track what the distributors do.”

About the Author

Stephen Swoyer is a Nashville, TN-based freelance journalist who writes about technology.