From: "Inker, Evan"
Subject: [hangout] Linux v2.6 scales the enterprise
Date: Thu, 5 Feb 2004 15:22:11 -0000
List: New Yorker GNU Linux Scene
Linux v2.6 scales the enterprise
Bigger, stronger kernel sizzles in our performance tests
By Paul Venezia January 30, 2004
If commercial Unix vendors weren't already worried about Linux, they should
be now. Linux has seen wide deployment in datacenters, generally as a Web
server or a file server, or to handle network tasks such as DNS and DHCP,
but not as a platform for running mission-critical enterprise applications.
Solaris, AIX, or HP-UX typically gets the nod when an application demands the
highest levels of performance and scalability. The recent release of a new
Linux kernel, v2.6, promises to change that.
The v2.6 kernel ushers in a new era of support for big iron with big
workloads, opening the door for Linux to handle the most demanding tasks
that are currently handled by Solaris, AIX, or HP-UX. The new kernel not
only supports greater amounts of RAM and a higher processor count, but also
changes the core of device management. Before v2.6, limits within the kernel
could constrain large systems, such as a 65,536 process limit before
rollover and 256 devices per chain. The v2.6 kernel moves well beyond these
limitations, and it includes support for some of the largest server
architectures around.
Will the new Linux really perform in the same league as the big boys? To
find out, I put the v2.6.0 kernel through several real-world performance
tests, comparing its file server, database server, and Web server
performance with a recent v2.4 series kernel, v2.4.23.
Linux Meets Big Iron
A primary focus of the v2.6 kernel is large server architectures. Support
for up to 64GB RAM in paged mode, the ability to address file systems larger
than 2TB, and support for 64 CPUs in x86-based SMP systems bring this
kernel and Linux into the more rarefied air of truly mission-critical
systems. Also new are support for NUMA (Non-Uniform Memory Access) systems,
a next-generation SMP architecture, and PAE (Physical Address Extension),
which provides support for up to 64GB of RAM on 32-bit systems.
There is much more to v2.6 than just bigger numbers in processor and RAM
counts, however. This kernel breaks apart some of the artificial limitations
that have been present in Linux from the beginning, such as the number of
addressable devices and total available PIDs (process identifiers). The
v2.4 kernel supported 255 major devices with 255 minor numbers. (For
example, a volume on a SCSI disk located at /dev/sda3 has a major number of
8, since it's a SCSI device, and a minor number of 3.) On servers with a
large number of real or virtual devices, device allocation can become
problematic. The v2.6 kernel addresses these issues in a big way, moving to
4,096 major devices with more than one million subdevices per major device.
For most users, these numbers are well beyond practical limits, but for
enterprise systems with a need to address many devices, it's a major step.
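The jump in device counts follows from wider device-number fields. As a rough sketch, assuming the commonly documented widths (8 bits each for major and minor numbers in v2.4, a 12-bit major and 20-bit minor in v2.6), the arithmetic works out as follows. The helper names and the simple major-above-minor packing are illustrative assumptions, not the encoding the kernel exposes to userspace.

```python
# Assumed field widths in bits (illustrative; not taken from the article).
V24_MAJOR_BITS, V24_MINOR_BITS = 8, 8
V26_MAJOR_BITS, V26_MINOR_BITS = 12, 20


def capacity(major_bits, minor_bits):
    """Return (number of major values, number of minor values per major)."""
    return 1 << major_bits, 1 << minor_bits


def pack_dev(major, minor, minor_bits):
    """Pack a major/minor pair into one integer, major in the high bits."""
    if minor >= (1 << minor_bits):
        raise ValueError("minor number does not fit in the minor field")
    return (major << minor_bits) | minor


def unpack_dev(dev, minor_bits):
    """Split a packed device number back into (major, minor)."""
    return dev >> minor_bits, dev & ((1 << minor_bits) - 1)


# /dev/sda3 from the example above: major 8 (SCSI disk), minor 3.
sda3 = pack_dev(8, 3, V26_MINOR_BITS)
print(unpack_dev(sda3, V26_MINOR_BITS))
print(capacity(V26_MAJOR_BITS, V26_MINOR_BITS))
```

With 12 and 20 bits, `capacity` yields 4,096 major numbers and 1,048,576 minor numbers per major, which matches the figures quoted for v2.6.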
Also new in v2.6 is NPTL (Native POSIX Threading Library) in lieu of v2.4's
LinuxThreads. NPTL brings enterprise-class threading support to Linux, far
surpassing the performance offered by LinuxThreads. As of October 2003, NPTL
support was merged into the GNU C library, glibc, and Red Hat first
implemented NPTL within Red Hat Linux 9 using a customized v2.4 kernel.
Other goodies in the v2.6 kernel include integrated IPSec support, with the
inclusion of the Kame Project; enhanced support for network file systems,
including support for mounting Novell NetWare shares; initial NFSv4 (Network
File System Version 4) support; and performance and compatibility
enhancements with SMB (Server Message Block) shares, including support for
CIFS (Common Internet File System). The v2.6 kernel also sports a brand new
security architecture that departs somewhat from the standard Unix root user
concept; its modular security mechanism provides a greater level of
granularity to privileged user management.
Also introduced in the v2.6 kernel is a new approach to devices. The v2.4
kernel's devfs-based device handler has a companion in the v2.6 kernel. The
newcomer, udev, is an implementation of devfs in userspace. Using
udev, the system is able to follow devices as they move around on connected
buses, with the device identifier remaining static. For instance, the
first-seen SCSI device will remain as device sda, using the serial number of
the device as an identifier regardless of the order in which it's found
during a later boot. The use of udev is a significant change at the core of
the kernel and the cause of some consternation among Linux kernel
developers, with solid arguments provided by both sides. It looks like
udev/sysfs will be the standard in the future, deprecating devfs, but both
are present in the v2.6 kernel and are likely to remain for some time.
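The idea of a device keeping its name regardless of discovery order can be illustrated with a toy model. This is only a sketch of the concept: the class name, the serial strings, and the naming policy are hypothetical and do not reflect how udev is actually implemented or configured.

```python
class StableNamer:
    """Toy model: map a device serial number to a name that never changes."""

    def __init__(self, prefix="sd"):
        self.prefix = prefix
        self.names = {}  # serial number -> assigned name

    def name_for(self, serial):
        # A serial seen for the first time gets the next free letter;
        # a serial seen before always gets its original name back.
        if serial not in self.names:
            index = len(self.names)
            if index >= 26:
                raise ValueError("toy model only covers names a through z")
            self.names[serial] = self.prefix + chr(ord("a") + index)
        return self.names[serial]


namer = StableNamer()
first_boot = [namer.name_for(s) for s in ("SERIAL-1", "SERIAL-2")]
# On a later boot the bus reports the same disks in the opposite order.
second_boot = {s: namer.name_for(s) for s in ("SERIAL-2", "SERIAL-1")}
print(first_boot, second_boot)
```

Because the lookup is keyed on the serial number rather than on probe order, the disk first seen as sda keeps that name on the second pass.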
Yet another significant change in the v2.6 kernel is the merging of the
uClinux project into the core kernel. The uClinux project has been focused
on Linux kernel development for embedded devices. The main drive for this
functionality is support of processors lacking MMUs (Memory Management
Units), commonly found in microcontrollers for embedded systems such as fire
alarm controllers or PDAs. The list of embedded controllers that v2.6
supports is quite long, including common processors manufactured by Hitachi,
NEC, and Motorola. This definitely shows a separation from the roots of the
Linux kernel, as all prior kernels were more or less subject to the
limitations of the Intel x86 architecture.
Built for Speed
Prior to the release of the v2.6 kernel, Linux performed tasks on a
first-come, first-served basis; interrupting the kernel midtask to handle
another process or function was not in the cards. The v2.6 kernel, however,
can be pre-empted when needed, and can allocate resources for a process that
requires immediate attention, then resume processing on the interrupted
task. These interruptions are measured in fractions of a second, and are not
generally noticeable, but rather lend an overall feeling of smoothness to
system performance. The v2.6 kernel does not bring Linux to the point of
being a real-time operating system, but it goes a long way toward assuring
that tasks are addressed and completed when required.
At the core of these enhancements is a new process scheduler. The process
scheduler in the kernel divides CPU resources among system processes. The
performance of the scheduler directly impacts system responsiveness and
process latency. In the v2.6 kernel, the new O(1) scheduler incorporates new
algorithms that can substantially increase system performance, especially
for interactive tasks. The O(1) scheduler penalizes CPU-hogging processes,
improves process prioritization, and provides consistent performance across
all processes. Also new in v2.6 are two I/O schedulers. The scheduler
used in the v2.6 kernel by default, the anticipatory scheduler, brings much
improved handling of I/O scheduling, ensuring that processes get I/O time
when necessary, without unnecessary queuing. Also present is the deadline
scheduler, which assigns an expiration to requests using three queues, while
the anticipatory scheduler attempts to anticipate process I/O requests before
they are actually issued.
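The deadline idea described above (an expiration per request, tracked across three queues) can be sketched as a toy model. Everything here is an assumption made for illustration: the class and field names, the expiry windows, and the simplification that the sorted queue always serves the lowest sector. It is not the kernel's code.

```python
import bisect
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    seq: int         # arrival order, used as a unique id
    sector: int      # disk position the request targets
    is_read: bool
    deadline: float  # time by which the request should be served


class ToyDeadlineScheduler:
    # Assumed expiry windows in seconds; reads get the tighter deadline.
    READ_EXPIRE = 0.5
    WRITE_EXPIRE = 5.0

    def __init__(self):
        self.by_sector = []        # queue 1: (sector, seq), kept sorted
        self.read_fifo = deque()   # queue 2: reads in arrival order
        self.write_fifo = deque()  # queue 3: writes in arrival order
        self.pending = {}          # seq -> Request not yet dispatched
        self._next_seq = 0

    def add(self, sector, is_read, now):
        expire = self.READ_EXPIRE if is_read else self.WRITE_EXPIRE
        req = Request(self._next_seq, sector, is_read, now + expire)
        self._next_seq += 1
        bisect.insort(self.by_sector, (sector, req.seq))
        (self.read_fifo if is_read else self.write_fifo).append(req)
        self.pending[req.seq] = req
        return req

    def _take(self, req):
        # Mark the request as served and drop it from the sorted queue.
        del self.pending[req.seq]
        self.by_sector.remove((req.sector, req.seq))
        return req

    def dispatch(self, now):
        if not self.pending:
            return None
        for fifo in (self.read_fifo, self.write_fifo):
            # Discard FIFO entries already served via the sorted queue.
            while fifo and fifo[0].seq not in self.pending:
                fifo.popleft()
            # An expired request at the head of a FIFO jumps the line.
            if fifo and fifo[0].deadline <= now:
                return self._take(fifo.popleft())
        # Nothing has expired: serve in sector order to limit seeking.
        sector, seq = self.by_sector[0]
        return self._take(self.pending[seq])
```

In this model a request far from the current batch of work cannot starve: once its deadline passes, it is served ahead of the sector-ordered queue.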
There has been much debate over the I/O scheduler used in this kernel, and
both schedulers are supported; the choice is made at boot time with options
passed to the kernel. The importance of scheduler performance cannot be
overstated. My tests show that the anticipatory scheduler in v2.6 surpasses
the v2.4 scheduler handily. Some of my tests show a nearly tenfold
performance increase. For
instance, a simple read of a 500MB file during a streaming write with a 1MB
block size on my Xeon-based test system took 37 seconds with v2.4.23, and
3.9 seconds with v2.6. The deadline scheduler also performs quite well, but
may not be as fluid for certain workloads as the anticipatory scheduler.
Either way, the new process and I/O schedulers blow v2.4's schedulers out of
the water.
In addition to the new scheduler, v2.6 has plenty of other major
architectural changes. The module handling code has been completely
rewritten, requiring a new set of userspace module utilities and mkinitrd
packages to function. These can be found as updates to most major Linux
distributions or via download. The new modutils and module kernel code is
much smoother than that found in v2.4, and permits a kernel to be compiled
without support for module unloading to ensure the integrity of the
Clocking the New Kernel
To test the new kernel, I opted for scenarios that would be most appropriate
for real-world users. Testing individual portions of the kernel, such as
disk I/O, memory management, and so on could be interesting, but what does
it mean for the overall system performance? In order to get the big picture,
I selected a few tests representative of expected server workloads and used
them to compare the performance of the v2.6 and v2.4 kernels.
Tests were run on three separate hardware platforms: Intel Xeon (x86), Intel
Itanium (IA-64), and AMD Opteron (x86_64). The x86 tests were conducted on
an IBM eServer x335 1U rack-mount server with dual 3.06GHz P4 Xeon
processors and 2GB of RAM. The Itanium tests were run on an IBM eServer x450
3U rack-mount server with dual 1.5GHz Itanium 2 processors and 2GB of RAM.
And the Opteron tests were run on a Newisys 4300 3U rack-mount server with
dual 2.2GHz Opteron 848 processors and 2GB of RAM.
The base OS distribution used was Red Hat Linux Enterprise Server v3.0, but
the kernel testing relied on custom kernels compiled on each server. The
v2.4 tests utilized the official v2.4.23 kernel, and the v2.6 tests utilized
the official v2.6.0 kernel. Only the required modules and options were
compiled, and there were no other modifications made to the kernels, other
than those necessary for compilation on the various platforms, such as the
x86_64 patches for AMD64 from x86-64.org.
The file-sharing test was designed to mimic a standard Samba server
workload, and is based on Samba v3.0.1 with local authentication. The test
harness utilized the smbtorture tools included in the Samba package and was
run over Gigabit Ethernet. The tests were conducted with a simulation of 12
SMB clients communicating with a central server. The results of these tests
are almost too good to believe.
On the Xeon system, the v2.4 kernel pushed 38.85MBps on average, and the
v2.6 kernel pushed 67.30MBps -- a 73 percent improvement. The Itanium tests
show similar performance differences between the kernels, giving v2.6 a 52
percent gain, albeit with smaller overall figures. And on the Opteron
system, which really showed its muscle in this test, the results were a
respectable 49.37MBps on the v2.4 kernel and an impressive 72.92MBps under
v2.6, an increase of roughly 48 percent.
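The percentage gains quoted for the Samba test can be checked with a few lines of arithmetic. This is a minimal sketch; the function name is illustrative, and the throughput figures are the ones reported above.

```python
def percent_gain(old, new):
    """Relative improvement of new over old, expressed in percent."""
    return (new - old) / old * 100.0


# Throughput in MBps as reported for the v2.4 and v2.6 kernels.
xeon_gain = percent_gain(38.85, 67.30)
opteron_gain = percent_gain(49.37, 72.92)
print(round(xeon_gain), round(opteron_gain))
```

Rounded to whole numbers, the results agree with the 73 percent and 48 percent figures cited for the Xeon and Opteron systems.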
The performance gains seen in the Samba tests are likely related to the
vastly improved scheduler and I/O subsystem in the v2.6 kernel. Disk I/O and
network I/O form the core of this test, and the performance improvements in
the v2.6 kernel are very visible here.
The database tests were also enlightening. The test scenario was based on
MySQL v3.23.58 and was run with the sql-bench test suite provided by MySQL.
All tests were run from a remote server to remove the impact of the client
suite running on the same server. In these tests the v2.6 kernel handily
beat the v2.4 kernel. The numbers in the chart represent the total amount of
time it took the systems to complete eight test procedures, but they do not
show the individual numbers from each tested procedure. All eight tests in
the sql-bench package were run on both kernels on all three hardware
platforms.
Across the board, the v2.6 kernel outperformed the v2.4 kernel in the
database tests, especially on the Itanium box, where it posted a speed
increase of 23 percent (a 519-second lead) over the v2.4 kernel. On the Xeon
platform, v2.6 showed almost a 13 percent gain (a 200-second lead) over
v2.4. And on Opteron, it registered a 29 percent speed increase (a
415-second lead) over v2.4. The most impressive individual test was table
inserts, showing the v2.6 kernel providing a 10 percent performance increase
(with a 100-second lead) over v2.4 on Xeon, with even better results found
on the Opteron and Itanium platforms.
The Web server tests also showed significant improvement. The static page
test used a 21.5KB HTML page with two 25KB images served by Apache 2.0.48.
The test was measured in requests per second using Apache's ab benchmarking
tool. The Xeon tests show the v2.6 kernel outperforming v2.4 by just under
1,000 requests per second, a 40 percent increase. The Itanium tests showed
v2.6 providing a 47 percent performance increase, while the Opteron tests
showed a 7 percent increase. It should be noted that the Opteron system
outperformed the other two servers by more than 1,000 requests per second
with the v2.4 kernel, and the smaller increase may be due to network
bandwidth constraints imposed on the server. In retrospect, I believe that
if I upped the network connectivity of the Newisys box with bonded Gigabit
Ethernet NICs, I could push it even faster.
My Web application tests were conducted using a custom CGI script written in
Perl, referencing a MySQL database running on the same system. The script
ran a single select on a column in the database, returning 97 rows of eight
columns, including one image. Again, Apache's ab was used to measure
performance. The overall numbers showed smaller performance increases than
the static tests, with the exception of the Opteron tests, but the 14
percent to 22 percent performance increases across all platforms are
My tests were geared to show the performance differences between the two
kernels on each hardware platform, not to compare the platforms. That said,
the Opteron's performance was outstanding; both the v2.4 and v2.6 kernels
posted impressive results across all tests but most dramatically in the
MySQL tests, showcasing the 64-bit support in v2.6. Overall, the v2.6 kernel
shows very impressive performance gains over v2.4, itself a well-performing
kernel.
While I didn't run into many problems with the v2.6 kernel, there are a few
notable issues with the initial release. For example, the drivers for LSI
Logic's Fusion-MPT RAID controllers have some serious I/O problems in a
RAID1 configuration. When drives are addressed individually, there are no
issues, but this is a significant hindrance to v2.6 adopters running with
Fusion-MPT RAID controllers. These RAID modules are also problematic in the
v2.6 kernel for Opteron, causing a panic unless iommu=merge is passed to the
kernel at boot.
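To confirm that a workaround such as iommu=merge actually reached the kernel, the boot command line can be inspected on a running system. The sketch below assumes the command line is readable from /proc/cmdline and uses a simplified parser that ignores quoted values; the function names and the sample command line are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of option -> value.

    Bare flags such as "ro" map to None. Quoted values containing
    spaces are not handled in this simplified sketch.
    """
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def has_option(cmdline, key, value):
    """True if key=value appears on the given command line."""
    return parse_cmdline(cmdline).get(key) == value


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


# Hypothetical command line, used here instead of the live system's.
sample = "root=/dev/sda1 ro iommu=merge"
print(has_option(sample, "iommu", "merge"))
```

On a live system, `has_option(read_cmdline(), "iommu", "merge")` would perform the same check against the kernel that is currently booted.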
Further, on the Xeon platform, the v2.6 kernel compiles straight from the
official source without a hitch, but not so on Itanium and Opteron. Although
support for these platforms is present in the kernel, patches from specific
platform development efforts are required to compile v2.6. Once built, the
kernel boots normally, but requires the updated mkinitrd and modutils
packages to fully function. Other than the driver-related problems, the v2.6
kernel compiled, booted, and ran without problems on all three platforms,
handling with aplomb every test I threw its way.
Where From Here?
Today, the vast majority of production Linux systems run a version of the
v2.4 kernel. Those satisfied with the performance and functionality of this
kernel are not likely to make any sudden changes. If it ain't broke, don't
fix it. IT shops running big databases and other mission-critical
applications on v2.4 shouldn't necessarily jump on the bandwagon immediately
but should definitely begin testing v2.6. The v2.6 kernel is the new boss,
and it behooves any IT department to become familiar with its capabilities
and plan for adoption.
And what of the v2.4 kernel? Marcelo Tosatti, the Brazil-based maintainer of
the v2.4 kernel, has announced on the LKML (Linux Kernel Mailing List) that
once v2.6 is officially released, v2.4 will indeed enter maintenance mode,
without further revision or major modification following the imminent
release of v2.4.25. This stance has been met with some derision within the
kernel development community and also amongst major corporate Linux
sponsors. At the crux of the issue are the major changes in the v2.6 kernel
and the fact that many manufacturers that continue to release binary-only
hardware drivers have been extremely slow to produce drivers for current
v2.4 branch kernels, to say nothing of the nascent v2.6 branch.
Also at issue are the fundamental changes in the core of the v2.6 kernel.
Most applications that function on v2.4 kernels will continue to do so on
v2.6. However, a few of the major changes could affect currently deployed
applications. For this reason, Red Hat, the dominant Linux distribution in
the United States, has decided to forgo official v2.6 kernel support in its
recent Advanced Server and Enterprise Server products, opting to stay with
its highly customized v2.4.21 derivative kernel. However, Red Hat has
back-ported several key elements of the v2.6 kernel into its v2.4.21
Enterprise Linux kernel, such as support for up to 64GB of RAM, 16 CPUs,
IPSec, and NPTL. In this fashion, Red Hat is able to maintain application
compatibility while providing what it considers to be the most desired
features of the v2.6 kernel.
When building server architectures that could make use of the enhancements
of the v2.6 kernel, admins will need to configure and build custom kernels
tuned to their specific workloads. The problem with distribution-specific
kernels is that they tend to differ greatly from the official kernel
releases, both in the default option selections and the patches they
include.
On the upside, these kernels are generally very broad in their hardware
support, as they are configured and built with nearly every module that
could possibly be used to ensure hardware compatibility for target systems.
They also tend to include patches that can either increase or decrease
performance, depending on the server workload. Admins who run these servers
are generally best served by patching, configuring, and building a custom kernel for
their servers, both to ensure hardware compatibility and to squeeze out
performance increases when possible. The base distribution running the
server may require some modifications to accept a v2.6 kernel, such as the
addition of the new modutils and mkinitrd tools, but should otherwise
function normally with a new kernel.
As with any major development effort, bugs remain in the v2.6 kernel, and
are being actively pursued by the kernel developers. As of this writing,
kernel v2.6.2-rc1 is available for download from kernel.org, and it includes
various bug fixes and enhancements over the v2.6 kernel released just a few
weeks ago. The process continues; those considering a move to v2.6 would be
well-advised to test the new kernel thoroughly before any production
deployment.
The Linux kernel has come a long way since Linus Torvalds' announcement of
v0.01 in 1991. The v2.6 kernel boasts many new features as well as major
performance improvements over the v2.4 kernel and is poised to take Linux
into the next stage of the game: true enterprise adoption. To continue
making inroads into the datacenter, Linux must grow with the needs of the
established user base, as well as navigate previously uncharted waters to
appeal to those still looking in from outside. The v2.6 kernel appears to be
up to the task.