2.6.33 merge window part 1
User-visible changes include:
- The ftrace framework has seen a number of improvements, including the
ability to trace multiple processes simultaneously,
regular expression support in tracing filters,
tracing of big kernel lock events, and
tracing of accesses and modifications to arbitrary kernel variables.
- Perhaps most significantly (for ftrace), the dynamic probes for ftrace patch set
has been merged, allowing the placement of arbitrary tracepoints at
run time. The "perf" tool has also been extended to be able to place
and use dynamic tracepoints.
- There are many other enhancements to "perf," including a new framework
for benchmark suites, a command to record and analyze kernel memory
allocations, and a generic scripting language hook set.
- Eric Biederman's long quest
to remove binary sysctl() support has finally made it into
the mainline.
- The recvmmsg()
system call has been added.
- The anticipatory I/O scheduler has been removed in favor of CFQ, which
is seen as providing a superset of its features.
- The new, unified block I/O
bandwidth controller has been merged.
- The networking layer has gained support for TCP
cookie transactions, a mechanism which allows faster, more
secure, and more robust initiation of TCP connections.
- The DRBD distributed
block device has been merged.
- New drivers:
- Boards and processors:
ST-Ericsson U8500 boards,
Marvell Dove (88AP510) system-on-chip CPUs,
Palm Centro 685 phones, and
CompuLab CM-T35 boards.
- Networking: TI High End CAN controllers,
Intel Wireless MultiCom 3200 chips,
Ralink rt2800 wireless chipsets,
Microchip MCP251x SPI CAN controllers,
Freescale MSCAN-based CAN controllers, and
Solarflare SFC9000 10G Ethernet controllers.
- Sound: miroSOUND PCM20 radio tuners,
Texas Instruments TPA6130A2 stereo headphone amplifiers,
TI tlv320dac33 codecs,
Asahi Kasei AK4113 and AK4671 codecs,
WM8580 based audio subsystems on SMDK64xx systems,
Wolfson Micro WM8711/L sound devices, and
Raumfeld audio adapters.
- Miscellaneous: GRLIB APBUART serial ports, Oki MSM6242 realtime clock chips, and Ricoh RP5C01 RTCs.
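For readers curious about what recvmmsg() from the list above looks like in use, here is a minimal user-space sketch of a batched receive. It assumes a C library that exposes a recvmmsg() wrapper and struct mmsghdr under _GNU_SOURCE; the recv_batch_demo() helper, the BATCH and BUFSZ constants, and the choice of a local socketpair() as the datagram source are all illustrative and not taken from the patch itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumed to be needed for recvmmsg() and struct mmsghdr */
#endif
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH 3  /* hypothetical batch size */
#define BUFSZ 64 /* hypothetical per-datagram buffer size */

/*
 * Queue BATCH small datagrams on one end of a local socket pair, then
 * collect them from the other end with a single recvmmsg() call.
 * Returns the number of datagrams received, or -1 on error.
 */
static int recv_batch_demo(void)
{
    int sv[2];
    char bufs[BATCH][BUFSZ];
    struct iovec iov[BATCH];
    struct mmsghdr msgs[BATCH];
    int got = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    for (int i = 0; i < BATCH; i++)
        if (send(sv[0], "ping", 4, 0) != 4)
            goto out;

    /* One mmsghdr per expected datagram, each with its own buffer. */
    memset(msgs, 0, sizeof(msgs));
    for (int i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = BUFSZ;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* A single system call fills as many entries as it can, up to BATCH. */
    got = recvmmsg(sv[1], msgs, BATCH, 0, NULL);
    /* On success, msgs[i].msg_len holds the length of datagram i. */

out:
    close(sv[0]);
    close(sv[1]);
    return got;
}
```

The point of the interface is that one trip into the kernel can return several messages, where recvmsg() would need one call per datagram.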
Changes visible to kernel developers include:
- There is a new unreachable() macro which can be used to
mark code which will never be executed. Its main application is in
macros like BUG().
- New security module hooks, intended for pathname-based modules, have
been added to check chmod(), chown(), and
chroot().
- There is a new RCU variant, called "tiny RCU," which is meant for
non-SMP situations where memory footprint must be minimized.
- printk_ratelimit() can, once again, be used in atomic
context. (Note, though, that there are developers who would like to
eliminate this function in favor of some sort of more local rate
limiting.)
- The creation of nearly-identical tracepoints has been made significantly easier. TRACE_EVENT_TEMPLATE() has a syntax identical to TRACE_EVENT(), but it creates a template which can be used by the simpler DEFINE_EVENT() macro to create specific tracepoints. The code gets simpler, and, as a side benefit, the kernel gets smaller.
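As a rough illustration of the unreachable() idea from the list above, the following user-space sketch marks the end of a BUG()-like macro as never reached. It assumes a compiler that provides __builtin_unreachable() (GCC 4.5 or later, or clang); my_unreachable(), MY_BUG(), fatal_trap(), and color_name() are hypothetical names invented for this example and are not the kernel's definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel macro; assumes a GCC-style builtin. */
#define my_unreachable() __builtin_unreachable()

/* Reports the failure and terminates; deliberately not declared noreturn. */
static void fatal_trap(const char *file, int line)
{
    fprintf(stderr, "BUG at %s:%d\n", file, line);
    abort();
}

/*
 * BUG()-like macro. The compiler is not told that fatal_trap() never
 * returns, so the marker after the call states that control cannot
 * continue past this point.
 */
#define MY_BUG() do {                   \
    fatal_trap(__FILE__, __LINE__);     \
    my_unreachable();                   \
} while (0)

enum color { RED, GREEN, BLUE };

/* Every valid enum value returns from inside the switch. */
static const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    MY_BUG(); /* no dummy return statement needed after this */
}
```

Without the marker, a compiler may warn that control reaches the end of a non-void function, or emit code for a path that can never run; with it, neither a dummy return nor an endless loop is needed after the trap.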
The merge window should stay open for at least another week; expect quite a
bit more code to be incorporated for 2.6.33 before the window closes.
2.6.33 merge window part 1
Posted Dec 10, 2009 8:00 UTC (Thu) by ebiederm (subscriber, #35028) [Link]
2.6.33 merge window part 1
Posted Dec 10, 2009 9:37 UTC (Thu) by wahern (subscriber, #37304) [Link]
For example, my portable arc4random--which uses sysctl(CTL_KERN, KERN_RANDOM, RANDOM_UUID)--will break. Requiring people to seed before the chroot happens, or requiring users to create device files in the chroot tree doesn't help; those things aren't required on other platforms.
One plus is that there'd be less kernel exposure in a chroot without either /proc or sysctl. And certainly in general removing code is good, though /proc has historically been riddled with kernel exploits; far more than sysctl ever produced. Indeed, the mere existence of /proc outside the chroot has its own problems, like exposing file descriptors--pipes, socketpairs--that would otherwise be unaddressable by other processes. Thus one of the strongest security characteristics--using descriptors as ad hoc "capability" tokens--is totally broken. File permissions aren't nearly as strong a security mechanism as the inability to reference the object.
2.6.33 merge window part 1
Posted Dec 10, 2009 9:55 UTC (Thu) by johill (subscriber, #25196) [Link]
"He then adds back a new wrapper which emulates the sysctl() ABI by way of /proc/sys. So any applications using sysctl() should continue to work, but the code dedicated to making it work is much reduced from what was there before."
I don't think that wrapper actually requires it to be mounted.
2.6.33 merge window part 1
Posted Dec 10, 2009 16:20 UTC (Thu) by ebiederm (subscriber, #35028) [Link]
sysctl(2) support to work.
2.6.33 merge window part 1
Posted Dec 10, 2009 16:37 UTC (Thu) by ebiederm (subscriber, #35028) [Link]
A few comments.
arc4random prefers to use /dev/urandom and tries that first, so even
inside a nicely set up chroot it will work.
sysctl was absolutely riddled with exploitable code when I started working on it, and a hole was closed just a few weeks ago. It just happens that no one, not even those who exploit kernel issues for the fame, looked at the implementation details of sysctl.
I will agree that the sysctl format of only exporting simple integer and string values is much harder to exploit, and as such is a good idea.
As for the file descriptors, they are not exposed to other users. The permissions on /proc/<pid>/fd/ are limited. Except for one esoteric corner case, you can't do anything more with the file descriptors in /proc than you could by attaching a debugger. Using file descriptors as ad hoc "capability" tokens is not broken in any way that I am aware of.
2.6.33 merge window part 1
Posted Dec 10, 2009 19:45 UTC (Thu) by spender (guest, #23067) [Link]
What's the CVE for the vulnerability that was fixed?
-Brad
2.6.33 merge window part 1
Posted Dec 10, 2009 20:29 UTC (Thu) by wahern (subscriber, #37304) [Link]
Though, I'll admit then that Linux wouldn't be the first to break this behavior (if indeed it did, which it hasn't yet). I'll have to fix my apps to stir before any chroot.
As for /proc/$$/fd: take Apache as an example. Site A can access descriptors--specifically anonymous pipes--of site B. That the process for site A could theoretically attach itself to site B is beside the point. Typically both processes are running virtual machines and/or interpreters where debugging interfaces aren't available. Regular file routines, however, are usually available. Breaking out of a VM is significantly more difficult than coaxing a script to eval code.
Requiring a different process user for every site is impracticable, unless perhaps the kernel could provide ephemeral UIDs. Anyhow, you can drop ptrace capabilities, yet because of the growing necessity of /proc it's increasingly just as impracticable to not have /proc mounted.
With the rise of "cloud computing" (née SaaS, née time-sharing systems), the notion that privileges are necessarily tied to persistent objects or system-wide credentials is short-sighted. The operating system should provide certain primitives and behaviors that allow applications to create ad hoc privilege systems enforceable by the hardware, e.g. the MMU. Solutions like SELinux, or any other system-wide _explicit_ access control, miss the mark entirely in almost every way imaginable.
2.6.33 merge window part 1
Posted Dec 19, 2009 10:48 UTC (Sat) by jengelh (subscriber, #33263) [Link]
And for where it matters, glibc could emulate sysctl() by going to /proc/sys instead.
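For illustration only, here is a minimal sketch of what the /proc/sys route might look like for the random-UUID case discussed earlier in the thread. It assumes /proc is mounted and that /proc/sys/kernel/random/uuid is the file corresponding to sysctl(CTL_KERN, KERN_RANDOM, RANDOM_UUID); read_random_uuid() is a hypothetical helper name, not a glibc function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one freshly generated UUID string through /proc/sys instead of
 * the binary sysctl() interface. Returns 0 on success and -1 on failure,
 * for example when /proc is not mounted inside a chroot.
 */
static int read_random_uuid(char *buf, size_t len)
{
    /* Assumed path; requires procfs to be visible to the caller. */
    FILE *f = fopen("/proc/sys/kernel/random/uuid", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0'; /* strip the trailing newline */
    return 0;
}
```

The failure path is the case raised above: a process in a chroot without /proc gets -1 back and has to fall back on something else, such as seeding before the chroot.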