Fun with NULL pointers, part 1

Please consider subscribing to LWN

Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

By Jonathan Corbet
July 20, 2009

By now, most readers will be familiar with the local kernel exploit recently posted by Brad Spengler. This vulnerability, which affects the 2.6.30 kernel (and a test version of the RHEL5 "2.6.18" kernel), is interesting in a number of ways. This article will look in detail at how the exploit works and the surprising chain of failures which made it possible.

The TUN/TAP driver provides a virtual network device which performs packet tunneling; it's useful in a number of situations, including virtualization, virtual private networks, and more. In normal usage of the TUN driver, a program will open /dev/net/tun, then make an ioctl() call to set up the network endpoints. Herbert Xu recently noticed a problem where a lack of packet accounting could let a hostile application pin down large amounts of kernel memory and generally degrade system performance. His solution was a patch which adds a "pseudo-socket" to the device which can be used by the kernel's accounting mechanisms. Problem solved, but, as it turns out, at the cost of adding a more severe problem.

The TUN device supports the poll() system call. The beginning of the function implementing this functionality (in 2.6.30) looks like this:

    static unsigned int tun_chr_poll(struct file *file, poll_table * wait)
    {
	struct tun_file *tfile = file->private_data;
	struct tun_struct *tun = __tun_get(tfile);
	struct sock *sk = tun->sk;
	unsigned int mask = 0;

	if (!tun)
	    return POLLERR;

The line of code which has been underlined above was added by Herbert's patch; that is where things begin to go wrong. Well-written kernel code takes care to avoid dereferencing pointers which might be NULL; in fact, this code checks the tun pointer for just that condition. And that's a good thing; it turns out that, if the configuring ioctl() call has been made, tun will indeed be NULL. If all goes according to plan, tun_chr_poll() will return an error status in this case.

But Herbert's patch added a line which dereferences the pointer prior to the check. That, of course, is a bug. In the normal course of operations, the implications of this bug would be somewhat limited: it should cause a kernel oops if tun is NULL. That oops will kill the process which made the bad system call in the first place and put a scary traceback into the system log, but not much more than that should happen. It should be, at worst, a denial of service problem.

There is one little problem with that reasoning, though: NULL (zero) can actually be a valid pointer address. By default, the very bottom of the virtual address space (the "zero page," along with a few pages above it) is set to disallow all access as a way of catching null-pointer bugs (like the one described above) in both user and kernel space. But it is possible, using the mmap() system call, to put real memory at the bottom of the virtual address space. There are some valid use cases for this functionality, including running legacy binaries. Even so, most contemporary systems disable page-zero mappings through the use of the mmap_min_addr sysctl knob.

[PULL QUOTE: Security module checks are supposed to be additive to the checks which are already made by the kernel, but it didn't work that way this time. END QUOTE] This knob should prevent a user-space program from mapping the zero page, and, thus, should ensure that null pointer dereferences cause a kernel oops. But, for unknown reasons, the mmap() code in the 2.6.30 kernel explicitly declines to enforce mmap_min_addr if the security module mechanism has been configured into the kernel. That job, instead, is left to the specific security module being used. Security module checks are supposed to be additive to the checks which are already made by the kernel, but it didn't work that way this time; with regard to page zero, security modules can grant access which would otherwise be denied. To complete the failure, Red Hat's default SELinux policy allows mapping the zero page. So, in this case, running SELinux actually decreased the security of the system.

Not that life is a whole lot better without SELinux. In the absence of SELinux, the exploit will run up against the mmap_min_addr limit, which would seem like enough to bring things to a halt. That particular difficulty can be circumvented, though, through the use of the personality() system call. Enabling the SVR4 personality causes a read-only page to be mapped at address zero when a program is invoked with exec(), but only if the process in question has the CAP_SYS_RAWIO capability. So one more trick is required: the top-level exploit code will set the SVR4 personality, then use exec() to run the pulseaudio server with a special plugin module. Pulseaudio is installed setuid root, so it will get the zero page mapped at invocation time. By the time the plugin code is called, pulseaudio has dropped its privileges, but, by then, the zero page will be available to the exploit code, which can make the page writeable and place its own data there.

As a result of all this, it is possible for a user-space process to map the zero page and prevent tun_chr_poll() from causing a kernel oops. But, one would think, that would not get an attacker very far, since that function checks tun against NULL as the very next thing it does. This is where the next interesting step in the chain of failures happens: the GCC compiler will, by default, optimize the NULL test out. The reasoning is that, since the pointer has already been dereferenced (and has not been changed), it cannot be NULL. So there is no point in checking it. Once again, this logic makes sense most of the time, but not in situations where NULL might actually be a valid pointer.

So, an attacker is able to get into the body of tun_chr_poll() with a NULL tun pointer. One then needs to figure out how to get control of the kernel using this situation. The next step takes advantage of this code from a little further into tun_chr_poll():

	if (sock_writeable(sk) ||
	    (!test_and_set_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
	     sock_writeable(sk)))
		mask |= POLLOUT | POLLWRNORM;

The value of sk, remember, came from the dereferencing of tun, so it's under the attacker's control. SOCK_ASYNC_NOSPACE is zero, so the test_and_set_bit() call can be used to unconditionally set the least significant bit of any word in memory. As kernel memory corruptions go, this is a small one, but it turns out to be enough. In Brad's exploit, sk->sk_socket->flags points into the TUN driver's file_operations structure; in particular, it points to the mmap() function. The TUN driver does not support mmap(), so that pointer is normally NULL; after the poll() call, that pointer is now one instead.

The final step in the exploit is to call mmap() on a file descriptor for the open TUN device. Since the internal mmap() operation is no longer NULL (it has been set to one), the kernel will jump to it. That address also lives within the zero page mapped by the exploit, so it is under the attacker's control. The exploit will have populated that address with another jump to its own code. So, when the kernel calls (what it thinks is) the TUN driver's mmap() function, the result is arbitrary code being run in kernel mode; at that point the exploit has total control.

In well-designed systems, catastrophic failures are rarely the result of a single failure. That is certainly the case here. Several things went wrong to make this exploit possible: security modules were able to grant access to low memory mappings contrary to system policy, the SELinux policy allowed those mappings, pulseaudio can be exploited to make a specific privileged operation available to exploit code, a NULL pointer was dereferenced before being checked, the check was optimized out by the compiler, and the code used the NULL pointer in a way which allowed the attacker to take over the system. It is a long chain of failures, each of which was necessary to make this exploit possible.

This particular vulnerability has been closed, but there will almost certainly be others like it. See the second article in this series for a look at how the kernel developers are responding to this exploit.

Index entries for this article
Kernel	Security/Vulnerabilities
Security	Linux kernel
Security	Vulnerabilities/Privilege escalation

(Log in to post comments)

Fun with NULL pointers, part 1

Posted Jul 20, 2009 20:14 UTC (Mon) by clugstj (subscriber, #4020) [Link]

It's even a little more convoluted than mentioned in the next-to-the-last paragraph. Since a pointer was set to 1, it also requires that the processor fetch opcodes at byte boundaries. No RISC processor would do that (AFAIK).

Fun with NULL pointers, part 1

Posted Jul 20, 2009 20:18 UTC (Mon) by jengelh (subscriber, #33263) [Link]

Yeah but the fact that x86en supporting unaligned access dominate makes up for it :-/

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:38 UTC (Mon) by spender (guest, #23067) [Link]

I could have written to the 2nd byte instead (choose your endianness) and the resulting address would be aligned.
The kernel actually being able to use that address directly would depend on the architecture.

I only chose the first byte because I already had my mapping at NULL, so it was easy to reuse it. The exploit primitive there though allows an arbitrary OR of 0x1 to any byte in memory.

-Brad

Fun with NULL pointers, part 1

Posted Jul 20, 2009 22:27 UTC (Mon) by spender (guest, #23067) [Link]

Clarification/fix: Since the OR is performed on an unsigned long instead of a single byte, then the address of the target may be subject to whatever alignment on architectures that care.

-Brad

RISC can do that

Posted Jul 22, 2009 18:13 UTC (Wed) by klossner (subscriber, #30046) [Link]

PowerPC silently drops the two low bits when loading an address into the PC, so a branch to 1 becomes a branch to 0. The misaligned-address exception occurs only for load/store instructions.

Fun with NULL pointers, part 1

Posted Jul 20, 2009 20:14 UTC (Mon) by jengelh (subscriber, #33263) [Link]

Once upon a dereference

>pulseaudio is installed setuid root

*cough*

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:39 UTC (Mon) by proski (subscriber, #104) [Link]

We should ensure that setuid programs exit immediately if they are using non-default personalities.

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:53 UTC (Mon) by drag (guest, #31333) [Link]

It would be a lot better just to do away with setuid binaries altogether.

That's what things like policykit and dbus are for...

Fun with NULL pointers, part 1

Posted Jul 20, 2009 23:33 UTC (Mon) by nix (subscriber, #2304) [Link]

Oh yeah great. So instead of locating setuid binaries and knowing that
*that* is my vulnerability surface, I have to parse a huge pile of XML and
hope that there are no bugs in policykit and dbus that might cause
unintended things to be run (and we know there have been *none* like that
before). The setuid implementation in the kernel is tiny and trivially
auditable by comparison, sharing virtually all its code with the
tested-to-death-and-hopefully-audited ELF execve() implementation.

PolicyKit has been done right once before. It was called 'userv'.
PolicyKit itself is a huge step backwards if you actually want security.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 7:21 UTC (Tue) by gmaxwell (guest, #30048) [Link]

Yea policy kit. Great stuff.

By default fedora allows the desktop users to change the system time. All they must do is ender the *user's* password (not root!) and even that they only have to do it once.

Great stuff great stuff.

Although many people have pointed out the terrible security implications nothing has been done. Sometimes it really does take some high profile compromises to get things fixed.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 9:45 UTC (Tue) by cortana (subscriber, #24596) [Link]

But that's not the fault of PolicyKit itself. Rather it's the fault of the distributor who shipped it with a policy that a) allows an unprivileged user to change the system time and b) does not force them to re-authenticate whenever they wish to do so.

Concerns about the increased vulnerability surface caused by the complexity of PolicyKit are still justified, but Fedora's default policy being stupid is not relevant to that discussion. If we wanted to blame the system for allowing the user to do stupid things then we may as well all give up and move back to Windows. :)

Fun with NULL pointers, part 1

Posted Jul 21, 2009 9:56 UTC (Tue) by nix (subscriber, #2304) [Link]

The concern isn't just that the vulnerability surface has increased: it's that we can't even easily tell what it is anymore.

Determining the set of privileged code that could carry out operations on behalf of unprivileged users was fairly simple in the days before PolicyKit: find setuid/setgid binaries, chase their shared library dependencies and (if you're paranoid) see what they can dlopen(). Just a grep away, in any case.

Now, we have to analyze the dbus and PolicyKit policies as well, and XML is... not terribly amenable to analysis with Unix-style shell tools. (Some Perl packages come with XML-style XPath-based grep tools, but they are a) rarely installed and b) seriously cumbersome. We really need an awk for XML.)

Fun with NULL pointers, part 1

Posted Jul 21, 2009 12:52 UTC (Tue) by nim-nim (subscriber, #34454) [Link]

> We really need an awk for XML.

Just use xsltproc directly (though not having to use a detached xslt file would be nice)

Fun with NULL pointers, part 1

Posted Jul 22, 2009 22:00 UTC (Wed) by nix (subscriber, #2304) [Link]

Ew, no. Utterly un-awklike and doing awk-like transformations with XSLT is
really quite painful. (And yes, you can do awklike languages for things
other than text streams: see gvpr(1) for example.)

(One of many problems is XSLT's heavy use of <>, which makes it very
annoying to use from the shell prompt. Another is its astonishing
verbosity. Another is its total lack of good taste in design... also the
functional nature of it, while one of its nicer aspects, fits very badly
with the shell in my experience.)

Fun with NULL pointers, part 1

Posted Jul 21, 2009 14:01 UTC (Tue) by gmaxwell (guest, #30048) [Link]

Poliykit plays a role: If you go look at the discussions on the fedora list you'll see that there was some degree of argument what the actual behaviour was Was it asking for a password at all (some people thought it wasn't because it only did so once) and was it asking for the root password? A lot of people had used and never realized that it was asking for their user password rather than root.

SUID is more unambiguous.

xml awk

Posted Jul 22, 2009 7:08 UTC (Wed) by Frej (guest, #4165) [Link]

xml cli tool
http://xmlstar.sourceforge.net/

xml awk

Posted Jul 22, 2009 11:18 UTC (Wed) by nix (subscriber, #2304) [Link]

That looks nice. Not very awkish though...

Fun with NULL pointers, part 1

Posted Jul 21, 2009 10:18 UTC (Tue) by cortana (subscriber, #24596) [Link]

You don't have to start at the XML. You can ask the PolicyKit daemon itself what you are currently allowed to do by running polkit-auth; you can get a list of everything you could possibly do with additional authentication by running polkit-auth --show-obtainable; and you can see all the actions that it is possible to perform on a system, and find out which users may perform that action, by running polkit-gnome-authorization. Of course, this requires you to trust that the PolicyKit daemon has not already been compromised. ;)

I also think the docs for PolicyKit are pretty good; the library reference manual contains a Design Overview at <http://hal.freedesktop.org/docs/PolicyKit/ref-design.html> that explains how all the parts of the system fit together.

Of course the system is much more complex than the setuid mechanism in the kernel, but--assuming it is audited well--it has the potential to increase system security a lot, because it would allow you to eliminate other setuid programs on the system.

There is certainly no shortage of badly written programs that are installed setuid, but fail to give away all additional privileges but the exact ones it wishes to keep and deal with any of the infinite combinations of things an attacker may do to them before exec is called, such as messing around with the PATH or another similarly dangerous environment variable; creating symlinks that will cause a badly written 'create temporary file' routine to overwrite vital system files; mapping memory to address 0 to bypass security checks in the kernel (http://lwn.net/Articles/342330/ anyone?) :)

And of course, setuid does not only consist of one system call, but several (setuid, seteuid, setreuid and setresuid), all with subtly different semantics--sometimes modifying the saved user id, sometimes the real user id, yet others the effective uid... oh, and don't forget setfsuid and the capability mechanism. And the fact that different versions of different operating systems all implement subtly different semantics for these system calls... check out Setuid Demystified <http://www.eecs.berkeley.edu/~daw/papers/setuid-usenix02.pdf> and the Secure Programming for Linux and Unix HOWTO <http://www.dwheeler.com/secure-programs/Secure-Programs-H...> for more details and other attack mechanisms.

So yeah, PolicyKit is complicated--but is it more complicated than the intersection of the setuid mechanism and all the other stuff an attacker can do to a setuid process before it is executed? :)

Fun with NULL pointers, part 1

Posted Jul 22, 2009 11:19 UTC (Wed) by nix (subscriber, #2304) [Link]

So the only way you can see what actions can be run on a system with privilege is... to run a GNOME-specific GUI application?

*sigh*

Fun with NULL pointers, part 1

Posted Jul 22, 2009 12:49 UTC (Wed) by cortana (subscriber, #24596) [Link]

It's not GNOME-specific. It just uses GTK+. Yes, it's unfortunate that there is no command-line equivalent, but it shouldn't be too difficult for one to be written--you are just querying the PolicyKit object after all.

Fun with NULL pointers, part 1

Posted Jul 22, 2009 21:59 UTC (Wed) by nix (subscriber, #2304) [Link]

I think one is essential if PolicyKit can be considered safe to use: we
have to be able to do automated audits and yell at unexpected quiet
changes, lest we miss attacks or accidental fumbles opening up holes.

You're quite right on the fugliness of the UID-setting syscalls in Unix,
but this is not helped by introducing *more* :/ again, userv did all this
better (IMNSHO) many years ago. Why more people don't use it I have no
idea.

Fun with NULL pointers, part 1

Posted Aug 7, 2009 12:21 UTC (Fri) by jschrod (subscriber, #1646) [Link]

I'll have to say that I don't find PolicyKit's documentation very good. I don't know how it is for a developer, but for me as a long-time Unix admin it was frustrating when I had to delve into it a few weeks ago for the first time.

The document linked by you was one of the 1st that I read, btw. It's not that I didn't understand what PolicyKit was supposed to be doing, that was quite clear from the start. And not even that I couldn't read the config files -- while being rather chatty, they were understandable.

Where I didn't succeed, is finding concise information about the overall architecture: how PolicyKit / ConsoleKit / HAL / D-Bus / pam / login processes, both X and console, are *supposed* to work together. That would have delivered pointers to information sources that could have answered questions like Where are the names in the PolicyKit XML files from? Which possible names exist on a given system? Which daemons are started and by whom? What are common PolicyKit authorizations, since it is all about policy and nothing about content? The output of polkit-auth --show-obtainable is not sufficient, IMHO. E.g., I'd need to know what org.freedesktop.hal.lock is, and googling it on site:freedesktop.org does _not_ provide an adequate answer, at least not for me. (And yes, I've read http://people.freedesktop.org/~david/hal-spec/hal-spec.html.)

It was not easy to find the answers to such questions in reasonable time frames because the overall structure had to be laborously reverse engineered by yours humbly. Well, maybe my Google foo may simply be not good enough.

Just my 0.02 EUR. :-)

Joachim

Fun with NULL pointers, part 1

Posted Jul 22, 2009 2:14 UTC (Wed) by Baylink (guest, #755) [Link]

"vulnerability surface".

Wow. That's an altogether neat concept. Thanks.

Fun with NULL pointers, part 1

Posted Jul 25, 2009 2:13 UTC (Sat) by mikov (guest, #33179) [Link]

I am with you.
In my mind the key to reasonable security is simplicity. SUID is extremely simple. Compare changing a user's password using passwd to what has to happen in Windows.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 8:37 UTC (Tue) by mmahut (guest, #45550) [Link]

It will eventually go away if we start using file system compatibilities. As far as I know, they've been already implemented in Fedora 10, but RPM will support it just from Fedora 12.

http://udrepper.livejournal.com/20709.html

Fun with NULL pointers, part 1

Posted Jul 21, 2009 0:26 UTC (Tue) by kpfleming (subscriber, #23250) [Link]

What? pulseaudio is setuid-root *and* allows non-privileged users to load modules into its process space? What kind of insanity is that?

Fun with NULL pointers, part 1

Posted Jul 21, 2009 2:13 UTC (Tue) by smurf (subscriber, #17840) [Link]

A setuid program loading unprivileged modules usually isn't a problem if that program drops its privs before doing so.

Usually.

This whole mess goes to show that security is a hard-to-solve problem.

Taking privs off pulseaudio (Linux has been capable of doing better than all-out-setuid for _how_ long, exactly ?!?) may or may not be the solution in this case, I haven't checked.

Fun with NULL pointers, part 1

Posted Jul 22, 2009 14:24 UTC (Wed) by mezcalero (subscriber, #45103) [Link]

pa 0.9.16 is not setuid anymore, because the introduction of rtkit makes that unnecessary.

Fun with NULL pointers, part 1

Posted Aug 15, 2009 9:01 UTC (Sat) by gvy (guest, #11981) [Link]

Yeah, previously one had to introduce a rootkit. Now it's half-done by prominent distributors. :(

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:07 UTC (Mon) by nhippi (guest, #34640) [Link]

1) If gcc can detect the unnecessary NULL check, cant it warn about it? or coverity give a DEAD_CODE notification?

2) Does anyone use personality(SVR4) these days? The older personalities could be moved behind a switch (/proc/sys/vm/legacy_personality = 0), where the default is off. While it is only small part of this exploit, this would again make future exploits slighly harder (until someone notices another piece of ignored legacy code).

Fun with NULL pointers, part 1

Posted Jul 20, 2009 23:39 UTC (Mon) by ncm (guest, #165) [Link]

(1) Coverity has checks for this. The error is so common that it makes up a large fraction of warnings in typical Coverity reports.

(2) Gcc certainly should have a warning for this. However, putting such a warning in "-W" would make people upset (see (1)). That's not meant to be an argument against. It might be easier to get it into -Wall, but it is generally very hard to move a warning from -W to -Wall or the reverse, so if it gets into -Wall it is probably stuck there forever. The warning would likely be issued only when compiling with strong optimization; without optimization the compiler would be unlikely to perform the analysis that would lead to it noticing the problem.

(3) Depending on the offset within the struct type pointed to, the pointer value dereferenced might not refer to zero page, and thus might not trigger a SEGV signal in normal user-level code, even without memory-mapping games.

Fun with NULL pointers, part 1

Posted Jul 22, 2009 2:18 UTC (Wed) by Baylink (guest, #755) [Link]

Am I the only person who read "has checks for this" and thought "a wart is a crocky feature"?

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:11 UTC (Mon) by taviso (subscriber, #34037) [Link]

Note that the mmap_min_addr bypass using personality() was discovered by me and Julien Tinnes, we described it here. We also sent a patch that corrected it to lklm here http://patchwork.kernel.org/patch/32598/

Fun with NULL pointers, part 1

Posted Jul 23, 2009 5:01 UTC (Thu) by johnflux (guest, #58833) [Link]

You, sir, are a coding god.

Fun with NULL pointers, part 1

Posted Jul 24, 2009 16:06 UTC (Fri) by rogue@autistici.org (guest, #59770) [Link]

And, may you be glorified for your find as much as spender is for his exploit.

Fun with NULL pointers, part 1

Posted Jul 20, 2009 21:15 UTC (Mon) by spender (guest, #23067) [Link]

My last name is Spengler (no idea why people assume my alias is my last name). The analysis of the exploit appears correct however.

The choice of the tun file_operations struct was arbitrary: a different one could have been chosen if the attacker wanted the exploit to work against a custom kernel with CONFIG_DEBUG_RODATA enabled. As I've found, since 2007 when most of those structs were made const, people haven't kept up with the standard, so there are a ton of other reliable function pointers to choose from.

The nature of the NULL tun pointer being confined to the tun_chr_poll() function (instead of getting leaked out through some means to other complex functions in the kernel) is what makes the vulnerability 100% reliably exploitable.

-Brad

Fixed

Posted Jul 20, 2009 21:18 UTC (Mon) by corbet (editor, #1) [Link]

My last name is Spengler (no idea why people assume my alias is my last name).

Because they look somewhat the same and we make silly mistakes? I'm sorry about this one; it's been fixed.

Fixed

Posted Jul 20, 2009 21:34 UTC (Mon) by spender (guest, #23067) [Link]

No problem, it seems to be a common mistake ;)

-Brad

mmap_min_addr and security modules

Posted Jul 20, 2009 21:47 UTC (Mon) by fjpop (guest, #30115) [Link]

> But, for unknown reasons, the mmap() code in the 2.6.30 kernel
> explicitly declines to enforce mmap_min_addr if the security module
> mechanism has been configured into the kernel. That job, instead, is
> left to the specific security module being used.

There are plenty of systems where CONFIG_SECURITY_SELINUX is set, but
where selinux is either not installed or not activated.

So if the quoted text is correct, then all those systems would be missing
an apparently useful (basic?) security check. Or is the text imprecise
and does the kernel check if a security module is active before ignoring
mmap_min_addr?

mmap_min_addr and security modules

Posted Jul 20, 2009 21:53 UTC (Mon) by corbet (editor, #1) [Link]

The code which performs the check was:

static inline unsigned long round_hint_to_min(unsigned long hint)
{
#ifdef CONFIG_SECURITY
	hint &= PAGE_MASK;
	if (((void *)hint != NULL) &&
	    (hint < mmap_min_addr))
		return PAGE_ALIGN(mmap_min_addr);
#endif
	return hint;
}

So it was taken out at compile time; the presence of an actual security module is not really relevant.

mmap_min_addr and security modules

Posted Jul 20, 2009 22:15 UTC (Mon) by spender (guest, #23067) [Link]

That's not the right check. security_file_mmap (which is either set by the capabilities module or overriden by the SELinux module) is what implements the final check. The one you pasted doesn't even apply for MAP_FIXED but is just to ensure that the allocator doesn't choose an address below mmap_min_addr when only a hint is specified.

If SELinux is compiled into the kernel, it needs to be disabled at boot via the kernel command-line, otherwise it registers its hooks with LSM and overrides that of the capabilities module for security_file_mmap which performs the mmap_min_addr check.

-Brad

Fun with NULL pointers, part 1

Posted Jul 20, 2009 22:25 UTC (Mon) by mjw (subscriber, #16740) [Link]

Nice overview. What trick was used to actually open the /dev/net/tun device?
On my CentOS systems it is:
crw------- 1 root root 10, 200 Jul 16 12:18 /dev/net/tun
so doesn't seem to be accessible by non-root normally.

Fun with NULL pointers, part 1

Posted Jul 20, 2009 23:13 UTC (Mon) by dwmw2 (subscriber, #2063) [Link]

On up to date systems it should have mode 0666. Only users with CAP_NET_ADMIN can create new tun devices, but then they can be made persistent and given to specific users/groups — who need to be able to open /dev/net/tun in order to attach to them.

Fun with NULL pointers, part 1

Posted Jul 20, 2009 23:18 UTC (Mon) by eparis (guest, #33060) [Link]

apparently some udev script which makes it world rw, I'm told (but haven't verified) that some VPN program changes it so they can run as normal users...

Fun with NULL pointers, part 1

Posted Jul 21, 2009 16:10 UTC (Tue) by dwmw2 (subscriber, #2063) [Link]

"apparently some udev script which makes it world rw, I'm told (but haven't verified) that some VPN program changes it so they can run as normal users..."

Yes, the udev script creates it with mode 0666 because that's the recommended configuration.

It's been possible to make tun devices that can be used by non-root since February 2002.

However, it was only in June 2006 that we made it reasonable to have 0666 permissions on /dev/net/tun, by adding the CAP_NET_ADMIN checks before creating new devices.

The OpenConnect VPN client, when used in conjunction with its NetworkManager plugin, will use this facility to run as its own unprivileged user. After the stupid tmpfile races we saw in Cisco's own client which runs as root, it seemed like an appropriate design choice for limiting security exposure (even though I couldn't possibly be as incompetent as the Cisco engineers).

Fun with NULL pointers, part 1

Posted Jul 20, 2009 23:53 UTC (Mon) by job (guest, #670) [Link]

The fact that those changes make it into the kernel in the first place is slightly worrying. Should an unchecked pointer deference not be caught with a static code checker? What about Sparse, isn't that run routinely? Perhaps automatic comments could be added to code already at the mailing list publication stage.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 0:00 UTC (Tue) by spender (guest, #23067) [Link]

A static checker is useless unless someone actually bothers to fix the things it reports.

It's already been agreed upon that static checkers can find these bugs, yet clearly they're being produced faster than they're fixed.

-Brad

Fun with NULL pointers, part 1

Posted Jul 21, 2009 0:04 UTC (Tue) by proski (subscriber, #104) [Link]

Sparse doesn't check for it yet. Perhaps it should. But please realize that it's not just an "unchecked code dereference", it's a dereference before check.

A zero pointer is not a null pointer

Posted Jul 21, 2009 10:57 UTC (Tue) by epa (subscriber, #39769) [Link]

Since the zero page can be mapped and therefore 0x0 is a valid address, the zero bit pattern should not be used for the null pointer. The null pointer is supposed to be an invalid pointer that never points to any addressable memory. From the C99 standard:

If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

The C runtime should use some other magic value to represent the null pointer - and the kernel should promise never to map any addressable memory there.

A zero pointer is not a null pointer

Posted Jul 21, 2009 11:07 UTC (Tue) by tialaramex (subscriber, #21167) [Link]

Now why would you want people to spill coffee all over their keyboards?

A zero pointer is not a null pointer

Posted Jul 22, 2009 9:02 UTC (Wed) by epa (subscriber, #39769) [Link]

It would be possible to keep 0x0 for the null pointer while respecting the C99 standard: the compiler would need to put in a couple of extra instructions for every pointer comparison making sure that 0x0 != P for any value of P. Optionally, it could also put in a check before every pointer dereference making sure the pointer is not 0x0 and causing a SIGSEGV if it is (since the memory layout can no longer be relied on to guarantee that).

A zero pointer is not a null pointer

Posted Jul 24, 2009 20:38 UTC (Fri) by giraffedata (guest, #1954) [Link]

It would be possible to keep 0x0 for the null pointer while respecting the C99 standard: the compiler would need to put in a couple of extra instructions for every pointer comparison making sure that 0x0 != P for any value of P.

Do we know Gcc doesn't do this? Seems like it would have to, to be C99 compliant.

Optionally, it could also put in a check before every pointer dereference making sure the pointer is not 0x0 and causing a SIGSEGV if it is (since the memory layout can no longer be relied on to guarantee that).

But then how would you represent a pointer to a data structure that resides at address 0? A pointer should be able to do that.

It would of course be unrealistically expensive on typical machines to represent a pointer with anything but a simple address, but pointer comparisons are rare enough that a few extra instructions for them seems worthwhile to maintain the null pointer concept.

Regardless of how the compiler chooses to represent pointers (null or otherwise), the optimization in question is logically sound. C99 says a dereference of a null pointer causes undefined behavior, so either a) tun is non-null and !tun must be false or b) tun is null and !tun can be anything, including false.

A zero pointer is not a null pointer

Posted Jul 24, 2009 22:06 UTC (Fri) by nix (subscriber, #2304) [Link]

GCC certainly doesn't insert code checking if a pointer is NULL before
every pointer dereference. Considering how common pointer dereferencing is
in C and related languages, this would be a substantial slowdown for no
real gain, given that multiuser OSes invariably trap such things, and
non-multiuser OSes are specialist environments in which attacks by hostile
local users are not so common (yet).

That kernel space has to work when the lower part of its address space is
effectively under the control of a hostile attacker is a unique problem
which it is really not worth changing the C standard for, nor imposing
vast overheads on all userspace code. -fno-delete-null-pointer-checks does
the job.

A zero pointer is not a null pointer

Posted Jul 25, 2009 2:51 UTC (Sat) by giraffedata (guest, #1954) [Link]

GCC certainly doesn't insert code checking if a pointer is NULL before every pointer dereference.

Sure, but that wouldn't improve standards compliance anyway. I asked if GCC generates extra code to comply with the C99 requirement that a null pointer not be equal to any non-null one (while still allowing the existence of pointers to a data structure that resides at address 0). Thinking about it now, though, I don't see how any such code is possible since a null pointer still has to compare equal to another null pointer.

That kernel space has to work when the lower part of its address space is effectively under the control of a hostile attacker is a unique problem which it is really not worth changing the C standard for,

There has been no proposal to deal with this by changing the standard, which GCC apparently ignores anyhow. And objection to GCC's conflation of null pointers and zero-address pointers wasn't that it's a security problem but that it's a basic correctness problem. Even without a hostile page 0, unless you proclaim data structures at address 0 don't exist, this optimization breaks code.

A zero pointer is not a null pointer

Posted Jul 25, 2009 12:49 UTC (Sat) by nix (subscriber, #2304) [Link]

GCC doesn't ignore the standard: you simply can't have data structures
that reside at address zero. Shaving off 1/2^32 or less of the address
space, and disabling an optimization in the one place that cares about
this (the kernel) does not seem like a terrible cost to me.

Data structures at address zero do not exist on any sane C platform.

A zero pointer is not a null pointer

Posted Jul 25, 2009 23:18 UTC (Sat) by PaXTeam (guest, #24616) [Link]

> Data structures at address zero do not exist on any sane C platform.

so platforms without an MMU are not sane?

A zero pointer is not a null pointer

Posted Jul 26, 2009 18:27 UTC (Sun) by nix (subscriber, #2304) [Link]

Well, you can't access a structure at address zero on such a platform
without either disabling all optimizations that involve knowing which
pointers are null (as the kernel now is) and taking great care to ensure
that you never need anything that can point to said structure to be NULL
at any time, or defining the null pointer to be other than all-bits-zero
(allowed, but weird, about as rare as platforms with strange word sizes).

I'd say that trying to access structures at address zero, MMU or no MMU,
is extremely unusual and not really sane to handle in a general-purpose
compiler. (GCC goes further than I would expect in actually having a
switch that makes it possible to use such a barmy thing.)

A zero pointer is not a null pointer

Posted Jul 25, 2009 9:37 UTC (Sat) by spitzak (guest, #4593) [Link]

That won't work, plenty of code assumes that comparing two NULL pointers for equality will return
true.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 11:48 UTC (Tue) by dunlapg (guest, #57764) [Link]

Who on earth implemented the gcc "optimization"? Check-for-null-after-use is almost always a bug, isn't it? It should be throwing a warning out, not optimizing the null check away.

Fun with NULL pointers, part 1

Posted Jul 21, 2009 12:01 UTC (Tue) by johill (subscriber, #25196) [Link]

I'd think very often it's not. Imagine the NULL pointer check was inserted by a macro or an inlined function, into a function that's known to be called with a non-NULL argument, say because that argument is on the stack.

Optimizations and undefined behavior

Posted Jul 21, 2009 15:59 UTC (Tue) by amnon (guest, #43382) [Link]

This bug made me think about other scenarios where optimizations can change the meaning of the code:

The program was accessing the pointer without checking for NULL. Since the result of dereferencing a null pointer is undefined, the compiler "inferred" that the pointer is never null, and optimized away the check.

That's valid logic, but there are rare cases where code intentionally uses undefined behavior. For example, OpenSSL tries to get entropy by reading an uninitialized buffer. If at any point in the future an optimization is introduced in the compiler which assumes that uninitialized variables are never accessed, bad things could happen.

Optimizations and undefined behavior

Posted Jul 21, 2009 18:05 UTC (Tue) by BrucePerens (guest, #2510) [Link]

there are rare cases where code intentionally uses undefined behavior.

No correct cases.

Undefined really means undefined. There was never any guarantee that the value used in the OpenSSL code would provide any entropy. Undefined doesn't mean "random", "unknown", or "non-deterministic", it means "implementation dependent".

So, yes, it's difficult to get entropy without a specific service that provides entropy. But that's what you need, not just a garbage variable.

Optimizations and undefined behavior

Posted Jul 21, 2009 19:42 UTC (Tue) by martinfick (subscriber, #4455) [Link]

it means "implementation dependent".

which as others have pointed out, is the purpose of undefined in the C spec, which does not mean that it has "No correct cases.", but rather, that you'd better be familiar with the implementation you are using. It makes no sense to say: "Here is a feature not available elsewhere, that we think it valuable, but you should never use it!" does it?

Optimizations and undefined behavior

Posted Jul 21, 2009 20:04 UTC (Tue) by BrucePerens (guest, #2510) [Link]

"Implementation dependent" means that the folks who change the compiler, the C library, the kernel, and any stack-smashing detection/prevention code that you are using have the right to change the behavior at any time without documenting the particular side-effect that your code is depending upon.

Especially in the context of the various stack-smashing prevention hacks going around, you have no reason to believe that an uninitialized variable would not actually be initialized to a fixed value.

Undefined stuff is not a service that you can count on. Ever.

Optimizations and undefined behavior

Posted Jul 21, 2009 20:11 UTC (Tue) by martinfick (subscriber, #4455) [Link]

No, it does not always mean this. In this particular case it may, but it is perfectly reasonable for a specific C compiler to specify its behavior when the standard says that it is undefined, that is the point of "undefined" and "Implementation dependent".

"Implementation dependent" != `cat /dev/urandom`

Optimizations and undefined behavior

Posted Jul 21, 2009 20:55 UTC (Tue) by BrucePerens (guest, #2510) [Link]

The recent ext3 fsync() snafu is a good example of implementation dependent behavior becoming taken as an implicit guarantee. And then the developer had to reduce the scope of the promise for performance reasons. He ended up regretting that he had ever made that feature visible.

Anyway, there was no such guarantee in this case.

Optimizations and undefined behavior

Posted Jul 21, 2009 22:39 UTC (Tue) by mjg59 (subscriber, #23239) [Link]

The role of software is to be useful to its consumers. Software that fails in this will tend to end up being ignored in the long run.

Optimizations and undefined behavior

Posted Jul 22, 2009 1:28 UTC (Wed) by BrucePerens (guest, #2510) [Link]

Matt, you aren't seriously proposing that we provide some sort of user contract regarding the contents of uninitialized variables being reliable sources of entropy.

Regarding ext3, this problem came up because fsync() was implemented as a performance pig, at least in ext3, and rather than fix it we trained application developers that they'd be safe without it. Had fsync() been repaired when the mozilla problem came up, nobody would be arguing about it today.

Optimizations and undefined behavior

Posted Jul 22, 2009 1:39 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

If a system left truly random numbers in uninitialised variables and applications started making use of that functionality (even if undocumented), then I think you'd need a very good reason to change that behaviour. But since I can't imagine any sane system ever doing that, no, I'm not proposing that we need a user contract on that point.

Contrast that to the ext3 behaviour. While, yes, the behaviour of fsync() on ext3 did result in people being even less likely to use it, the fact that ext3 also made it possible to overwrite a file without having to go through a cumbersome sequence of fsync()ing both the data and the directory made it attractive to application writers. That behaviour dates back far beyond Firefox 3, as demonstrated by people's long-term complaints about XFS leaving their files full of zeros after a crash. ext4 now provides the same ease of use because people made it clear that they weren't going to use it otherwise. Future Linux filesystems are effectively forced to provide the same semantics, which is a good thing.

Optimizations and undefined behavior

Posted Jul 22, 2009 3:28 UTC (Wed) by BrucePerens (guest, #2510) [Link]

Sigh. Good developers are still going to do create temp, write, fsync file, link to permanent name, unlink temp. Fsync on the directory, though, shouldn't be necessary.

Optimizations and undefined behavior

Posted Jul 22, 2009 4:16 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

Good developers aren't going to have to - good operating systems will provide guarantees above and beyond POSIX. Operating systems that don't will end up poorly supported and gradually become irrelevant.

Optimizations and undefined behavior

Posted Jul 23, 2009 5:07 UTC (Thu) by johnflux (guest, #58833) [Link]

I hope not. Good developers will realize that fsync is a complete overkill there. They don't need wait for the changes to actually be made to disk before continuing - they only need to make sure that the changes happen in the right order.

Optimizations and undefined behavior

Posted Jul 30, 2009 13:38 UTC (Thu) by forthy (guest, #1525) [Link]

The problem with fsync() is that it's semantics resemble whery much the one of "PLEASE" in INTERCAL, which means it is a joke. fsync() basically has no semantics, except "make it so" (make it persistent now). Now, all file system operations are persistent, anyway, just not made persistent now. You can't properly test (that is: automated), because to test if there's a missing fsync(), you have to force an unexpected reboot, and then check if there's any missing data. What's worse: A number of popular Unix programming languages don't even have fsync(), starting with all kinds of shell scripts. fsync() is a dirty hack introduced into Unix because of broken (but extremely fast) file system implementations.

We know that Unix is a joke for quite some time, but parts of the API like fsync() show that this is not so far away from the truth ;-). From the kernel development side it is always "easier" to maintain a sloppy specification and blame the loser, but that's the wrong thinking. You are providing a service. Same thing for GCC: Compiler writers provide a service. Using a sloppy specification for questionable "optimizations" is wrong, as well. If the compiler writer can't know that the code really will break when accessing the NULL pointer, then he can't take the test out after having accessed an object. I remember GCC taking out tests like if(x+n>x), because overflows are said to be unspecified in the C language, but compiling code to a machine where overflows were very specifically handled as wraparounds in two's complement representation. This is all wrong thinking.

Optimizations and undefined behavior

Posted Jul 30, 2009 15:04 UTC (Thu) by foom (subscriber, #14868) [Link]

> I remember GCC taking out tests like if(x+n>x)

Still does. You can use -fwrapv if you want to tell it that you want signed number overflow to be
defined as wrapping.

Optimizations and undefined behavior

Posted Jul 30, 2009 21:31 UTC (Thu) by nix (subscriber, #2304) [Link]

fsync(), brought to you by the same people who thought up 'volatile',
another equally impossible-to-define-except-by-reference-to-implementation
feature.

Optimizations and undefined behavior

Posted Jul 30, 2009 13:31 UTC (Thu) by lysse (guest, #3190) [Link]

> "Implementation dependent" != `cat /dev/urandom`

Wasn't that where Bruce came in...? ;)

Optimizations and undefined behavior

Posted Jul 31, 2009 14:28 UTC (Fri) by hozelda (guest, #19341) [Link]

I know the conversation is partly about what developers should depend on or not, but keep in mind that "undefined behavior" and "implementation-defined behavior" mean something precise and completely different from each other in C99. See Chapter 3 (and 4) and informative Appendix J http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf

Fun with NULL pointers, part 1

Posted Jul 21, 2009 19:14 UTC (Tue) by zeekec (subscriber, #2414) [Link]

Nasal demons!

#pragma and GCC

Posted Jul 21, 2009 21:48 UTC (Tue) by BrucePerens (guest, #2510) [Link]

Some time in the 80's, when RMS was still coding GCC, the response to #pragma in C was explicitly stated to be undefined, so that your compiler could have whatever magic happen that it wished.

So, RMS coded GCC to start the game "rogue" when it saw #pragma.

#pragma and GCC

Posted Jul 22, 2009 15:46 UTC (Wed) by zeekec (subscriber, #2414) [Link]

Wouldn't that be a great response to a NULL dereference in the kernel.