Structure holes and information leaks

By Jonathan Corbet
December 1, 2010

Many of the kernel security vulnerabilities reported are information leaks: the kernel passes the contents of uninitialized memory back to user space. These leaks are not normally seen to be severe problems, but the potential for trouble always exists. An attacker may be able to find a sequence of operations which puts useful information (a cryptographic key, perhaps) into a place where the kernel will leak it. So information leaks should be avoided, and they are routinely fixed when they are found.

Many information leaks are caused by uninitialized structure members. It is easy to forget to assign to every member on every code path, and the layout of the structure may change over time. One way to avoid that possibility is to use something like memset() to clear the entire structure at the outset. Kernel code uses memset() in many places, but there are places where that is seen as an expensive and unnecessary call; why clear a bunch of memory which will be assigned to anyway?

One way of combining operations is with a structure initialization like:

    struct foo {
        int bar, baz;
    } f = {
        .bar = 1,
    };

In this case, the baz field will be implicitly set to zero. This kind of declaration should ensure that there will be no information leaks involving this structure. Or maybe not. Consider this structure instead:
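The implicit zeroing of unnamed members can be checked with a few lines of user-space C. This is only a sketch: the helper name omitted_member_is_zero() is made up for illustration and appears nowhere in the kernel.

```c
#include <stdbool.h>

struct foo {
    int bar, baz;
};

/* Hypothetical helper: returns true when the member that was left out
 * of the initializer reads back as zero. */
bool omitted_member_is_zero(void)
{
    struct foo f = {
        .bar = 1,   /* baz is not named, so it is implicitly set to zero */
    };

    return f.bar == 1 && f.baz == 0;
}
```

Note that this says something about the members only; it makes no claim about any bytes that lie between them.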

    struct holy_foo {
        short bar;
        long baz;
    };

On a 32-bit system, this structure likely contains a two-byte hole between the two members. It turns out that the C standard does not require the compiler to initialize holes; it also turns out that GCC duly leaves them uninitialized. So, unless one knows that a given structure cannot have any holes on any relevant architecture, structure initializations are not a reliable way of avoiding uninitialized data.
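The size of the hole can be measured rather than guessed. The sketch below uses offsetof() and sizeof; the two helper functions are illustrative names, not anything from the kernel. On a typical 32-bit x86 build the hole should come out as two bytes; on x86-64 Linux, where long is eight bytes, it would be six.

```c
#include <stddef.h>

struct holy_foo {
    short bar;
    long baz;
};

/* Bytes between the end of bar and the start of baz. */
size_t hole_after_bar(void)
{
    return offsetof(struct holy_foo, baz) - sizeof(short);
}

/* All bytes in the structure that belong to no member at all. */
size_t total_padding(void)
{
    return sizeof(struct holy_foo) - sizeof(short) - sizeof(long);
}
```

Because the answer depends on the sizes and alignments chosen by the ABI, the same source can have different holes on different architectures.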

There has been some talk of asking the GCC developers to change their behavior and initialize holes, but, as Andrew Morton pointed out, that would not help for at least the next five years, given that older compilers would still be in use. So it seems that there is no real alternative to memset() when initializing structures which will be passed to user space.
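A minimal user-space sketch of the memset() approach follows. In kernel code the structure would then be handed to user space with something like copy_to_user(); here the hypothetical helpers fill_holy_foo() and dirty_hole_bytes() simply fill the structure and inspect the hole. The sketch assumes, as the practice described above does, that the compiler leaves padding bytes alone when individual members are stored.

```c
#include <stddef.h>
#include <string.h>

struct holy_foo {
    short bar;
    long baz;
};

/* Hypothetical helper: prepare a structure that is about to leave
 * the kernel. The memset() clears members and padding alike. */
void fill_holy_foo(struct holy_foo *out, short bar, long baz)
{
    memset(out, 0, sizeof(*out));
    out->bar = bar;
    out->baz = baz;
}

/* Count the nonzero bytes in the gap between bar and baz. */
size_t dirty_hole_bytes(const struct holy_foo *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    size_t i, dirty = 0;

    for (i = sizeof(short); i < offsetof(struct holy_foo, baz); i++)
        if (bytes[i] != 0)
            dirty++;
    return dirty;
}
```

Filling the structure with a recognizable pattern beforehand and then calling fill_holy_foo() should leave dirty_hole_bytes() at zero.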

What are the timing tradeoffs?

Posted Dec 2, 2010 3:14 UTC (Thu) by felixfix (subscriber, #242) [Link]

Way back when I was doing kernel work, custom 8 and 16 bitters, you could relatively easily figure timings to zap the whole structure vs adding a few individual zaps among the actual (non-zero) data inits, and compare them. I haven't programmed these new fangled pipeliney and cachey processors like that, but I know the timings are devilishly tricky.

Is it nevertheless possible to come up with crude comparisons, say that zapping a 32 byte struct takes 10 cycles and individual zeroes take 1 cycle each?

What are the timing tradeoffs?

Posted Dec 2, 2010 8:34 UTC (Thu) by exadon (guest, #5324) [Link]

I don't think cycle counting is a useful metric here. Today it all depends on memory access time. As long as the cache is hot, memset of small structures should be basically free. And the cache is hot if we plan to assign individual members afterwards. Has anybody ever found a case where an unnecessary memset of a small structure causes a measurable difference in runtime?

What are the timing tradeoffs?

Posted Dec 2, 2010 22:16 UTC (Thu) by wahern (subscriber, #37304) [Link]

I once improved the performance of a multimedia reverse proxy by 20% merely by replacing 0-initializing loops with calls to memset. If I could have gotten rid of the memsets altogether, performance might have been even better, as profiling showed the issue was primarily memory bandwidth and latency.

(The whole stack was fundamentally inefficient--far too much data copying of too small buffers--but worked flawlessly and satisfactorily, especially after tweaking.)

Structure holes and information leaks

Posted Dec 2, 2010 5:50 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

For an automatic structure, no field is initialized unless explicitly assigned to. Where would gcc be expected to insert the assignment to zero in the extra bytes? How would the programmer control it? And when everyone's code slows down and the users turn the feature off, then what? We're back to where we started.

I think that the better way to address this issue (in places where it is an issue) is to design the structures so that they will not have any padding (on either a 32-bit or a 64-bit system). That way, the kernel code will have full control of all of the storage. In the example from the article, add an extra "short" field.
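As a sketch of the explicit-filler idea, the variant below adds a reserved field and asks the compiler to reject the build if any padding remains. Two things here are assumptions rather than part of the comment above: the name padded_foo, and the switch to fixed-width types. With a plain long, a single extra short would still leave a four-byte hole on a typical LP64 system, which is why the compile-time check is worth having.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct padded_foo {
    int16_t bar;
    int16_t reserved;   /* explicit filler; the code must zero it itself */
    int32_t baz;
};

/* Fail the build if the compiler inserted any padding. */
static_assert(sizeof(struct padded_foo) ==
              sizeof(int16_t) + sizeof(int16_t) + sizeof(int32_t),
              "struct padded_foo contains padding");

/* Bytes of the structure that belong to no member. */
size_t padded_foo_padding(void)
{
    return sizeof(struct padded_foo)
           - sizeof(int16_t) - sizeof(int16_t) - sizeof(int32_t);
}
```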

Structure holes and information leaks

Posted Dec 2, 2010 9:52 UTC (Thu) by ajb (guest, #9694) [Link]

For structures containing only simple elements, it is relatively easy to avoid padding. You simply list all the longs before ints, ints before shorts, etc.

What makes it more tricky is if you have structs nested in other structs.
If you add a long to a struct containing only shorts, suddenly it needs to be moved in all the structs which contain instances of it.
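The ordering rule can be sketched as follows; the structure and helper names are invented for illustration. Sorting members from largest to smallest should remove the holes between members on common ABIs, but it does not remove tail padding: the structure is still rounded up to the alignment of its largest member, so a byte or more can remain at the end.

```c
#include <stddef.h>

struct ordered_foo {
    long a;     /* largest alignment first */
    int b;
    short c;
    char d;     /* smallest last */
};

/* Padding between members: end of the last member minus the sum of sizes. */
size_t internal_holes(void)
{
    size_t end_of_last = offsetof(struct ordered_foo, d) + sizeof(char);

    return end_of_last
           - (sizeof(long) + sizeof(int) + sizeof(short) + sizeof(char));
}

/* Padding after the last member. */
size_t tail_padding(void)
{
    return sizeof(struct ordered_foo)
           - (offsetof(struct ordered_foo, d) + sizeof(char));
}
```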

Structure holes and information leaks

Posted Dec 2, 2010 15:37 UTC (Thu) by dgm (subscriber, #49227) [Link]

Adding explicit padding is an awful solution, because it is based on assumptions (struct members are padded to 4-byte boundaries) that may or may not hold on all of today's systems, much less tomorrow's.

The only two viable options I can see are using compiler directives to control packing of structures that go to userspace (tricky, it changes ABI) or using memset.
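For reference, the packing directive in GCC (also accepted by Clang) is __attribute__((packed)). The sketch below is not a recommendation: the name packed_foo is made up, and packing changes the layout seen by user space and leaves baz misaligned, which is the ABI problem mentioned above.

```c
#include <stddef.h>

/* __attribute__((packed)) is a GCC/Clang extension, not standard C. */
struct packed_foo {
    short bar;
    long baz;       /* now starts right after bar, so it is misaligned */
} __attribute__((packed));

/* Bytes of the structure that belong to no member. */
size_t packed_foo_padding(void)
{
    return sizeof(struct packed_foo) - sizeof(short) - sizeof(long);
}
```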

Structure holes and information leaks

Posted Dec 10, 2010 7:27 UTC (Fri) by kevinm (guest, #69913) [Link]

This is not true. In C, the rule is that objects are *never* partially initialised. If you have an initialiser for one member of a struct (or one element of an array), then all the members of the struct or elements of the array are initialised to the relevant form of zero.

It does not matter if the struct is of automatic storage duration, the rule is the same.
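A short sketch of that rule for automatic objects, with invented names: every member or element not named in the initializer gets the zero appropriate to its type (0, 0.0 or a null pointer). The rule covers members and elements; it says nothing about padding bytes.

```c
#include <stdbool.h>
#include <stddef.h>

struct mixed {
    int count;
    double ratio;
    const char *name;
    int values[3];
};

/* Hypothetical helper: both objects have automatic storage duration
 * and only their first member or element is named in the initializer. */
bool rest_is_zeroed(void)
{
    struct mixed m = { .count = 7 };
    int arr[4] = { 1 };

    return m.count == 7 && m.ratio == 0.0 && m.name == NULL
        && m.values[0] == 0 && m.values[1] == 0 && m.values[2] == 0
        && arr[0] == 1 && arr[1] == 0 && arr[2] == 0 && arr[3] == 0;
}
```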

Structure holes and information leaks

Posted Dec 10, 2010 9:52 UTC (Fri) by etienne (guest, #25256) [Link]

The real problem is, IMHO, not defined by the C standard (though I am no specialist in those standards).
It is how to consider a structure: is it a first-class citizen (a new type with a size, like an integer), or simply a collection of fields?
In the former case, you have to treat it like a real type and initialise even its holes; in the latter case you can simply initialise each of its fields.
I think there is a better example of the problem, when using the "volatile" qualifier:
volatile struct {char a,b,c,d; } my_variable;
The question is then: when you try to read "my_variable.a", do you just do an 8-bit access (because the structure is declared volatile, each of its fields is volatile), or is the compiler forced to do a 32-bit access (because the structure is itself volatile, you have to read the whole structure once and only once) and then extract the right 8-bit value?
I would prefer the latter case, but GCC implements the former one.

Structure holes and information leaks

Posted Dec 10, 2010 10:33 UTC (Fri) by etienne (guest, #25256) [Link]

> In the example from the article, add an extra "short" field.

And in a new version, the GCC optimiser will be better and detect that this field is initialised but not used, so it will decide not to do the initialisation at all - the same as when you declare an automatic variable, initialise it, but never use it.

Structure holes and information leaks

Posted Dec 10, 2010 14:30 UTC (Fri) by foom (subscriber, #14868) [Link]

But if it did do that, that'd be fine. That field *wouldn't* be known-to-be-unused if you passed the address of the struct to another function that might access the field, or memcpy it to another buffer, or cast it to a char* and send it over a wire...

And if you really are just making a struct on the stack, and never use its address, it's perfectly right to just turn it into a set of automatic local variables.

Structure holes and information leaks

Posted Dec 2, 2010 19:53 UTC (Thu) by wingo (guest, #26929) [Link]

Fascinating article, thanks! Hadn't thought about what was in the holes before.

Structure holes and information leaks

Posted Dec 3, 2010 16:22 UTC (Fri) by cesarb (subscriber, #6266) [Link]

I wonder if there is a way to use Coccinelle to find all places where a structure is copied in some way to user space (looking for put_user and variants perhaps?) without memset being previously called on it.

Then it would be a Simple Matter Of Programming to create a script to extract the definitions for these structures and check them for padding (since optimized structures without padding have no need for a memset).

Structure holes and information leaks

Posted Dec 3, 2010 20:39 UTC (Fri) by speedster1 (guest, #8143) [Link]

> Then it would be a Simple Matter Of Programming to create a script to
> extract the definitions for these structures and check them for padding
> (since optimized structures without padding have no need for a memset).

Actually padding can change according to architecture and even toolchain used to compile the code, but there is a tool 'pahole' that shows whether a structure ended up with holes after it has been compiled for a particular platform:

http://lwn.net/Articles/206805/

Structure holes and information leaks

Posted Dec 11, 2010 16:50 UTC (Sat) by RogerOdle (subscriber, #60791) [Link]

As I see it, there are diametrically opposed purposes for the kernel: security versus performance. If your purpose is to run a server then security is your primary concern and you should do the utmost to prevent information leaks when you can, even at the cost of performance. If your purpose is to perform some complex analysis or you need really fast IO then performance is your primary concern. In the second case, security may not be a particular issue if your application is not exposed to the Internet.

It may be desirable to have a choice to build the kernel with a security policy that zeros all allocated memory or to build the kernel with a performance policy. If such a choice is available then it would be necessary to have a mechanism so that servers could determine at run-time if the appropriate security policies are in place in the kernel.

This choice would not be appropriate as a run-time switch where the kernel would be constantly checking for whether it should zero memory or not. It would work better as a compile time choice.
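A rough user-space sketch of such a compile-time choice might look like the following. Everything here is hypothetical: the macro HYPOTHETICAL_ZERO_ON_ALLOC and both function names are invented for illustration and do not correspond to any existing kernel option. The allocation path contains no run-time check, while a separate query function lets a program ask which policy was built in.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical build-time switch; define it as 0 for a performance build. */
#ifndef HYPOTHETICAL_ZERO_ON_ALLOC
#define HYPOTHETICAL_ZERO_ON_ALLOC 1
#endif

/* The policy is selected by the preprocessor, so the allocation path
 * never tests a flag at run time. */
void *policy_alloc(size_t size)
{
#if HYPOTHETICAL_ZERO_ON_ALLOC
    return calloc(1, size);     /* security build: memory arrives zeroed */
#else
    return malloc(size);        /* performance build: contents unspecified */
#endif
}

/* Lets a program find out which policy was compiled in. */
bool policy_zeroes_memory(void)
{
    return HYPOTHETICAL_ZERO_ON_ALLOC != 0;
}
```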