ioctl() forever?

By Jake Edge
June 8, 2022
LSFMM

In a combined storage and filesystem session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM), Luis Chamberlain and James Bottomley led a discussion about the use of ioctl() as a mechanism for configuration. There are plenty of downsides to the use of ioctl() commands, and alternatives exist, but in general kernel developers have chosen to continue using this multiplexing system call. While there is interest in changing things, at least in some quarters, the discussion did not seem to indicate major changes on the horizon.

Problems

Chamberlain began with a history lesson with "some rants" thrown in. ioctl() is still used a lot by filesystems and the block layer in Linux, but the wireless-networking subsystem, which he used to work on, successfully shifted away from ioctl(). The system call "wasn't really originally designed for what we think it was", he said; it is "essentially a hack". In Douglas McIlroy's history of Unix, it was called "a closet full of skeletons" that was mainly used to prevent the addition of too many new system calls. Those things should be kept in mind when thinking about ioctl(), he said.

The first version of Linux did not have ioctl(); it was added in Linux-0.96a in May 1992. A small patch in 1993 changed a type from unsigned int to unsigned long, which eventually led to compatibility headaches for 32-bit ioctl() calls issued on 64-bit systems. The Unix idea that everything is a file is useful because it allows for flexibility, but it also allows for lazy API design, he said. Beyond that, ioctl() commands are not well documented and the interface does not allow for introspection.

Lack of introspection abilities is not a problem that Chamberlain has encountered directly, so he asked Bottomley to elaborate on that. The problem crops up in the container world, Bottomley said, and it is not just for ioctl() commands but also introspection for system calls. For example, securing a Docker container by limiting the system calls it can make does not really secure anything if there are opaque ioctl() commands that can be used to circumvent the restrictions. So there is a lot of concern about non-introspective interfaces because they "can't be policed properly by the tools we usually use for containers, like seccomp and even eBPF".

The specific problem with ioctl() is that there is a "dense binary packet" that gets passed into the call, which makes it difficult for external tools to deduce what the packet contains. In theory, the kernel could switch to using XML or JSON, but that does not really change the underlying problem much, he said. The introspection problem remains "almost regardless of which interface we choose".

There are other problems with ioctl(), Chamberlain said. For example, he asked Arnd Bergmann about ioctl() support for different architectures. He got back an itemized list of caveats. "The world is not peachy for architecture support as well".

Greener grass

The Linux wireless-networking configuration underwent a shift from the ioctl()-based wireless extensions to the netlink-based nl80211 interface. Chamberlain invited attendees to compare include/uapi/linux/wireless.h with include/uapi/linux/nl80211.h to see how much cleaner the new interface is. The netlink interface is not designed to be generic, so it may not be the right choice for filesystems and the block layer. But he is sure that it is possible to find something better than the ioctl()-based interface we have now.

Chamberlain handed the microphone back to Bottomley so that he could talk about configfd as a possibility. But Bottomley said that he was not going to promote configfd, though he did describe it a bit. It came out of his efforts on the shiftfs filesystem, which was eventually supplanted by ID-mapped mounts. Configfd was based on the fsconfig() system call, which allows setting a bunch of configuration information on a filesystem atomically, but configfd was bidirectional. David Howells, who developed fsconfig() and the related new mounting API, interjected that fsconfig() was originally bidirectional as well, though Al Viro removed that piece before it was merged.

Instead of defending configfd, Bottomley said, he wanted to talk about the necessity of ioctl(). When there is a need for "an exception to the normal semantic order of things", an ioctl() command can provide it. And there will always be a need for exceptions, no matter how tightly regulated that semantic order is. There will always be a requirement that two parties be able to communicate data that cannot be structured using the existing mechanisms—an exception. Whether that data is sent as JSON, XML, or binary data, it is, effectively, an ioctl().

The introspection problem is real, but is one that he thinks could be handled with documentation. Christian Brauner said that the problem goes beyond just ioctl(); there are a number of different problems with seccomp() filtering because of the need to inspect the system call arguments to help make filtering decisions. Pointer arguments have been discouraged for new system calls because seccomp() cannot follow the pointers. But using pointers to structures is a technique for creating extensible system calls, so seccomp() also needs to change. It is a problem "slightly to the side" of the ioctl() problem, but it needs to be solved as well, he said.
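The pointer limitation can be sketched in a few lines. The structure below mirrors the kernel's struct seccomp_data, which is everything a classic seccomp filter gets to look at; the `toy_policy()` function is a made-up stand-in for a real BPF filter, and the x86-64 syscall number for ioctl() is assumed.

```python
import ctypes


class SeccompData(ctypes.Structure):
    """Mirror of struct seccomp_data: all that a seccomp filter can see."""
    _fields_ = [
        ("nr", ctypes.c_int),                    # system call number
        ("arch", ctypes.c_uint32),               # AUDIT_ARCH_* value
        ("instruction_pointer", ctypes.c_uint64),
        ("args", ctypes.c_uint64 * 6),           # raw register values only
    ]


NR_IOCTL_X86_64 = 16  # assumption: x86-64 syscall numbering


def toy_policy(data: SeccompData, allowed_requests: set) -> str:
    """Hypothetical stand-in for a seccomp filter on ioctl().

    args[1] is the request number and can be compared directly.
    args[2] is usually a user-space address; the filter sees the
    address but cannot read the structure it points to.
    """
    if data.nr != NR_IOCTL_X86_64:
        return "allow"
    return "allow" if data.args[1] in allowed_requests else "deny"
```

Two ioctl() calls with the same request number but entirely different argument structures look identical to such a policy, since only the pointer value in `args[2]` differs.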

Bottomley said that this shows that even if it were decreed that ioctl() commands should all move to new system calls, the problem with introspection would just move with them. Ted Ts'o said that kernel developers rightly keep a tight grip on new system calls and their interfaces. So adding a new system call involves an enormous bikeshedding exercise with lots of additional requirements, including documentation and working with features like seccomp() filtering. Often, the feature developer does not care about the container use case, even if they should, so they move it to an ioctl() command "so they can dodge the bikeshedding".

The more perfect the kernel community tries to make the system call interface, the more incentive there is for developers to route around it, Ts'o said. He has heard people talk about adding a feature via a filesystem-specific ioctl() command as a way to avoid the "fsdevel bikeshed party". That is unfortunate, since there is plenty of useful architectural review that might come with trying to make the feature more widely usable, but it is understandable that people take the expedient approach. No one has infinite resources, Ts'o said.

Alternatives

Josef Bacik asked what the alternative is. "You're going to pry ioctl()s from my cold dead hands unless you give me something else." The Btrfs developers have "wasted a lot of time" in grand architectural discussions that ended up with the community saying that a feature should just be put into an ioctl() command—after a year of discussion. Bottomley said that he would argue ioctl() commands, used judiciously, are just fine.

Kent Overstreet said that ioctl() commands are simply a driver-specific system call; there is a real need for that. It provides a mechanism to try out a feature in a more private way before it gets promoted to a system call, where it becomes permanent. Amir Goldstein agreed, noting that the "chattr" ioctl() command was implemented by two different filesystems before it was determined to be a generally useful feature and moved into the virtual filesystem (VFS) layer.

There are multiple existing mechanisms for configuration in the kernel, Ts'o said. ioctl() commands are just system calls in disguise, both of which provide ways to do configuration, but procfs and sysfs files can also be used for that. Beyond those, the new mount API or configfd provide other configuration mechanisms. But which gets used depends in part on how much pain there is in trying to change the mechanism for a new task, he said. If the pain of adding ioctl() commands rises to the same level as for system calls, developers will simply find a "different escape hatch".

But Chamberlain said that adding new wireless commands did not require additional system calls or ioctl() commands because it uses netlink. Those changes can be made in a domain-specific place without all of the problems that come from the other mechanisms. Brauner said that he had a hard time seeing what could replace the ioctl() interface, however; he wondered if Chamberlain was suggesting switching to something netlink-based. Chamberlain said that it was just one idea, but Howells noted that netlink could not be used because it depends on networking being configured into the kernel, which is not always the case.

There was some discussion of alternatives, but it is clear that ioctl() itself is not going away and that it fills a need. Finding ways to make the ioctl() arguments more introspectable would be useful, as would better documentation. But if requiring those things causes the friction level for adding new commands to rise too much, it will have the opposite of its intended effect. No real solution seemed to be forthcoming from the discussion, though no one seems entirely satisfied with the status quo either.


Index entries for this article
Kernel: ioctl()
Conference: Storage, Filesystem & Memory Management/2022



ioctl() forever?

Posted Jun 8, 2022 14:48 UTC (Wed) by rincebrain (subscriber, #69638) [Link]

People might also use ioctl because they're developers on something out of tree, the existing syscalls and ioctls have artificial limitations, and LKML has been actively hostile to anyone suggesting anything happen or not to improve things for projects which are unlikely to ever be mainlinable.

So, since you can't easily add custom syscalls to everyone ever, ioctl it is.

ioctl() forever?

Posted Jun 18, 2022 21:33 UTC (Sat) by developer122 (guest, #152928) [Link]

There is a stunning level of hostility, regardless of licence.

ioctl() forever?

Posted Jun 8, 2022 15:18 UTC (Wed) by josh (subscriber, #17465) [Link]

One other issue with ioctl: it can mean different things to different devices, and there's no guarantee of non-overlap (even though the kernel does attempt to avoid it).

So if a filter mechanism needs to allow a particular ioctl, used with a particular device, it might also be allowing other unrelated and problematic ioctls on other devices that use the same ioctl number.
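A toy model of that overlap, with everything in it invented for illustration (the device nodes, the command number, and what each command does):

```python
# Hypothetical device nodes and command number, invented for illustration.
SHARED_REQUEST = 0x4D01

DRIVER_COMMANDS = {
    "/dev/widget0": {SHARED_REQUEST: "query status"},
    "/dev/gadget0": {SHARED_REQUEST: "erase storage"},
}


def number_only_filter(request: int, allowed: set) -> bool:
    """A policy that sees only the request number, not the target device."""
    return request in allowed


def meaning(device: str, request: int) -> str:
    """What a request does depends on which driver receives it."""
    return DRIVER_COMMANDS[device].get(request, "ENOTTY")
```

Allowing `SHARED_REQUEST` so that the harmless query works on one device also lets the destructive command through on the other, because the filter never learns which file descriptor the call targets.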

ioctl() forever?

Posted Jun 8, 2022 15:52 UTC (Wed) by tau (subscriber, #79651) [Link]

There is also a tendency to crowbar a kernel's API semantics into the language of a virtual filesystem, with virtual directories containing virtual text files. A compromise between machine readable and human friendly while being a poor fit for both.

For relatively high-overhead and semantically complicated "system calls" I think that it would be nice if there was some sort of unified standard for user space and kernel space message passing that had a strong emphasis on message schemas and extensibility. Something like dbus. Or gRPC without all of the horrible HTTP/2 trappings. Or something with a design more suited to asynchronous communication like Wayland. Good luck getting a useful number of people to agree on what such a protocol should look like though. Of course it is very easy to say "it would be nice if" on a message board.

ioctl() forever?

Posted Jun 8, 2022 18:34 UTC (Wed) by jhoblitt (subscriber, #77733) [Link]

A solution similar to gRPC doesn't sound unreasonable. Protobufs, as an API specification, are much easier to machine process than the entirety of the C language. Protobufs also support schema evolution.

ioctl() forever?

Posted Jun 8, 2022 20:07 UTC (Wed) by wahern (subscriber, #37304) [Link]

> Protobufs, as an API specification, are much easier to machine process than the entirety of the C language.

With CTF/BTF symbols, sticking to C would be easier and simpler for all involved, especially when you begin to look at what would be the most difficult part of structured, compound, nested data type filtering--actual filter specifications and execution. The obvious solution is a BPF program, which is precisely why BTF symbols were added to the kernel--so BPF programs could introspect kernel data types.

ioctl() forever?

Posted Jun 8, 2022 20:25 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Even with CTF you still have problems with strings.

ioctl() forever?

Posted Jun 9, 2022 13:37 UTC (Thu) by nix (subscriber, #2304) [Link]

Yeah, but that's neither CTF nor BTF's fault: that's the same problem with dereferencing pointers (and finding a way to do that while not allowing attackers to modify the thing and sneak bad stuff past you via TOCTTOU races, *without* incurring horrible overhead by forcing every such pointed-to thing to be CoWed on modification).

ioctl() forever?

Posted Jun 9, 2022 18:44 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

That's why just marshalling all the data into variable-length structures that can be simply copied into the kernel space is the right way to do it. Yes, you have some overhead due to packing and copying, but both are pretty efficient.

Also, ioctl()s shouldn't be performance-critical anyway.

ioctl() forever?

Posted Jun 9, 2022 23:14 UTC (Thu) by wahern (subscriber, #37304) [Link]

With CTF symbols you could easily write a simple function to recursively copy a data structure into kernel space. So you end up in the same place, but with a fraction of the complexity and code. For C string members you can trivially implement constraint checking during the copy. Similarly, for dependent pointer and size members you can include simple qualifiers in the structure definition that the copy routine can validate. (There was a proposal, effectively implementing a simple dependent type system for this case, which came close to passing muster for the next C2x revision. It seems to have failed for lack of real-world implementation experience, but strictly speaking it should be quite simple to implement.)

Alternatively, you could implement a type-safe unpacking API. (If you only care about memory safety, you might not even need CTF.) For example, "give me the next member, which I expect to be a NUL-terminated string". This may be more or less ergonomic than the above, but either way would provide the same effective interface as generic deserializers--even "zero copy" serialization formats typically cannot be exposed as plain C data structures, even when the serialization and deserialization code is generated a la Cap'n Proto. The difference between `string_t getNamedField(object_t)` and `string_t copy_in_string(userstruct_t *, cursor_t *)` is purely syntactical when you can't completely trust the input to be well formed, which we can't.
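As a rough illustration of the "give me the next member" style, here is a minimal sketch in Python rather than kernel C. The `Cursor` class and its method names are hypothetical, and a little-endian layout is assumed.

```python
import struct


class UnpackError(ValueError):
    """Raised when the buffer does not match what the caller expects."""


class Cursor:
    """Walks an untrusted byte buffer one typed member at a time."""

    def __init__(self, buf: bytes) -> None:
        self._buf = buf
        self._pos = 0

    def next_u32(self) -> int:
        """Return the next little-endian 32-bit unsigned integer."""
        end = self._pos + 4
        if end > len(self._buf):
            raise UnpackError("buffer too short for u32")
        (value,) = struct.unpack_from("<I", self._buf, self._pos)
        self._pos = end
        return value

    def next_cstring(self, max_len: int) -> bytes:
        """Return the next NUL-terminated string.

        The string and its terminator must fit within max_len bytes
        and within the buffer; otherwise the input is rejected.
        """
        limit = min(len(self._buf), self._pos + max_len)
        end = self._buf.find(b"\x00", self._pos, limit)
        if end < 0:
            raise UnpackError("string not NUL-terminated within bound")
        value = self._buf[self._pos:end]
        self._pos = end + 1  # step over the terminator
        return value
```

The caller states what it expects at each step, so a malformed buffer fails at the first mismatch instead of being interpreted as something else.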

In any event, simple intermediations, with or without CTF, provide avenues to enjoying all the same safety guarantees, without creating both kernel and userland dependencies on enormous (10-100+k SLoC) libraries. Microkernels tend to heavily rely on serialized message passing and/or RPC code generators. I'm sure they work well, but there's a steep upfront cost, especially when it comes to tooling.

Maybe even a simpler alternative is to avoid variable length strings entirely. Briefly looking at existing ioctls, it seemed like most string fields within structures are fixed-sized arrays. Variable length strings seemed more likely to be passed directly as an argument. (But please feel free to set me straight on that account as I didn't look very hard and certainly didn't write any tools to analyze the types.) To the extent the latter are a problem, they could be fixed with a very thin intermediating interface.

There's something to be said for simply using fixed-sized arrays. During the heyday of writing GNU replacements for proprietary tools, there was an emphasis on removing arbitrary limitations and permitting every conceivable type of input to be variable length and unbounded; that emphasis became pervasive and was reflected in even trivial, internal interfaces removed from user input. That's rarely needed, and rarely warranted given the resulting complexity, *especially* for kernel interfaces. Even file paths (or more specifically, file path arguments) have a fixed upper bound in the Linux kernel. Recently I discovered this was true even in Solaris, despite Solaris having a complicated kernel syscall facility for unbounded-length input and output syscall arguments.

If you step away from trying to make everything configurable and unbounded, then it becomes much easier to limit complexity. Usually you can set an upper bound that is good enough and move on; and for many of the exceptions, you can switch to semantics that let you trade time for space (i.e. trade serial processing for discrete buffers). Or to put it another way, for the rare cases where setting fixed-bound arguments is too cumbersome, use netlink instead of ioctl. Problem solved. One doesn't need to become the other, or be replaced with something fancier; just make the choice more clear.

If you want to make it easier to prevent people from accidentally doing the wrong thing, and to identify places where that might be likely, you can add type annotations to structures and other types used by kernel ioctl interfaces, complemented with a GCC module pass that identifies code that directly accesses members. IOW, you can implement something akin to Rust's type checker that prevents normal code from reading pointers in an unstructured manner, forcing that code to go through a "safe" API.

ioctl() forever?

Posted Jun 8, 2022 18:41 UTC (Wed) by atnot (subscriber, #124910) [Link]

> There is also a tendency to crowbar a kernel's API semantics into the language of a virtual filesystem, with virtual directories containing virtual text files

I don't remember who said this, but the lesson should not have been "everything is a file" being good, but "everything is a file descriptor" being good. There is great value in having a unified way of representing kernel resources and associated access capabilities, one that can easily be shared between subsystems and passed between processes. But the part where you usually need to obtain those file descriptors by painfully funneling everything through hundreds of snprintf() and open() calls on hardcoded paths isn't actually that valuable. In fact it often only undermines those valuable properties.

ioctl() forever?

Posted Jun 9, 2022 0:53 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

Meh, snprintf isn't the kernel's fault, it's C's fault for being terrible at finagling strings. Stringly-typed interfaces are bad because they are difficult or impossible to validate, not because snprintf is painful.

ioctl() forever?

Posted Jun 9, 2022 1:35 UTC (Thu) by atnot (subscriber, #124910) [Link]

This wasn't at all meant to be a dig at snprintf(), feel free to insert your preferred mechanism. I just don't think string formatting is something that should be regularly involved at the very core of OS APIs at all, even if C's string handling wasn't as uniquely terrible as it is. Although that fact doesn't help of course.

ioctl() forever?

Posted Jun 8, 2022 18:38 UTC (Wed) by jhoblitt (subscriber, #77733) [Link]

What is the functional difference between an ioctl() and an undocumented / unreviewed system call?

ioctl() forever?

Posted Jun 9, 2022 13:33 UTC (Thu) by khim (subscriber, #9252) [Link]

Ability to avoid clashes. Even if out-of-tree device uses the same number as another out-of-tree device they don't affect each other because, well, they are different devices.

Syscall numbers are globals, on the other hand.

ioctl() forever?

Posted Jun 8, 2022 19:12 UTC (Wed) by koverstreet (subscriber, #4296) [Link]

Here's my ioctl v2 proposal that came out of that discussion - something lightweight and minimal that would make ioctls more like normal function/syscalls:

https://lore.kernel.org/lkml/20220520161652.rmhqlvwvfrvsk...

ioctl() forever?

Posted Jun 8, 2022 20:32 UTC (Wed) by roc (subscriber, #30627) [Link]

For rr we really want to be able to determine, from the parameters and results of a syscall, all the userspace memory locations that a syscall writes to. Other tools (e.g. fuzzers) need the memory locations read as well. This is particularly hard for ioctls. The current ioctl op format is supposed to tell us the size of the data pointed to by the pointer, and whether it's read or written or both, but this information is often incomplete or plain wrong. In the worst cases, the data in that struct is just the root of some elaborate data structure with many other userspace pointers in it. So it would be great if an ioctlv2 design solves this. One way would be to provide metadata allowing us to parse the layout *and making the correctness of that metadata testable by automated tests*. Another way would just be to require that the data be a single contiguous buffer.
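For readers unfamiliar with that encoding, here is a minimal sketch of decoding an ioctl request number. It assumes the asm-generic bit layout (8-bit command number, 8-bit type, 14-bit size, 2-bit direction) used on x86 and Arm; some architectures split the bits differently. The `decode_ioctl()` helper is made up for illustration.

```python
from typing import NamedTuple

# Layout assumed from include/uapi/asm-generic/ioctl.h; some architectures
# (PowerPC and MIPS, for example) use 13 size bits and 3 direction bits.
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

# Direction is named from user space's point of view: "read" means the
# kernel writes into user memory, "write" means the kernel reads from it.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


class IoctlFields(NamedTuple):
    direction: str
    size: int
    type_char: str
    nr: int


def decode_ioctl(request: int) -> IoctlFields:
    """Split a 32-bit ioctl request number into its encoded fields."""
    direction = (request >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)
    size = (request >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1)
    type_code = (request >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1)
    nr = (request >> NR_SHIFT) & ((1 << NR_BITS) - 1)
    return IoctlFields(DIRECTIONS[direction], size, chr(type_code), nr)
```

On x86-64, FS_IOC_GETFLAGS (0x80086601) decodes to a "read" of 8 bytes with type 'f' and number 1, while the older TCGETS (0x5401) decodes to direction "none" and size 0 even though the kernel fills in a termios structure. That second case is the kind of incomplete metadata described above.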

ioctl() forever?

Posted Jun 8, 2022 21:24 UTC (Wed) by koverstreet (subscriber, #4296) [Link]

hey, I was just trying rr for my first time today!

I'm not sure how practical it'll be to solve this problem in general. My approach was to try and make ioctls more like syscalls - i.e. something that looks like a normal function call. That would make the simple cases simpler, by getting rid of ioctl structs entirely for the simple cases.

But this won't do anything for the more complex cases you brought up. The problem in those cases is just that C is a very low level language, and anything that's complex enough to handle those cases isn't going to feel natural and simple in C. This is my complaint with those advocating netlink as a wholesale replacement for ioctls - it's more complicated than it needs to be for the simple cases. But for the more complicated cases maybe it's the right approach.

There's another possibility that just occurred to me because I saw you were using it today - cap'n proto. If the problem is that defining complex data structures in a portable, ABI independent way sucks, this is exactly what cap'n proto is intended for. When I last looked at it the story for using it from C seemed incomplete, but maybe this has changed - might be worth another look.

For those unfamiliar: cap'n proto [1] is a schema language for defining ABI independent types. Crucially, unlike things like protobufs, it doesn't have pack/unpack operations - the wire format is the in memory format. It's like defining types in C using only standard sized integers, only without all the razor sharp edges and some really useful features.

1: https://capnproto.org/

ioctl() forever?

Posted Jun 8, 2022 21:32 UTC (Wed) by abatters (✭ supporter ✭, #6932) [Link]

In case it's useful, here is a concrete example of a complex ioctl:

https://sg.danny.cz/sg/sg_v40.html

The sg driver has ioctls for sending generic SCSI commands, so it supports read-type, write-type, and bidirectional commands, direct I/O, pointers to arrays of iovecs of userspace buffers like readv/writev, pointers to buffers to contain error information, async I/O, and lots of other complex features, and it has to be high-performance. It's sort of like io_submit() and io_getevents() for SCSI commands in an ioctl().

ioctl() forever?

Posted Jun 9, 2022 5:43 UTC (Thu) by wahern (subscriber, #37304) [Link]

Unix has had a standardized serialization format for decades: XDR. https://en.wikipedia.org/wiki/External_Data_Representation
An encoder/decoder has been in the Linux source tree since almost the beginning: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...

If better serialization formats sufficiently resolved these issues then presumably ioctl would already be gone, replaced by something using XDR or one of the other dozen similar formats that have come and gone over the years. The marginal benefit of these formats over plain C data structures is manifestly not that great, at least for local IPC.

ioctl() forever?

Posted Jun 9, 2022 6:39 UTC (Thu) by roc (subscriber, #30627) [Link]

Many ioctls were invented before rr or fuzzers or security mattered, and then all the other ioctls were added because that's just how things are done.

ioctl() forever?

Posted Jun 9, 2022 6:55 UTC (Thu) by pm215 (subscriber, #98099) [Link]

I think a lot of the problem is that the kernel refuses to define its ABI in any other way than "here's a pile of header files, and if you're lucky also some documentation". This is a pain for any case except "I'm a C program". ioctl isn't the only offender here either -- setsockopt is another "feed arbitrary data structures via a generic-looking interface" mechanism.

QEMU's user-mode emulation runs into trouble with these things because we have to convert from guest architecture struct layout and endianness to host layout and endianness; without a machine-readable definition of what's being passed around by ioctl and similar syscalls we have to hand-roll support for every new ioctl somebody cares about. And every so often we run into one that's just straight-up not even documented.
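A minimal sketch of what that hand-rolled conversion amounts to, for one hypothetical ioctl argument. The structure layout and the `guest_to_host()` helper are invented for illustration; this is not QEMU code.

```python
import struct

# Hypothetical ioctl argument: struct { u32 flags; u32 pad; u64 offset; }
# The layout has to be written out by hand because nothing machine-readable
# describes it.
GUEST_BE = struct.Struct(">IIQ")  # big-endian guest
HOST_LE = struct.Struct("<IIQ")   # little-endian host


def guest_to_host(raw: bytes) -> bytes:
    """Re-encode one guest structure in the host's byte order."""
    fields = GUEST_BE.unpack(raw)
    return HOST_LE.pack(*fields)


guest_bytes = bytes.fromhex("00000001 00000000 0000000000001000")
host_bytes = guest_to_host(guest_bytes)
```

Every new ioctl needs its own version of those two format strings, and real structures add further complications such as differing padding, differing sizes for long and pointers, and embedded pointers that must themselves be followed and translated.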

ioctl() forever?

Posted Jun 9, 2022 7:33 UTC (Thu) by roc (subscriber, #30627) [Link]

We really should come up with a shared description of the syscall ABI that can be used by QEMU, rr, fuzzers, ASAN, Valgrind, etc. I think we're all duplicating work.

ioctl() forever?

Posted Jun 9, 2022 7:35 UTC (Thu) by roc (subscriber, #30627) [Link]

Syzkaller has an abstract description of syscalls: https://github.com/google/syzkaller/blob/master/docs/sysc...
It's fairly fuzzer-specific of course.

ioctl() forever?

Posted Jun 9, 2022 11:38 UTC (Thu) by adobriyan (subscriber, #30858) [Link]

And ship it with kernel: mount -t abi ... !

ioctl() forever?

Posted Jun 9, 2022 15:11 UTC (Thu) by ejr (subscriber, #51652) [Link]

To paraphrase the bard, "That way [formal methods] lie; let me shun that."

ioctl() forever?

Posted Jun 12, 2022 0:16 UTC (Sun) by developer122 (guest, #152928) [Link]

Such is the pervading wisdom in software.

Just yesterday I commented that modern security practice is to layer incomplete speedbumps one by one and call it "defense in depth." This was in response to a new bypass of the M1's hardware pointer checking, a "last line of defense."

None of these half-baked measures qualify as defense in depth. Anyone who calls it that is kidding themselves. They're deployed one by one with large time spans in between, reactively to whatever attackers currently favor. It should come as no surprise that attackers have no problem paying a low continual cost to work around each new measure, in return for continued ability to exploit hosts.

Meanwhile, we plaster over the undefined behavior that makes C/C++ static analysis impractical. We don't even talk about how CPUs work. We continue to ignore that actual exploit *theory* exists, like the Weird Machine whose wikipedia page starts: "Exploits exist empirically, but were not studied from a theoretical perspective prior to the emergence of the framework of weird machines." This makes it impossible to analyze systems holistically, and to find and shut out entire classes of exploits.

Civil engineering eventually got past "I think we can cut a small hole in this wall" and reached "let's calculate the loads and strains." Electrical engineering stopped electrocuting frogs and developed Maxwell's laws. Right now computing is still far behind all others and until formal methods and theory take over, we won't ever make statements like "this bridge will stand for 100 years."

ioctl() forever?

Posted Jun 12, 2022 1:15 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

> Electrical engineering stopped electrocuting frogs and developed Maxwell's laws.

Open the NEC and look for any mentions of Maxwell's equations. If anything, modern safety engineering is built on installing as many "speedbumps" as possible, rather than on making everything perfectly safe.

ioctl() forever?

Posted Jun 12, 2022 9:01 UTC (Sun) by farnz (subscriber, #17727) [Link]

The NEC (and its international equivalents) is a guide to safe installation, not to electrical engineering. It says that if you purchase products that have been engineered to a suitable standard, and then install them following this guide, you will be safe enough.

The components you install when you buy the things that the NEC requires are engineered with things like Maxwell's laws as guidance - a GFCI or an AFCI or an ELCB or an overcurrent breaker, or even a simple switch is not just empirically designed to work as specified, but rather designed around the known laws of physics and then tested against the specification.

It's just that we need the NEC rules because the possible range of components is not as wide as we'd like - and thus we need to allow for known physical issues when we install. For example, we need overcurrent breakers because wires have resistance and thus heat up when carrying current; a significant chunk of the NEC is describing the different ways you can install wires that meet a given standard, then place a breaker on the source end so that the wire cannot heat up enough to start a fire.

ioctl() forever?

Posted Jun 12, 2022 15:01 UTC (Sun) by ejr (subscriber, #51652) [Link]

The main problem is that the market for electrical engineers actually *designing* these things is relatively static or slowly growing. Meanwhile, companies are absolutely desperate to fill programming seats.

And if it ain't Python or Matlab/Octave, many students don't know it until maybe their third year. Maybe. If type checking is not mentally established, the core of current formal methods that include interactive proofs is going nowhere. (For the audience: Making a program compile without type errors *is* working with an interactive proof system called the compiler. The correct program is a proof of its being well-typed given the language's rules / axioms.)

I also suspect you underestimate how often "throw it at the wall and see if it sticks" applies even in engineering fields. Or more appropriately "this worked for something kinda similar, so it'll work here."

I recently learned of a wonderful text that can illustrate the kind of knowledge needed on the practical hardware side: D. M. Russinoff, Formal Verification of Floating-Point Hardware Design, https://doi.org/10.1007/978-3-030-87181-9_3 . It *appears* to be available gratis; I don't think I was redirected through an institutional subscription.

Software potentially could be simpler in some areas, but the OS/device level requires a similar level of skill in finding the right model. That's a polite way of saying the kind of infinite bike-shedding that was happening with btrfs. If some research group can hit it out of the park like the formalization of RISC-V, sure, but people with product-like deadlines don't have that time. (I've been on both sides.)

ioctl() forever?

Posted Jun 12, 2022 15:37 UTC (Sun) by mpr22 (subscriber, #60784) [Link]

You were in fact directed there by an institutional subscription :(

For me, that link goes to a website run by Springer Nature, where I am informed that, as a private individual in the United Kingdom of Great Britain and Northern Ireland, I would have to pay £19.95 for a PDF of the Logical Operations chapter (pp 35-44) of ISBN 978-3-030-87181-9, £95.50 for the whole of ISBN 978-3-030-87181-9 as an eBook, or £119.99 for the whole of ISBN 978-3-030-87181-9 as a physical hardcover book (all prices inclusive of applicable VAT).

ioctl() forever?

Posted Jun 13, 2022 21:57 UTC (Mon) by ejr (subscriber, #51652) [Link]

AUGH. Drat. Sorry! I was hoping, and I didn't see how I was already logged in. There is no RTL in the book at all but rather proofs based around integers. That's kinda how I think of circuits, so I like it.

ioctl() forever?

Posted Jun 14, 2022 12:54 UTC (Tue) by paulj (subscriber, #341) [Link]

It is archived by sci-hub.se, depending on what you think of that kind of thing.

ioctl() forever?

Posted Jun 12, 2022 16:06 UTC (Sun) by farnz (subscriber, #17727) [Link]

And underlying that is that electrical safety is a mature and well-understood field. The physics involved in a GFCI or an overcurrent breaker have been well-understood since the 19th century, and the physics required for an AFCI were fully settled before World War I in 1914. There's nothing in any NEC-compliant installation that a good physicist from 1920 couldn't fully understand and explain - although they'd be amazed by the manufacturing techniques involved (and would be seriously shocked by the IC in an AFCI, even though they could explain the physical principles that underpin its operation).

In contrast, logic design (both software and hardware) is a rapidly evolving field even today - Rust's type system is based atop affine logic from the 1970s, and there's mathematics I'm aware of that's probably relevant to programming language design that was only formulated rigorously in the last 20 years, and where there's active leading-edge research trying to determine if it's useful, and if so, how.

I'd also agree on the "throw it at the wall and see if it sticks" thing - the only time, IME, that engineers actually seriously bother with the rigorous analysis that's possible for something like a GFCI is when you're cost-optimizing it. Otherwise, it's considered too much work, when you can take a design that's probably overkill but meets requirements and ship it.

ioctl() forever?

Posted Jun 9, 2022 17:33 UTC (Thu) by nix (subscriber, #2304) [Link]

This sort of thing is why CTF was added to the toolchain :) abigail has support for it now, too... of course ioctls don't all specify which structures they manipulate, but for those that *do*...

ioctl() forever?

Posted Jun 11, 2022 7:50 UTC (Sat) by pm215 (subscriber, #98099) [Link]

How does CTF help here? I read https://lwn.net/Articles/795384/ which seems to describe it as basically a more compact debug info format, which leaves me unsure how it would be useful for describing the kernel ABI in a more machine-readable way. I guess in theory you could build the whole kernel and then fish out the debug info, but that would take forever and only have the info for the specific binary that got built...

ioctl() forever?

Posted Jun 9, 2022 11:02 UTC (Thu) by jengelh (subscriber, #33263) [Link]

>that netlink could not be used because it depends on networking being configured into the kernel

Perhaps when someone ponders its use for a component, what they really had in mind was just the serialization format, without the AF_NETLINK socket. You could still hand in a netlink-formatted stream via ioctl.

Netlink has its own share of issues. One is the asynchronous model and the programming that this entails; you need to do at least two calls (send/recv). Another is the 16-bit size fields in its serialization format. If you can't fit your stuff into one message because of that limit, you have to go on a multi-roundtrip endeavour from userspace. The kernel part meanwhile may need to keep extra state to logically tie the individual netlink messages together. That's kinda terrible.
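To make the size limit concrete, here is a minimal Python sketch of netlink-style attribute serialization. It assumes the "16-bit size fields" refers to the nla_len member of struct nlattr (a 16-bit length followed by a 16-bit type, padded to 4 bytes); the helper names pack_attr and unpack_attrs are made up for illustration and are not part of any netlink library.

```python
import struct

NLA_HDRLEN = 4        # struct nlattr: u16 nla_len, u16 nla_type
NLA_ALIGNTO = 4       # attributes are padded to 4-byte boundaries
MAX_NLA_LEN = 0xFFFF  # largest value a 16-bit nla_len can hold


def pack_attr(attr_type: int, payload: bytes) -> bytes:
    """Serialize one attribute: header, payload, then alignment padding."""
    nla_len = NLA_HDRLEN + len(payload)  # the length field excludes padding
    if nla_len > MAX_NLA_LEN:
        raise ValueError(
            f"payload of {len(payload)} bytes overflows the 16-bit nla_len"
        )
    pad = -nla_len % NLA_ALIGNTO
    return struct.pack("=HH", nla_len, attr_type) + payload + b"\x00" * pad


def unpack_attrs(buf: bytes):
    """Yield (type, payload) pairs from a buffer of packed attributes."""
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        nla_len, attr_type = struct.unpack_from("=HH", buf, offset)
        if nla_len < NLA_HDRLEN or offset + nla_len > len(buf):
            raise ValueError("truncated or malformed attribute")
        yield attr_type, buf[offset + NLA_HDRLEN:offset + nla_len]
        # Step to the next 4-byte-aligned attribute.
        offset += (nla_len + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)
```

Under these assumptions a single attribute can carry at most 65531 bytes of payload; anything larger has to be split across attributes or messages, which is where the extra round trips and kernel-side state come from.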

Speaking of ioctl, setsockopt is just the same.

ioctl() forever?

Posted Jun 9, 2022 11:42 UTC (Thu) by jazzy (subscriber, #132608) [Link]

Has someone considered copying Windows?

BOOL DeviceIoControl(
[in] HANDLE hDevice,
[in] DWORD dwIoControlCode,
[in, optional] LPVOID lpInBuffer,
[in] DWORD nInBufferSize,
[out, optional] LPVOID lpOutBuffer,
[in] DWORD nOutBufferSize,
[out, optional] LPDWORD lpBytesReturned,
[in, out, optional] LPOVERLAPPED lpOverlapped
);

Basically it allows for invoking an IOCTL and passing in and/or out a contiguous buffer. To me this seems extensible and avoids the need to embed pointers. It also clearly specifies what is readable and writable.
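As a rough illustration of that calling convention (not of the Windows API itself), the following Python sketch models a dispatcher where every command gets one flat, read-only input buffer and a caller-declared bound on the output. All names here (register, device_io_control, IOCTL_DEMO_UPPER) are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical handler type: takes an immutable input buffer and the
# maximum number of output bytes the caller is prepared to accept.
Handler = Callable[[bytes, int], bytes]

_handlers: Dict[int, Handler] = {}


def register(code: int, handler: Handler) -> None:
    """Associate a control code with its handler."""
    _handlers[code] = handler


def device_io_control(code: int, in_buf: bytes = b"", out_size: int = 0) -> bytes:
    """Dispatch a control code with one flat input and one bounded output."""
    handler = _handlers.get(code)
    if handler is None:
        raise LookupError(f"unknown control code {code:#x}")
    out = handler(bytes(in_buf), out_size)  # copy: the handler cannot alias caller memory
    if len(out) > out_size:
        raise BufferError("handler produced more output than the caller allowed")
    return out


# A made-up control code that echoes its input in upper case.
IOCTL_DEMO_UPPER = 0x1001
register(IOCTL_DEMO_UPPER, lambda data, limit: data.upper()[:limit])
```

Because the dispatcher sees the sizes of both buffers, a generic layer can copy and bound-check them without knowing anything about the command, which is the property that embedded pointers take away.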

ioctl() forever?

Posted Jun 9, 2022 19:02 UTC (Thu) by camhusmj38 (subscriber, #99234) [Link]

I imagine they would rather eat their own hats.

ioctl() forever?

Posted Jun 9, 2022 19:25 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I like how the metadata is "optional" when it probably means "nullable". I doubt calling this without those arguments at all works out that well in practice.

ioctl() forever?

Posted Jun 9, 2022 19:27 UTC (Thu) by camhusmj38 (subscriber, #99234) [Link]

I looked it up - the rules about which combinations of nulls are allowed are documented. If you go the overlapped (async) route, you can leave the bytes-returned pointer null, and vice versa.

ioctl() forever?

Posted Jun 9, 2022 19:47 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

It works fine for IOCTLs that don't need input or output parameters.

ioctl() forever?

Posted Jun 9, 2022 15:00 UTC (Thu) by karim (subscriber, #114) [Link]

Just on the discoverability front, it would be great if there was a way to query any ioctl() interface for the commands it accepts and the parameters expected. A sort of obligatory man_ioctl() call on any kernel party that wants to expose ioctl(). I could then do something like "$ ioman /dev/foobar" and get a list of commands, their parameters, etc. Maybe even gate access to register an ioctl() on having said man_ioctl() registered along with it.

ioctl() forever?

Posted Jun 9, 2022 15:43 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

So, uh, what format will this communicate the requirements in? Snippets of C headers? ;)

ioctl() forever?

Posted Jun 9, 2022 15:54 UTC (Thu) by karim (subscriber, #114) [Link]

man pages may have good examples. If you do a "man 2 open", for instance, it lists all possible open flags with a description. Maybe something like that?

ioctl() forever?

Posted Jun 9, 2022 16:27 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

From `ioman(1)`, sure. But what will `man_ioctl(2)` use to communicate whatever gets converted into groff format? (Or is the kernel going to speak groff directly?)

ioctl() forever?

Posted Jun 9, 2022 16:39 UTC (Thu) by karim (subscriber, #114) [Link]

Very good question. I can't say I thought it all the way through. Maybe another inspiration could be online help output from commands that print options when you provide "-h" ... i.e. just some free-form text to be printed on screen when invoked. It's a rough cut of an idea. Definitely requires some more forethought to be useful. But, personally, I'd love to have a way to ask about the ioctl()s available for any /dev/foo and even have a tool to invoke them from the command line if it makes sense.

ioctl() forever?

Posted Jun 9, 2022 17:04 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

You can't force people to write documentation. If you try, you just end up with something like the following:

> int FrobnicateSprocket(Sprocket *s)
> Frobnicates a sprocket.
>
> Arguments:
> s: The sprocket to frobnicate.
>
> Returns:
> Zero if no error, or nonzero and sets errno.
>
> Errno values:
> EPERM - Frobnicating this sprocket is not permitted.

IMHO if you're going to do something like this, it should be a machine-readable enumeration of commands, not a man page.
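Part of such an enumeration can already be derived from the command numbers themselves. The Python sketch below decodes the direction, type, number, and argument size that the _IOC() macros pack into a command, assuming the field layout from include/uapi/asm-generic/ioctl.h (some architectures, such as powerpc and mips, split the bits differently). The describe() helper and the JSON output shape are illustrative only.

```python
import json

# Field widths per asm-generic/ioctl.h: nr, type, size, direction.
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS

# Direction is from userspace's point of view: "read" means the kernel
# writes the result into the user-supplied buffer.
DIR_NAMES = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def describe(name: str, cmd: int) -> dict:
    """Split an ioctl command number into its encoded fields."""
    return {
        "name": name,
        "cmd": f"{cmd:#010x}",
        "dir": DIR_NAMES[(cmd >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)],
        "type": (cmd >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1),
        "nr": (cmd >> NR_SHIFT) & ((1 << NR_BITS) - 1),
        "size": (cmd >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1),
    }


if __name__ == "__main__":
    commands = [
        ("BLKGETSIZE64", 0x80081272),  # _IOR(0x12, 114, size_t) on 64-bit
        ("TCGETS", 0x5401),            # legacy number: no dir or size encoded
    ]
    print(json.dumps([describe(n, c) for n, c in commands], indent=2))
```

The TCGETS entry shows the limit of this approach: older commands predate the encoding and carry no direction or size, so a complete machine-readable description would still need data supplied by whoever registers the command.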

ioctl() forever?

Posted Jun 11, 2022 12:26 UTC (Sat) by pm215 (subscriber, #98099) [Link]

If you're the kernel maintainers you absolutely can force people to write documentation -- just refuse to merge any patch that adds a new syscall or ioctl and doesn't include documentation as part of the patchset. You can set the quality bar at any level you like. If stuff slips through the code review stage without sufficient documentation, you can revert it.

In other words, allowing or not allowing undocumented new interfaces is a choice, just as allowing or not allowing changes that break userspace is a choice.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds