The problematic kthread freezer
The first problem, he said, is that the freezer's semantics are not well defined; nobody really knows what it means for a kthread to be frozen. Most of the current uses of the freezer are superfluous. In many cases, the purpose is to have filesystems be in a consistent state during hibernation; that can be better achieved with the filesystem freeze mechanism. It doesn't make sense to freeze I/O operations in general, since they are needed to write out the hibernation image. There is a lot of freezing in drivers too, a situation which, he said, makes no sense. There is a well-defined set of power-management callbacks in place to put drivers into a suspended state during hibernation.
The kernel, he said, is the victim of a massive copy-and-paste cargo cult. Uses of the kthread freezer are spreading like a disease, a situation that has to stop.
There are two especially pathological uses that he called out. One is try_to_freeze() calls for threads that have not been marked freezable in the first place; those calls will never have any effect. The other is try_to_freeze() calls after starting I/O, but without waiting for that I/O to complete.
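The two patterns can be illustrated with a small userspace toy model. This is a sketch under stated assumptions, not kernel code: the `toy_kthread` structure and every `toy_*` function are hypothetical names invented for this illustration. The model assumes only what is described above, namely that a freeze attempt has no effect on a thread that has not been marked freezable, and that freezing while I/O is still in flight is the problem case.

```c
#include <stdbool.h>

/* Toy userspace model of the freezer checks; this is NOT kernel code.
 * All toy_* names are hypothetical and exist only for this illustration. */
struct toy_kthread {
    bool freezable;        /* true once the thread has been marked freezable */
    bool freeze_requested; /* true while the system is asking tasks to freeze */
    int  io_in_flight;     /* I/O requests started but not yet completed */
};

/* Models a try_to_freeze() call: the thread stops only if it has been
 * marked freezable AND a freeze is currently being requested. */
static bool toy_try_to_freeze(const struct toy_kthread *t)
{
    return t->freezable && t->freeze_requested;
}

/* First pathology: the thread was never marked freezable, so any
 * freeze attempt it makes can never take effect. */
static bool toy_freeze_call_is_dead(const struct toy_kthread *t)
{
    return !t->freezable;
}

/* Second pathology: the thread would freeze while I/O that it started
 * has not yet completed. */
static bool toy_freezes_with_pending_io(const struct toy_kthread *t)
{
    return toy_try_to_freeze(t) && t->io_in_flight > 0;
}
```

In this model, a thread with `freezable` unset never stops, however often it makes the call, while a freezable thread that makes the call with a nonzero `io_in_flight` stops with its I/O still outstanding.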
The solution is to eliminate use of the kthread freezer wherever possible. It is not needed in threads that will not generate disk I/O. It is also not needed — indeed, its use is a bug — in I/O helper threads. The best solution would be to move the entire hibernation subsystem to use filesystem freezing instead, and simply get rid of the kthread freezer. It might be necessary to keep it around for NFS, he said, but there's not much else that should need it. But the first step is to stop its use from spreading.
Ben Herrenschmidt spent a while talking about the history of the freezer,
which, he said, was invented as "a big, fat band-aid" without which the
system could not suspend properly. Now, instead, we simply need to make
our drivers cope properly with I/O during a suspend operation. As the
session closed, Linus agreed that the best approach was to get rid of the
kthread freezer altogether and to use filesystem freezing where it is
really needed. So one should expect development to go in that direction.
Index entries for this article | |
---|---|
Kernel | Kernel threads |
Conference | Kernel Summit/2016 |
The problematic kthread freezer
Posted Nov 3, 2016 2:03 UTC (Thu) by trondmy (subscriber, #28934) [Link]
Thanks for the offer, but no thanks. The kthread freezer is borked for NFS as well, and we'd rather get rid of it.
The problematic kthread freezer
Posted Nov 3, 2016 11:26 UTC (Thu) by jlayton (subscriber, #31672) [Link]
The problematic kthread freezer
Posted Nov 5, 2016 14:53 UTC (Sat) by jikos (subscriber, #43140) [Link]
The problematic kthread freezer
Posted Nov 9, 2016 15:29 UTC (Wed) by jlayton (subscriber, #31672) [Link]
Basically what I think we'd want to do is to have fsfreeze tell the RPC transport layer that it should stop sending RPCs to the server(s) and drain the queue by waiting on replies to come in.
The question, though, is what to do with threads sitting in syscalls that need to issue an RPC. "Parking" them down at the layer where we're synchronously waiting for an RPC reply would be bad, as it would mean that we could easily be holding VFS-layer locks at that point (inode->i_rwsem, for instance).
How should that work?
The problematic kthread freezer
Posted Nov 7, 2016 19:30 UTC (Mon) by Alan.Stern (subscriber, #12437) [Link]