Linux 5.12's very bad, double ungood day
On January 26, Christoph Hellwig posted a 17-patch series containing cleanups to the code dealing with the allocation of the BIO structures used to represent block-I/O requests. The final patch in that series simplified the allocation of requests for swap files in particular. The series was applied by block maintainer Jens Axboe one day later. The change was included in this pull request sent on February 17, and landed in the mainline on February 21 as part of the massive set of pulls done by Torvalds once his power was restored.
"Swapping" is how the kernel pushes anonymous pages (those which are not backed up by a file on disk — program data, in other words) out to persistent storage when memory gets tight. Linux can swap directly to a partition on a block device; that is how systems are traditionally set up. But the kernel can also swap to a file within a mounted filesystem with something close to the same performance. Swapping to a file gives administrators some additional flexibility; it is an easy way to give some relief to a system that is running short of memory without disturbing the existing workload, for example.
The problematic patch works just fine for swap activity to a partition. But, when a swap file is in use, the offset to the location of its blocks within the filesystem would be lost. That had the result of redirecting swap traffic to an essentially random location on the filesystem — a location that was not intended for swapping and may well have contained other information that somebody would have preferred to keep. In other words, the filesystem containing the swap file would be corrupted. (Interestingly, a similar bug was introduced in 2014 and lurked undetected because it only affects extremely rare configurations.)
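The mechanics are easier to see in miniature. Below is a simplified user-space model of the two cases — emphatically not the kernel's code; the structure name loosely echoes the kernel's swap_extent, but everything here, including the numbers, is invented for illustration. A swap partition's offsets map straight to the device, while a swap file's offsets must be translated through the extent map supplied by the filesystem; skipping that translation is, in essence, what the buggy patch did.

```c
/* Simplified user-space model of swap-offset mapping; NOT the actual
 * kernel code. The "extent" structure loosely echoes the kernel's
 * swap_extent, but all names and numbers here are invented. */
#include <stdio.h>

struct extent {
    unsigned long start_page;   /* first swap page covered by this extent */
    unsigned long nr_pages;     /* number of pages in the extent */
    unsigned long start_block;  /* where the filesystem put it on disk */
};

/* For a swap partition, the swap offset already is the device location. */
static unsigned long partition_block(unsigned long swap_offset)
{
    return swap_offset;
}

/* For a swap file, the filesystem decides where each extent lives, so the
 * swap offset must be translated through the extent map. */
static unsigned long file_block(const struct extent *map, int n,
                                unsigned long swap_offset)
{
    for (int i = 0; i < n; i++) {
        if (swap_offset >= map[i].start_page &&
            swap_offset < map[i].start_page + map[i].nr_pages)
            return map[i].start_block + (swap_offset - map[i].start_page);
    }
    return (unsigned long)-1;   /* not part of the swap file */
}

int main(void)
{
    /* A hypothetical swap file whose single extent starts at block 100000. */
    struct extent map[] = { { 0, 1024, 100000 } };
    unsigned long off = 42;

    printf("correct block for swap page %lu: %lu\n",
           off, file_block(map, 1, off));
    /* The 5.12-rc1 bug amounted to doing this for swap files too,
     * aiming the I/O at an unrelated block of the filesystem: */
    printf("block the buggy code would use: %lu\n", partition_block(off));
    return 0;
}
```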
On March 2, a fix was applied to the mainline repository. One day later, Torvalds sent a warning to the linux-kernel mailing list describing the problem and noting that: "This is what we in the industry call 'double ungood'". He removed the 5.12-rc1 tag from his repository (or, rather, renamed it to "5.12-rc1-dontuse") and generally made it clear that this release should be avoided. He also had a couple of requests for maintainers.
One had to do with the 5.12-rc1 tag; he can remove it from his repository, but that does not change anybody else's repository. So, he said, maintainers should manually remove that tag themselves just to prevent any accidental use of that release.
Beyond that, it is common for subsystem maintainers to base new branches on an -rc release, even -rc1. But any such branches pose a particular hazard for the development and testing community in general, because they will not contain the fix for this problem. The git bisect tool, which is frequently used to find the patch that caused a regression, can land on any location in the development history; branches containing the buggy commit (but not the fix) create extended ranges of history with the potential to destroy filesystems far into the future. This, also, seems "double ungood".
The answer is to not base development branches on 5.12-rc1, but instead to use either an older release or to wait until -rc2. That will not be entirely helpful for subsystem maintainers who have already published branches with an -rc1 base, as some certainly have. They have a choice between keeping the dangerous branch or rebasing it onto a safer branch point, possibly creating difficulties for anybody else who has built on that branch. That sort of rebasing would normally be frowned upon, but Torvalds made it clear that it is permissible — and expected — this time.
The kernel community is mostly treating this as a case of "bugs happen" and moving on. One can indeed argue that the process has worked the way it is supposed to: a catastrophic bug was caught early during the stabilization period and will never make it anywhere near a production system. That said, there may be room for a bit of introspection and thinking about how things could have been done better.
For example, one might argue that the review process could have worked better. The patch in question was only posted once, and was applied to the block repository the next day. The patch in the mainline repository contains two Reviewed-by tags and one Acked-by tag, but the mailing-list thread shows those tags all being offered for a different patch in the series. Perhaps all three reviewers offered their tags off-list but, from the public record, it's not clear how much review the problematic patch actually received.
There is no guarantee that more eyeballs would have found this bug before it was committed, but it seems that those eyeballs were not given much of a chance here. It is tempting to see "cleanup" patches as being inherently safe and needing less review, but those patches can contain subtle bugs as well.
A part of the process that could have worked more quickly was warning the community as a whole of the problem. In the message linked above, Torvalds acknowledged that he should have acted more quickly rather than waiting to be sure that the proposed fix actually made the problem go away. One can only imagine that the warning will go out more quickly the next time.
There is one other little problem here that surprisingly few people are talking about. The health of the mailing-lists hosted on vger.kernel.org has not been optimal for a while; at times, posts to the mailing lists have been delayed for a day or two or dropped altogether. Torvalds's warning was not delayed that badly, but it still took several hours to get to recipients of linux-kernel. Whether the delay in the delivery of that warning caused more testers to be bitten by this bug is not clear, but it certainly did not help.
The good news is that work is currently being done to put the vger lists on a more solid, well-supported footing. With luck, the mailing-list problems will be behind us in the relatively near future.
Even then, though, a warning sent just to linux-kernel has a real chance of disappearing into the noise. The traffic on that list, which routinely exceeds 1,000 messages per day, can drown out important messages and has caused many developers to unsubscribe completely. It may well be that we need better communication channels for this kind of urgent message.
In the end, the actual pain caused by this bug was probably relatively small; there have not been massive numbers of "it ate my filesystem!" messages going around. The process did mostly work the way it was intended to. With luck, lessons learned from this incident can help that process to work even better the next time.
Index entries for this article
Kernel: Releases/5.12
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 17:06 UTC (Mon) by pbonzini (subscriber, #60935) [Link]
The commit message doesn't say why what is essentially

    - bio->bi_iter.bi_sector = map_swap_page(page, &bdev);
    + bio->bi_iter.bi_sector = swap_page_sector(page);

would have been the right replacement. I suppose that's where the bug lies?
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 17:56 UTC (Mon) by farnz (subscriber, #17727) [Link]
The bug is trivially awful (because it's obvious when you see it, but not when you're working on it); for most of the swap code, the swap location for any given page is referenced as an offset from the beginning of the swap file (or device). For a swap partition, this is fine - the offset from the beginning of the device is the correct location in the swap partition, too. For swap files, the correct offset on the device has to account for where on the device the filesystem has placed swap extents.
In the old code, map_swap_page calls map_swap_entry, which accounts for this offset. In the new code, it calls swap_page_sector, which was previously only ever used for swap partitions, and thus didn't account for the filesystem offset. The fix is to change swap_page_sector to account for swap files, and thus to look at the filesystem-provided extent map. That way, the foot gun is removed for everyone.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 18:10 UTC (Mon) by pbonzini (subscriber, #60935) [Link]
And doing that without pointing it out in the commit message is super sloppy. If you're saying that you're Just Inlining Code, you'd better just inline code.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 8:32 UTC (Tue) by epa (subscriber, #39769) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 8:36 UTC (Tue) by epa (subscriber, #39769) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 19:09 UTC (Tue) by NYKevin (subscriber, #129325) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 21:01 UTC (Tue) by error27 (subscriber, #8346) [Link]
https://github.com/error27/rename_rev
It also has a -r NULL mode to review changes like "if (foo != NULL) {" --> "if (foo) {" because sometimes people get those transitions inverted. A bunch of little tricks like that.
I sometimes think like you do about ways to do something similar at the bytecode level or with static analysis but I can't think how that would work...
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 7:23 UTC (Wed) by epa (subscriber, #39769) [Link]
Such a machine-readable record of the refactoring change would also be handy when rebasing. Instead of hitting a merge conflict on the 'renamed variable' commit, you could reapply that change using the refactoring tool. As long as both the old commit and the new one are pure refactorings (logical bytecode doesn't change), this can be done automatically as part of the rebase process, leaving the human programmer to concentrate on the conflicts that aren't trivial.
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 8:10 UTC (Wed) by error27 (subscriber, #8346) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 20, 2021 23:50 UTC (Sat) by Hi-Angel (guest, #110915) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 21, 2021 5:34 UTC (Sun) by pabs (subscriber, #43278) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 17:12 UTC (Mon) by PaulMcKenney (subscriber, #9624) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 17:25 UTC (Mon) by brice (subscriber, #35493) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 17:30 UTC (Mon) by syrjala (subscriber, #47399) [Link]
An excuse I would accept is that appropriate tests were run but they simply did not catch the problem. In which case I would expect to see improvements to the tests (+ making sure someone is actually running them in their CI system) to make sure this same bug doesn't happen ever again.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 19:28 UTC (Mon) by mathstuf (subscriber, #69389) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 20:34 UTC (Mon) by flussence (subscriber, #85566) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 21:56 UTC (Mon) by mathstuf (subscriber, #69389) [Link]
https://patchwork.kernel.org/project/keyrings/patch/13239...
Its state is "New", but it has been dropped by the submitter. How would one change the state from "New" to "Dropped"?
This one is tracked as "New", but has been merged (as noted by the pr-tracker-bot). How did Patchtracker miss this?
https://patchwork.kernel.org/project/keyrings/patch/13228...
And this is for pull requests.
This one is "New", but has a "Reviewed-by" someone who can collect it for the subsystem. Where is it on its way into the appropriate tree(s)?
https://patchwork.kernel.org/project/keyrings/patch/20210...
And that's just the one list I follow loosely. How can one trust this site for actual tracking of arbitrary kernel patches with this quality of tracking? Sure, this list might not *use* patchtracker as much as other lists, but how does one determine *that* bit of information?
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 16:12 UTC (Tue) by andy_shev (subscriber, #75870) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 20:29 UTC (Tue) by marcH (subscriber, #57642) [Link]
patchwork tries to find code submissions not addressed to it directly. All other code review solutions require everything to be sent to them directly, they act as gateways. That's the very simple reason why they have all the information and all work better.
The icing on the cake is of course recipient-controlled email notifications instead of sender-controlled notifications (a.k.a. "spam"). In other words, people routinely subscribe and unsubscribe to/from individual code reviews; not possible with a crude mailing list.
None of this is rocket science, it's all very easy to see and understand except when obsessing about the "email versus web" user interface questions; these debates are not "wrong" but they hide the much more important backend and database design issues.
You could totally have a gateway type code review tool entirely driven by email. In fact requesting all submissions, review comments and approvals or rejections to be sent (by email) to patchwork directly and putting patchwork in better control of the git repo would get at least half-way there.
Linux 5.12's very bad, double ungood day
Posted Mar 11, 2021 16:04 UTC (Thu) by mathstuf (subscriber, #69389) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 20:19 UTC (Mon) by cesarb (subscriber, #6266) [Link]
Even though it's called "swapfile.c", that file has the code for both swap files and swap partitions, including things like the swapon and swapoff system calls. Surprisingly little of that code is specific to either swap files only or swap partitions only, nearly all of it being in the setup and teardown of the swap areas (setting the block size on swapon for partitions, reading the swap extents from the filesystem on swapon for regular files). Search that file for S_ISBLK and S_ISREG to see the few places where the code diverges.
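As a small illustration of that divergence (hypothetical code, not taken from mm/swapfile.c), a user-space program can classify a path the same way swapon(2) does with S_ISBLK()/S_ISREG(); the printed descriptions paraphrase the setup steps mentioned above rather than anything the kernel actually outputs.

```c
/* Hypothetical illustration, not kernel code: classify a path the way
 * mm/swapfile.c distinguishes swap partitions from swap files. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat st;

    if (argc != 2 || stat(argv[1], &st) != 0) {
        fprintf(stderr, "usage: %s /path/to/swap-area\n", argv[0]);
        return 1;
    }

    if (S_ISBLK(st.st_mode))
        puts("block device: swapon sets the block size and uses "
             "device offsets directly");
    else if (S_ISREG(st.st_mode))
        puts("regular file: swapon reads the extent map from the "
             "filesystem and routes swap I/O through it");
    else
        puts("not usable as a swap area");
    return 0;
}
```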
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 23:17 UTC (Mon) by tome (subscriber, #3171) [Link]
Indeed this bug could pass tests on a swap file accidentally by just clobbering unused locations. If so, those tests need more fuzz.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 8:21 UTC (Tue) by matthias (subscriber, #94967) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 8:44 UTC (Tue) by marcH (subscriber, #57642) [Link]
So back to the original question: which "swaptorture" test suite was used to validate this patch and did that test suite include configuration(s) with swap file(s)? The main article wonders "how things could have been done better" and asks some interesting code review questions but does not seem to mention anything like "test gap".
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 16:43 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
- initialize a swapfile
- do things with it
- copy it to a new, fresh FS
- restore from it
That at least gets the "everything was written to the swapfile" guarantee (rather than willy-nilly across the hosting FS). Doesn't guarantee that *extra* writes didn't go out. Some form of writing a given bitpattern to the entire FS and then ensuring that, post-FS init and swap file creation, nothing else changes (modulo mtime/inode updates) might suffice? I have no idea if such a "simple" FS exists though. Or create a single file which takes up the remaining FS space with a given bitpattern and check that it reads properly after using the swapfile. Wrecking any data or metadata would presumably be detectable then, no?
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 1:47 UTC (Wed) by NYKevin (subscriber, #129325) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 19:28 UTC (Mon) by luto (subscriber, #39314) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 20:13 UTC (Mon) by mathstuf (subscriber, #69389) [Link]
Git just generally has a problem with shipping configuration values around. I can't make bullet-proof commands for others to use because I don't even know what remote names are in use. I vastly prefer `origin` for the main repo and `$prefix/$username` for any forks (including mine and the prefix depends on the hosting service). Others prefer `origin` for their fork and `upstream` for the main repository. Useful aliases, hooks, etc. all have non-existent ways of being distributed in git today. We have a script that developers should run to at least match the docs, but that's not generally used.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 20:28 UTC (Mon) by pebolle (subscriber, #35204) [Link]
- restart at v5.11;
- add everything in v5.12-rc1-dontuse except the misguided commit;
- add the misguided commit and its fix squashed as a single commit;
- add everything else that is now in v5.12-rc2 but not in v5.12-rc1-dontuse.
That should make v5.12-rc1-dontuse a dead end which no-one would ever hit while bisecting. (And v5.12-rc2 basically a second try at a v5.12-rc1.) Or doesn't git work like that?
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 21:13 UTC (Mon) by josh (subscriber, #17465) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 21:16 UTC (Mon) by geert (subscriber, #98403) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 8:46 UTC (Tue) by marcH (subscriber, #57642) [Link]
That's the goal.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 10:21 UTC (Tue) by pr1268 (subscriber, #24648) [Link]
Following your approach would rebase the master branch [...], which is something Linus does not want to do ever.
But, in this particular case, he's making an exception. From our editor's article (emphasis mine):
That sort of rebasing would normally be frowned upon, but Torvalds made it clear that it is permissible — and expected — this time.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 10:44 UTC (Tue) by geert (subscriber, #98403) [Link]
If Linus had been comfortable about rebasing his own master branch, he would have done so for sure.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 19:30 UTC (Mon) by amarao (subscriber, #87073) [Link]
Basically, in this story I can't see the finale: a rather trivial test to see that a filesystem with a swap file is not corrupted after swapping in/out to a pagefile on it.
Even without kernel instrumentation it would take about an hour or two to develop.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 21:20 UTC (Mon) by roc (subscriber, #30627) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 21:42 UTC (Mon) by airlied (subscriber, #9104) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 23:18 UTC (Mon) by roc (subscriber, #30627) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 4:26 UTC (Tue) by airlied (subscriber, #9104) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 12:01 UTC (Tue) by roc (subscriber, #30627) [Link]
I am constantly frustrated by the kernel testing culture, or lack of it. rr has approximately zero resources behind it and our automated tests are still responsible for detecting an average of about one kernel regression every release.
Every time I bring this up people have a string of excuses, like how it's hard to test drivers etc. Some of those are fair, but the bug in this article and pretty much all the regressions found by rr aren't in drivers, they're in core kernel code that can easily be tested, even from user space.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 13:27 UTC (Tue) by pizza (subscriber, #46) [Link]
Okay, so... when exactly?
There are 10K commits (give or take a couple thousand) that land in every -rc1. Indeed, until -rc1 lands, nobody can really be sure if a given pull request (or even a specific patch) will get accepted. This is why nearly all upstream tooling treats "-rc1" as the "time to start looking for regressions" inflection point [1], and they spend the next *two months* fixing whatever comes up. This has been the established process for over a decade now.
So what if there was a (nasty) bug that takes down a test rig? That's what the test rigs are for! The only thing unusual about this bug is that it leads to silent corruption, to the point where "testing" in and of itself wasn't enough; the test would have had to be robust enough to ensure nothing unexpected was written anywhere to the disk. That's a deceptively hairy testing scenario, arguably going well beyond the tests folks developing filesystems run.
Note I'm not making excuses here; it is a nasty bug and clearly the tests that its developers ran were insufficient. But it is ridiculous to expect "release-quality" regression testing to be completed at the start of the designated testing period.
[1] Indeed, many regressions are due to the combinations of unrelated changes in a given -rc1; each of those 10K patches is fine in and of itself, but (e.g.) patch #3313 could lead to data loss, but only in combination with a specific kernel option being enabled, and when run on a system containing an old 3Ware RAID controller and a specific motherboard with a PCI-X bridge that can't pass through MSI interrupts due to how it was physically wired up. [2] [3]
[2] It's sitting about four feet away from me as I type this.
[3] Kernel bugzilla #43074
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 14:11 UTC (Tue) by epa (subscriber, #39769) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 15:12 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 15:22 UTC (Tue) by geert (subscriber, #98403) [Link]
$ git tag --contains 48d15436fde6
next-20210128
next-20210129
next-20210201
next-20210202
next-20210203
next-20210204
next-20210205
next-20210208
next-20210209
next-20210210
next-20210211
next-20210212
next-20210215
next-20210216
next-20210217
next-20210218
next-20210219
next-20210222
next-20210223
next-20210224
next-20210225
next-20210226
next-20210301
next-20210302
next-20210303
next-20210304
next-20210305
next-20210309
v5.12-rc1
v5.12-rc1-dontuse
v5.12-rc2
Three weeks passed between the buggy commit entering linux-next and upstream.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 15:35 UTC (Tue) by pizza (subscriber, #46) [Link]
> Three weeks passed between the buggy commit entering linux-next and upstream.
So the "problem" here isn't that nothing was being tested, it's just that none of the tests run during this interval window caught this particular issue. It's also not clear that there was even a test out there that could have caught this, except by pure happenstance.
But that's the reality of software work; a bug turns up, write a test to catch it (and hopefully others of the same class), add it to the test suite (which runs as often as your available resources allow) .... and repeat endlessly.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 15:25 UTC (Tue) by pizza (subscriber, #46) [Link]
Not that it will stop folks complaining when "5.32-alpha0-rc4-pre3" fails to boot on their production system, obviously because it should have been tested first, and we need a pre-pre-pre-pre-pre release snapshot to start testing against.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 15:26 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 17:47 UTC (Tue) by Wol (subscriber, #4433) [Link]
Horse to water and all that ...
Cheers,
Wol
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 19:55 UTC (Tue) by mathstuf (subscriber, #69389) [Link]
But this kind of one-off code is annoying to test itself and someone will script adding it to their boot command lines anyways.
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 21:58 UTC (Wed) by sjj (subscriber, #2020) [Link]
I don’t think I’ve built a kernel in 10 years, or maybe that one time 7-8 years ago.
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 22:22 UTC (Wed) by roc (subscriber, #30627) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 11, 2021 8:43 UTC (Thu) by pbonzini (subscriber, #60935) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 23:20 UTC (Wed) by Wol (subscriber, #4433) [Link]
You clearly don't run gentoo :-)
Cheers,
Wol
Linux 5.12's very bad, double ungood day
Posted May 2, 2021 2:58 UTC (Sun) by pizza (subscriber, #46) [Link]
I saw this scroll by when I upgraded this system to Fedora 34:
$ rpm -q icedtea-web
icedtea-web-2.0.0-pre.0.3.alpha16.patched1.fc34.3.x86_64
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 22:44 UTC (Tue) by dbnichol (subscriber, #39622) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 2:43 UTC (Wed) by roc (subscriber, #30627) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 10, 2021 2:34 UTC (Wed) by roc (subscriber, #30627) [Link]
In practice, large projects often try to maximise bang-for-the-buck by dividing tests into tiers, e.g. tier 1 tests run on every push, tier 2 every day, maybe a tier 3 that runs less often. Many projects use heuristics or machine learning to choose which tests to run in each run of tier 1.
Yes, I understand that it's difficult to thoroughly test weird hardware and configuration combinations. Ideally organizations that produce hardware with Linux support would contribute testing on that hardware. But even if we ignore all those bugs, there are still lots of core kernel bugs not being caught by kernel CI.
Linux 5.12's very bad, double ungood day
Posted Mar 8, 2021 23:33 UTC (Mon) by mss (subscriber, #138799) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 14:29 UTC (Tue) by daenzer (subscriber, #7050) [Link]
E.g. it shouldn't be hard to hook up tests to a GitLab CI pipeline which can catch this bug (and more), and only allow merging changes which pass the tests.
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 14:54 UTC (Tue) by pbonzini (subscriber, #60935) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 16:12 UTC (Tue) by pabs (subscriber, #43278) [Link]
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 17:32 UTC (Tue) by daenzer (subscriber, #7050) [Link]
And there are CI pipelines on https://gitlab.freedesktop.org/ running VMs via KVM, which should allow booting kernels built as part of the pipeline as well. (Another possibility is having dedicated test machines which can be powered on/off remotely; Mesa is using a number of those for testing GPU drivers)
Linux 5.12's very bad, double ungood day
Posted Mar 9, 2021 21:11 UTC (Tue) by marcH (subscriber, #57642) [Link]
I think the gap in the article hides a much more worrying gap. There's a long and interesting discussion in the middle of the comments about test setups and workflows with not a single link to .... test code. Is there any?
Automated tests
Posted Mar 9, 2021 21:33 UTC (Tue) by corbet (editor, #1) [Link]
One could certainly write an automated test to catch this, but it would not be easy. You'd need to set up a machine with a swap file, drive it into memory pressure with a lot of dirty anonymous pages, then somehow verify that none of the swap traffic went astray. That means comparing the entire block device (including free space) with what you expect it to be, or mapping out the swap file, picking the swap traffic out of a blktrace stream, and ensuring that each page goes to the right place. Certainly doable, but this would not be a fast-running test.
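As a sketch of the "mapping out the swap file" half of such a test: the FIEMAP ioctl will report where the filesystem actually placed the file's blocks, and sectors seen in a blktrace stream of swap writes could then be checked against those ranges. The program below only prints the extent map; the blktrace-matching side is left out, the 256-extent cap is an arbitrary simplification, and error handling is minimal.

```c
/* Print the physical extents of a (swap) file via FIEMAP. A verification
 * tool could check that every swap write seen by blktrace falls inside
 * one of these ranges. Minimal error handling; caps at 256 extents. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s /path/to/swapfile\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    size_t sz = sizeof(struct fiemap) + 256 * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    if (!fm) { perror("calloc"); return 1; }
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;          /* map the whole file */
    fm->fm_extent_count = 256;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("FIEMAP"); return 1; }

    for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
        printf("logical %llu -> physical %llu, length %llu\n",
               (unsigned long long)fm->fm_extents[i].fe_logical,
               (unsigned long long)fm->fm_extents[i].fe_physical,
               (unsigned long long)fm->fm_extents[i].fe_length);

    free(fm);
    close(fd);
    return 0;
}
```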
Automated tests
Posted Mar 10, 2021 2:49 UTC (Wed) by roc (subscriber, #30627) [Link]
You'd have to write block device verification code to check the free space and the contents of all files, but that code could be useful for detecting all kinds of bugs.
One thing about automated testing is that once you bite the bullet and start creating infrastructure for things that look hard to test, you make it easier to test all kinds of things and people are much more willing to write tests for all kinds of things as part of their normal development. So the sooner you create such infrastructure, the better.
Automated tests
Posted Mar 10, 2021 2:52 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]
I was one of people responsible for getting EC2 instances to hibernate. We used files for hibernation and actually found that the kernel had been broken for YEARS with file hibernation (it required a reboot for the hibernation target setting to take effect).
We also had a test for this very issue. It created an EC2 instance with a small disk (~2Gb) and limited RAM (512Mb). The test program created a swap file and then filled the disk to capacity with pseudo-random numbers (by creating a file and writing to it). It then allocated enough pseudo-random data to swap out at least some of it.
Then hibernate, thaw, and checksum the disk and the data in RAM to check for corruption.
The tests ran in about 2 minutes.
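A minimal user-space sketch of that kind of check might look like the following. It assumes a swap file has already been created and enabled (mkswap + swapon) on the filesystem under test; the path and the sizes are arbitrary placeholders. It writes a "canary" file with a reproducible pattern, dirties and verifies more anonymous memory than the machine has RAM to force swapping, then re-reads the canary; a real test would also drop the page cache (or use O_DIRECT) so the re-read actually hits the disk.

```c
/* Sketch of a swap-corruption canary test. Assumes a swap file is already
 * active on the filesystem holding CANARY_PATH; path and sizes are
 * arbitrary. A real test should drop caches before the verify pass. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CANARY_PATH  "/mnt/test/canary"
#define CANARY_BYTES (256UL << 20)   /* 256 MiB canary file */
#define ANON_BYTES   (1UL << 30)     /* pick something larger than RAM */
#define CHUNK        (1UL << 20)

static void fill_pattern(unsigned char *buf, size_t len, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)rand();
}

int main(void)
{
    static unsigned char chunk[CHUNK], expect[CHUNK];
    FILE *f;

    /* 1. Write the canary file with a seeded, reproducible pattern. */
    f = fopen(CANARY_PATH, "wb");
    if (!f) { perror("fopen"); return 1; }
    for (size_t off = 0; off < CANARY_BYTES; off += CHUNK) {
        fill_pattern(chunk, CHUNK, (unsigned)(off / CHUNK));
        if (fwrite(chunk, 1, CHUNK, f) != CHUNK) { perror("fwrite"); return 1; }
    }
    fclose(f);

    /* 2. Dirty more anonymous memory than the machine has RAM so some of
     *    it gets swapped out, then verify it (forcing it back in). */
    unsigned char *anon = malloc(ANON_BYTES);
    if (!anon) { perror("malloc"); return 1; }
    for (size_t i = 0; i < ANON_BYTES; i++)
        anon[i] = (unsigned char)(i * 2654435761UL);
    unsigned long bad = 0;
    for (size_t i = 0; i < ANON_BYTES; i++)
        if (anon[i] != (unsigned char)(i * 2654435761UL))
            bad++;
    if (bad) { fprintf(stderr, "%lu corrupted anonymous bytes\n", bad); return 1; }
    free(anon);

    /* 3. Re-read the canary file; any mismatch means swap I/O went where
     *    it should not have. */
    f = fopen(CANARY_PATH, "rb");
    if (!f) { perror("fopen"); return 1; }
    for (size_t off = 0; off < CANARY_BYTES; off += CHUNK) {
        fill_pattern(expect, CHUNK, (unsigned)(off / CHUNK));
        if (fread(chunk, 1, CHUNK, f) != CHUNK || memcmp(chunk, expect, CHUNK)) {
            fprintf(stderr, "corruption near offset %zu\n", off);
            return 1;
        }
    }
    fclose(f);
    puts("canary file intact");
    return 0;
}
```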
Automated tests
Posted Mar 10, 2021 22:12 UTC (Wed) by sjj (subscriber, #2020) [Link]
Curious what the use case for it in AWS is.
Hibernation
Posted Mar 10, 2021 22:18 UTC (Wed) by corbet (editor, #1) [Link]
That work, and the use case behind it, were discussed in this OSPM article from last May.
Automated tests
Posted Mar 10, 2021 6:42 UTC (Wed) by marcH (subscriber, #57642) [Link]
As this is an apparently common configuration, you'd expect anyone modifying swap code to grant that configuration some reasonable amount of test time.
> drive it into memory pressure with a lot of dirty anonymous pages,
Also called "swapping"?
> then somehow verify that none of the swap traffic went astray.
Unless you're re-installing your entire system every few tests, there's a good chance you will soon notice something somewhere has gone terribly wrong even when not verifying every byte on the disk. This is apparently how the bug was found and relatively quickly by people not even testing swap but other things. The perfect that does not get done is the enemy of the good that does and this is especially true with chronically underdeveloped validation.
Automated tests
Posted Mar 11, 2021 19:56 UTC (Thu) by thumperward (guest, #34368) [Link]