
Improving memory-management documentation

By Jonathan Corbet
May 10, 2022
LSFMM
Like much of the kernel, the memory-management subsystem is under-documented, and much of the documentation that does exist is less than fully current. At the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM), Mike Rapoport ran a session on memory-management documentation and what can be done to improve it. The result was a reinvigorated interest in documentation, but only time will tell what actual improvements will come from that interest.

Rapoport started by noting that, a couple of years ago, he took a hard look at the current state of memory-management documentation. What he found was summarized as "Mel's book and some text files". The book in question, Mel Gorman's Understanding the Linux Virtual Memory Manager, is reminiscent of many old Unix books: the basic concepts still apply to a great extent, but the details are all out of date and, thus, wrong. There are many important memory-management features, such as transparent huge pages, that are not mentioned at all.

With regard to the text files in the kernel's documentation directory, Rapoport made the effort to convert them to reStructuredText and integrate them into the kernel's documentation system, adding a bit of much-needed organization in the process. He added some coverage of internal APIs, but a lot is still in need of improvement. So, he asked, what can be done to improve the documentation and encourage the writing of more of it?

One idea, he continued, was for reviewers to make a point of reviewing the associated documentation when looking at memory-management patches. He has been making an effort in that direction, but has not seen other reviewers following suit. Matthew Wilcox jumped in to note that the maple tree patches are well documented. Rapoport agreed, but said that doesn't change the fact that there is a lot of "tribal wisdom" in the memory-management community that does not exist in written form.

Another developer noted that the documentation can be found in two distinct places: in the code, and under the kernel's documentation directory (Documentation/). The latter documentation, he said, is not as good as it could be. There are some sections written in a clear narrative style, but they are mixed in with "noise and horrible stuff". The rendered documentation, which incorporates kerneldoc comments from the code into the separate documents, can jumble everything together and can be hard to work with. Andrew Morton said that Documentation/ is good for user-facing material, but otherwise the right place for documentation is in the code itself.

As the maintainer for the documentation directory, I felt the need to jump in at this point; I had to disagree with Morton's assertion that separate documentation is only good for end users. There is a lot of information that is relevant to developers, but which doesn't fit readily into kerneldoc comments, and it is hard to tell a coherent story in the code that way. The idea that comments in the code will be better maintained than separate documentation is a poor match to reality at best.

With regard to organization, it is possible to put introductory and contextual information into kerneldoc comments and produce a coherent manual from them, but extra effort must be made toward that end, and the end result only appears in the documents after being rendered by the build system — not in the code. The DRM documentation is a good example of what can be done when developers put effort into it.
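As a rough illustration of the two kinds of kerneldoc comment at issue here, the sketch below shows a free-form `DOC:` block carrying introductory text alongside an ordinary function comment. The helper `toy_pages_needed()` and the section title are hypothetical, invented only to show the comment layout; this is user-space C, not code from the kernel tree.

```c
#include <assert.h>
#include <stddef.h>

/**
 * DOC: Page-count helpers
 *
 * Introductory and contextual text can live in a free-form block like this
 * one. It does not document any single function; a manual has to pull it in
 * explicitly for it to appear in the rendered output.
 */

/**
 * toy_pages_needed() - Count the pages required to hold a buffer.
 * @bytes: Size of the buffer in bytes.
 * @page_size: Size of one page in bytes; must be nonzero.
 *
 * Hypothetical helper used only to illustrate the comment format. A partly
 * filled page counts as a whole page.
 *
 * Return: The number of pages needed, or 0 if @bytes is 0.
 */
size_t toy_pages_needed(size_t bytes, size_t page_size)
{
    assert(page_size != 0);
    /* Divide first, then round up, so the arithmetic cannot overflow. */
    return bytes / page_size + (bytes % page_size != 0);
}
```

The function comment is visible to anybody reading the source, while the ordering of `DOC:` sections into a narrative is decided by whichever reStructuredText file includes them, which is why the coherent result is seen only in the rendered documents.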

That said, organization has been an issue all along; when I became the documentation maintainer, the kernel's documentation directory was a seemingly random collection of independent files. Over the years, developers working on the documentation have been trying to organize that material with a focus on who the readers are; thus the Core API manual, the User-space API guide, the Maintainer handbook, and several others. The net effect has been to create a set of smaller piles of unorganized and often outdated material, but it's a start. Even so, people rarely find time to improve those manuals or to turn each into a coherent document rather than a collection of related files.

Wilcox mentioned Neil Brown's readahead documentation as an example of another type of problem. The new documentation is "90% right"; Wilcox should have reviewed it but was unable to find the time. Brown, he said, did not use the documentation that was already present in the code when doing his work, and that is frustrating.

A recurring theme was that there are not enough people with the time and expertise to work on documentation; developers were encouraged to lobby their employers to support that work. Michal Hocko said that you can't bring in a "random tech writer" to work on memory-management documentation, though; a lot of knowledge is needed to write useful documentation, so experienced people need to write it. Brown's approach was excellent, since he is an expert user of the interface and can see it through those eyes. Meanwhile, Hocko said, he generally avoids looking through the code when in search of documentation and digs through the LWN archives instead.

I agreed with Hocko but had to add that writing documentation is a good way to gain the needed expertise. I learned much of what I know when working on Linux Device Drivers; it's fair to say that I was not well qualified when I began that project.

Davidlohr Bueso claimed that the best document in the kernel, the one that others should emulate, is the infamous memory-barriers.txt. It is written by developers with a high level of expertise, is clear, and is actively maintained; even a newcomer can get something out of it. Johannes Weiner said that one of the strengths of memory-barriers.txt is that the document has had an excellent structure from the beginning; that made it relatively easy for others to come along and add to it. The memory-management subsystem needs somebody to come along and create a similar sort of documentation structure.

Dan Williams asked what the near-term focus for memory-management documentation should be; did Rapoport have specific APIs in mind? Rapoport answered that his goal was to make the documentation better in general so that others could understand how Linux memory management works. Williams said that was "a good mission statement", but he was looking for actionable tasks. Rapoport suggested speculative page faults or the multi-generational LRU (both of which are still out of tree) as examples.

Kent Overstreet said that developers are not bringing up documentation during code review, and that the subsystem does not have a person who has a coherent view of what the documentation should look like. Liam Howlett said that, as a new memory-management developer, he has encountered many functions that he did not understand. When he changes code, he tries to improve its documentation. He mentioned find_vma() specifically as a function whose behavior doesn't really match its name or documentation.
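To make the find_vma() complaint concrete: the kernel function returns the first VMA whose end lies above the given address, so the result need not contain that address at all. The sketch below is a toy user-space model of that lookup rule, not kernel code; `struct toy_vma`, `toy_find_vma()`, and `toy_vma_containing()` are hypothetical names, and a sorted array stands in for the kernel's real data structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct toy_vma {
    unsigned long start;
    unsigned long end;
};

/*
 * Toy model of the find_vma() lookup rule: return the first area with
 * addr < end, or NULL if there is none. The array must be sorted by
 * address and non-overlapping. The result may begin above addr.
 */
const struct toy_vma *toy_find_vma(const struct toy_vma *vmas,
                                   size_t n, unsigned long addr)
{
    assert(vmas != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (addr < vmas[i].end)
            return &vmas[i];
    }
    return NULL;
}

/* What the name might suggest: only return an area that contains addr. */
const struct toy_vma *toy_vma_containing(const struct toy_vma *vmas,
                                         size_t n, unsigned long addr)
{
    const struct toy_vma *vma = toy_find_vma(vmas, n, addr);

    if (vma != NULL && vma->start <= addr)
        return vma;
    return NULL;
}
```

With areas at [0x1000, 0x2000) and [0x5000, 0x6000), a lookup of 0x3000 through `toy_find_vma()` yields the second area even though the address falls in the hole between them; a caller who assumes the result contains the address has to add the `start` check shown in `toy_vma_containing()`.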

David Hildenbrand asked what documents were wanted for memory management in general. Rapoport answered that he would like to see more material in the admin guide first, preferably a high-level overview of how it all works. Improving the kerneldoc comments is rather lower on his list, but it is also easier to do. There was some discussion around whether there was a greater need for internal or user-oriented documentation; it was suggested that perhaps developers over-document some internal APIs, causing users to use them when they really should not. find_vma() was mentioned again as an example of this sort of problem.

At the conclusion of the session, Wilcox suggested that a good first step would be to create a new memory-management document using Gorman's book as a guide, and volunteered to take a stab at it. That book had a structure that clearly worked; starting with that would solve the organizational problem and make it easy for developers to improve things. A reStructuredText file could be created along those lines, and the existing documentation could be slotted into it as appropriate. There was general agreement that this was a good thing to do, no doubt helped by the existence of a developer who was willing to take the initial steps. Wilcox has since posted an initial version of the new documentation structure for review.

Index entries for this article
Kernel: Documentation
Kernel: Memory management/Documentation
Conference: Storage Filesystem & Memory Management/2022



Improving memory-management documentation

Posted May 10, 2022 14:17 UTC (Tue) by willy (subscriber, #9762)

Minor correction; I was copied on Neil's readahead documentation rewrite, I just didn't find time to review it before it went in.

The pre-existing documentation can be found between get_next_ra_size() and count_history_pages(). Obviously...

Improving memory-management documentation

Posted May 10, 2022 20:21 UTC (Tue) by dcg (subscriber, #9198)

As a user who sometimes wants to have a very high-level view of how MM things work, my go-to source for documentation is LWN articles...

Improving memory-management documentation

Posted May 11, 2022 4:17 UTC (Wed) by unixbhaskar (guest, #44758)

....and you are not alone! Although, I am late to the party ... it seems worth the time to read some articles here.

Improving memory-management documentation

Posted May 11, 2022 17:27 UTC (Wed) by jezuch (subscriber, #52988)

I wonder... Do the LWN articles that we all love and enjoy actually end up as official documentation?

Official documentation

Posted May 11, 2022 17:57 UTC (Wed) by corbet (editor, #1)

Occasionally, yes; this article is the most recent example.

Improving memory-management documentation

Posted May 12, 2022 15:53 UTC (Thu) by tsr2 (subscriber, #4293)

To get documentation correct may require a high level of expertise, but writing it from that perspective tends to result in documentation that also requires a high level of expertise to follow. To make it comprehensible to those with less expertise, perhaps it should be written by those with lesser expertise and reviewed for correctness/completeness by the experts?

Improving memory-management documentation

Posted May 13, 2022 11:24 UTC (Fri) by farnz (subscriber, #17727)

What it really needs is someone whose job is technical writing to talk to the experts and document it. There's a whole set of skills around writing good documentation, and getting that expertise involved is a lot more helpful than merely going for people with less expertise.

Good tech writers are worth their weight in gold.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds