gh-102494: fix MemoryError when using selectors on Solaris #102495

Open
kulikjak wants to merge 2 commits into main

Conversation

kulikjak
Contributor

kulikjak commented Mar 7, 2023

When you set ulimit -n unlimited on Solaris and its derivatives (where /dev/poll is available) and import selectors, Python fails with a MemoryError because there is no upper limit on the allocation size.

This fix adds an arbitrary limit of 2^18 entries (roughly 4 MB of memory).
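Roughly, the idea is something like the following sketch (simplified, not the exact patch; the helper name is made up, and it assumes the devpoll object sizes its pollfd buffer from the RLIMIT_NOFILE soft limit):

```c
/* Simplified sketch of the idea (not the exact patch): size the pollfd
 * buffer from the RLIMIT_NOFILE soft limit, but never above a fixed cap,
 * so "ulimit -n unlimited" can no longer request an unbounded allocation. */
#include <stdlib.h>
#include <sys/resource.h>
#include <poll.h>

#define DEVPOLL_MAX_NFDS (1 << 18)      /* arbitrary cap of 2^18 entries */

static struct pollfd *alloc_devpoll_buffer(size_t *nfds)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_NOFILE, &limit) == -1)
        return NULL;

    rlim_t n = limit.rlim_cur;
    /* RLIM_INFINITY (or any huge soft limit) would otherwise make the
     * allocation below fail and surface in Python as MemoryError. */
    if (n == RLIM_INFINITY || n > DEVPOLL_MAX_NFDS)
        n = DEVPOLL_MAX_NFDS;

    *nfds = (size_t)n;
    return malloc(*nfds * sizeof(struct pollfd));
}
```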

Fixes #102494

arhadthedev added the stdlib (Python modules in the Lib dir) label on Apr 2, 2023
@arhadthedev
Member

@jcea (as a Solaris expert)

@jcea
Member

jcea commented Apr 3, 2023

For reference, documentation at https://docs.oracle.com/cd/E19253-01/816-5177/6mbbc4g9n/index.html for example.

Notice that the current array plays two roles:

  1. Provide registration and removal of file descriptor information to the kernel. These operations don't require a huge array; you just flush descriptor information when the buffer is full or right before the "poll" operation. This is already implemented in the "devpoll_flush()" function.

  2. Get the result information from the "poll" operation. This requires being able to receive, in the worst case, all the registered file descriptors. Solaris will give back information for up to "n" file descriptors if not enough space is available, but I don't know what would happen if those fds become active again. Maybe other file descriptors would starve, begging for service but never receiving it, because other active fds keep being notified and there is not enough space to receive all pending notifications at the same time (*).

Instead of allocating a huge array and clamping it to a maximum size, I would suggest keeping a count of active file descriptors (registrations minus unregistrations) and resizing the array as needed (maybe only growing it when needed, never shrinking it).

I would suggest an initial size of 1024, growing by 25% when needed.

Shrinking would be nice too, but probably unimportant.
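Something along these lines, just to illustrate the grow-only policy (the struct and function names are made up for the example, not taken from selectmodule.c):

```c
/* Illustrative sketch of the grow-only policy: track the number of
 * registered fds and grow the result buffer by 25% whenever the count
 * outgrows it. Names are invented for the example. */
#include <stdlib.h>
#include <poll.h>

struct devpoll_buf {
    struct pollfd *fds;
    size_t capacity;     /* current buffer size, initially 1024 */
    size_t registered;   /* registrations minus unregistrations */
};

/* Ensure the result buffer can hold every registered fd. */
static int devpoll_buf_reserve(struct devpoll_buf *buf)
{
    size_t want = buf->capacity ? buf->capacity : 1024;

    while (want < buf->registered)
        want += want / 4;                 /* grow by 25% until it fits */

    if (want != buf->capacity) {
        struct pollfd *fds = realloc(buf->fds, want * sizeof(*fds));
        if (fds == NULL)
            return -1;                    /* caller raises MemoryError */
        buf->fds = fds;
        buf->capacity = want;
    }
    return 0;
}
```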

(*) It would be quite interesting to investigate the kernel implementation. I guess that playing with a handful of fds would be enough to determine whether the kernel gives back the active fds in a round-robin, random, or "start from the beginning until you fill the buffer" way. Some comments at https://github.com/illumos/illumos-gate/blob/2c76d75129011c98e79463bb84917b828f922a11/usr/src/uts/common/io/devpoll.c#L237 suggest that the Solaris kernel gives back the active descriptors in a round-robin way, so a small buffer (1024 entries?) would actually be fine. See also the code around lines 293 and 629. This assumption needs to be tested, but the intention seems quite clear.

If this assumption holds true and you want a highly concurrent/performant implementation, you could resize the array whenever the returned list of active file descriptors fills the whole array, signaling that a bigger array could reduce syscall traffic, although if you are dealing with Python, this kind of optimization is probably overkill.
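Purely as an illustration of that trigger, using the raw /dev/poll DP_POLL ioctl (this is not the module's code):

```c
/* Purely illustrative: if DP_POLL fills the whole buffer, a bigger buffer
 * might drain more active fds per syscall, so tell the caller to grow. */
#include <sys/devpoll.h>
#include <sys/ioctl.h>
#include <poll.h>

static int poll_and_check_full(int dp_fd, struct pollfd *fds,
                               int capacity, int timeout_ms, int *was_full)
{
    struct dvpoll dvp;

    dvp.dp_fds = fds;
    dvp.dp_nfds = capacity;
    dvp.dp_timeout = timeout_ms;

    int nfds = ioctl(dp_fd, DP_POLL, &dvp);
    /* every slot used: growing (e.g. by 25%) before the next poll might
     * reduce the number of syscalls needed to service all active fds */
    *was_full = (nfds == capacity);
    return nfds;
}
```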

PS: I would support a "port" interface, although Solaris is not a tier-1 platform for Python nowadays.

@kulikjak
Contributor Author

kulikjak commented May 9, 2023

Thank you for the detailed notes; I will look into it.

I've also recently hit another issue with the Devpoll selector when working on Cheroot (cherrypy/cheroot#561), so it needs some love.

@serhiy-storchaka
Member

Any progress? I support @jcea's idea.

@kulikjak
Contributor Author

Uh, I am sorry - this one's gone missing from my todo list...

Thanks for the review and extensive notes. Originally, I was under the assumption (I think - it's been quite some time since I filed this) that we needed an array as big as the number of watched descriptors, but we indeed don't.

I played with devpoll a little, and the active file descriptors are indeed returned in a round-robin way (at least on Oracle Solaris), so we don't need to worry about starvation.

I wonder whether we even need to resize the array. The performance would probably be slightly better in cases where devpoll.poll often hits the upper limit, but then we would likely need shrinking as well, and we probably don't want to expand or shrink based on a single devpoll.poll call, so we would need to watch the number of descriptors returned by the last n polls and decide to expand or shrink based on some average. That said, it can certainly be done.
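A rough sketch of that bookkeeping, with hypothetical names, just to illustrate the averaging idea:

```c
/* Hypothetical sketch: remember how many fds the last N polls returned and
 * use the average to decide whether the buffer should grow or shrink. */
#define POLL_HISTORY 8

struct poll_stats {
    int returned[POLL_HISTORY];   /* fds returned by the last N polls */
    int next;                     /* ring-buffer write index */
};

static void record_poll_result(struct poll_stats *st, int nfds)
{
    st->returned[st->next] = nfds;
    st->next = (st->next + 1) % POLL_HISTORY;
}

static int average_returned(const struct poll_stats *st)
{
    int sum = 0;
    for (int i = 0; i < POLL_HISTORY; i++)
        sum += st->returned[i];
    return sum / POLL_HISTORY;
}
/* e.g. grow when the average approaches the capacity, shrink when it stays
 * well below it for a while */
```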

We can also provide a new method to resize the buffer - most people would likely be happy with 1024, and those with a huge number of expected active descriptors can control the size of the returned tuple themselves.

@kulikjak
Contributor Author

And as for the port interface, I'll keep that in mind. It certainly doesn't have the highest priority, and thus I don't know when I'll have some time to look into that, but it would be a nice addition. Thanks!
