gh-99631: Add custom loads and dumps support for the shelve module #118065

Merged
merged 75 commits into from
Jul 12, 2025
e269c09
Use custom loads & dumps instead of custom pickler & unpickler for Shelf
furkanonder Apr 18, 2024
3465149
Allow custom loads & dumps instead of custom pickler & unpickler for …
furkanonder Apr 18, 2024
44b9fa1
Update documentation for serializer and deserializred functions
furkanonder Apr 18, 2024
f2eed32
Update Doc/library/shelve.rst
furkanonder Apr 20, 2024
b3e5723
Update Doc/library/shelve.rst
furkanonder Apr 20, 2024
53d5557
Update documentation for serializer and deserializer functions
furkanonder Apr 21, 2024
d496eab
Merge branch 'main' into issue-99631-2
furkanonder Apr 21, 2024
1e295ba
Fix lines according to PEP-8
furkanonder Apr 21, 2024
2011baa
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Apr 21, 2024
3cbabe9
Fix doc according to line 80
furkanonder Apr 21, 2024
67b7340
Merge branch 'main' into issue-99631-2
furkanonder Apr 22, 2024
c6b43e2
Fix inline emphasis issue in docs
furkanonder Apr 23, 2024
52a90f7
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Apr 23, 2024
4f79cf6
Update the definition of the open function.
furkanonder Jul 13, 2024
798fdb2
Pass the serializer and serializer arguments of Shelf.__init__ of Bsd…
furkanonder Jul 14, 2024
bb1150d
Add unittests for BsdDbShelf
furkanonder Jul 14, 2024
98d841b
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 14, 2024
1159bb6
Update BsdDbShelf's set_location, last and first functions
furkanonder Jul 14, 2024
4b4f1b6
Update BsdDbShelf's next and previous functions
furkanonder Jul 14, 2024
bc399fa
Merge branch 'main' into issue-99631-2
furkanonder Jul 15, 2024
41448d3
Refer to shelve.open function for the deserializer and serializer arg…
furkanonder Jul 15, 2024
fbbe5ea
Refer to shelve.open function for the deserializer and serializer arg…
furkanonder Jul 15, 2024
fdd3e8e
Merge branch 'main' into issue-99631-2
furkanonder Jul 15, 2024
6823ef2
📜🤖 Added by blurb_it.
blurb-it[bot] Jul 16, 2024
2affece
Update the versionchanged statements
furkanonder Jul 16, 2024
da8bc91
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 16, 2024
6bfebee
change type of num2
furkanonder Jul 16, 2024
82d58a7
Add test_custom_incomplete_serializer_and_deserializer case
furkanonder Jul 17, 2024
7dca8b4
Merge branch 'main' into issue-99631-2
furkanonder Jul 17, 2024
5f97676
Specify that the Shelf, DbfilenameShelf and BsdDbShelf class's takes …
furkanonder Jul 24, 2024
048daee
And and update the versionchanged's text
furkanonder Jul 24, 2024
00837d0
Update the news entry
furkanonder Jul 24, 2024
1292963
Update the versionchanged's text
furkanonder Jul 24, 2024
3431920
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 24, 2024
97a6d7c
Add new testcases to other bytes objects
furkanonder Jul 24, 2024
3becbc8
Add new testcases to test custom serializer protocl
furkanonder Jul 24, 2024
3a5d6ed
Add new testcases to other bytes objects
furkanonder Jul 24, 2024
fb74832
Delete comma from document
furkanonder Jul 24, 2024
d670c95
Update the description of open function
furkanonder Jul 25, 2024
f2e22eb
sort the imports
furkanonder Jul 27, 2024
e00a52f
add white space
furkanonder Jul 27, 2024
3af3f97
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
9d232e5
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
26fc959
Don't use f-string in type(obj).__name__
furkanonder Jul 27, 2024
ab005aa
Set shelve class argument only serializer and deserializer
furkanonder Jul 27, 2024
6052309
Update shelveError message
furkanonder Jul 27, 2024
87b66d5
pass serializer and deserializer as keyword argument to DbfilenameShelf
furkanonder Jul 27, 2024
0c2f255
Remove unused import
furkanonder Jul 27, 2024
3db0c8e
Update shelve testcases
furkanonder Jul 27, 2024
5c39d94
Remove memoryview testcases
furkanonder Jul 28, 2024
1ca1801
Add ShelveError to shelve's __all__
furkanonder Jul 28, 2024
5a42de1
Add ShelveError to shelve documentation
furkanonder Jul 28, 2024
b0a5ee3
Add blank lines after versionadded and versionchanged
furkanonder Jul 29, 2024
4202ede
Remove white space in test_shelve
furkanonder Jul 29, 2024
2827eb4
Add test_custom_incomplete_serializer_and_deserializer_bsd_db_shelf
furkanonder Jul 29, 2024
9918531
Merge branch 'issue-99631-2' of github.com:furkanonder/cpython into i…
furkanonder Jul 29, 2024
54188bd
Update the serializer and deserializer functions
furkanonder Jul 29, 2024
786a248
Move os.mkdir and addCleanup functions beginning of the testcases
furkanonder Jul 29, 2024
20c2450
Use self.assertIsNone when checking None types
furkanonder Jul 29, 2024
b3770ae
change the test order
furkanonder Jul 29, 2024
588623a
Merge branch 'main' into issue-99631-2
furkanonder Apr 19, 2025
4d9599b
Update shelve module version references from 3.14 to 3.15
furkanonder May 30, 2025
9b204b7
Merge branch 'main' into issue-99631-2
furkanonder May 30, 2025
8b06918
Change shelve module version references from 3.15 to next
furkanonder May 30, 2025
b0f0bbc
Change shelve module version references from 3.15 to next
furkanonder May 30, 2025
d1bb227
refactor nested context managers for better readability
furkanonder May 30, 2025
791743b
simplify assertRaises calls in test_missing_custom_deserializer & tes…
furkanonder May 30, 2025
6b4be8b
refactor nested context managers for better readability
furkanonder May 30, 2025
34a32b9
Add type_name_len helper and use shorter variable names to reduce lin…
furkanonder May 30, 2025
bf6f3aa
Merge branch 'main' into issue-99631-2
furkanonder May 30, 2025
4b000cd
Improve the description of the open function
furkanonder Jun 2, 2025
2dcda2a
Update the description of ShelveError
furkanonder Jun 2, 2025
00bfb01
Simplify conditional branches in serializer and deserializer functions
furkanonder Jun 2, 2025
23ea842
Merge branch 'main' into issue-99631-2
furkanonder Jun 2, 2025
4df9b58
Merge branch 'main' into issue-99631-2
furkanonder Jun 11, 2025
70 changes: 61 additions & 9 deletions Doc/library/shelve.rst
@@ -17,7 +17,8 @@ This includes most class instances, recursive data types, and objects containing
 lots of shared sub-objects. The keys are ordinary strings.


-.. function:: open(filename, flag='c', protocol=None, writeback=False)
+.. function:: open(filename, flag='c', protocol=None, writeback=False, *, \
+                   serializer=None, deserializer=None)

    Open a persistent dictionary. The filename specified is the base filename for
    the underlying database. As a side-effect, an extension may be added to the
@@ -41,13 +42,32 @@ lots of shared sub-objects. The keys are ordinary strings.
    determine which accessed entries are mutable, nor which ones were actually
    mutated).

+   By default, :mod:`shelve` uses :func:`pickle.dumps` and :func:`pickle.loads`
+   for serializing and deserializing. This can be changed by supplying
+   *serializer* and *deserializer*, respectively.
+
+   The *serializer* argument must be a callable which takes an object ``obj``
+   and the *protocol* as inputs and returns the representation of ``obj`` as a
+   :term:`bytes-like object`; the *protocol* value may be ignored by the
+   serializer.
+
+   The *deserializer* argument must be a callable which takes a serialized object
+   given as a :class:`bytes` object and returns the corresponding object.
+
+   A :exc:`ShelveError` is raised if *serializer* is given but *deserializer*
+   is not, or vice versa.
+
    .. versionchanged:: 3.10
       :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle
       protocol.

    .. versionchanged:: 3.11
       Accepts :term:`path-like object` for filename.

+   .. versionchanged:: next
+      Accepts custom *serializer* and *deserializer* functions in place of
+      :func:`pickle.dumps` and :func:`pickle.loads`.
+
    .. note::

       Do not rely on the shelf being closed automatically; always call
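
Reader's note on the hunk above: the serializer/deserializer contract it documents can be sketched with a JSON-based pair. This is an illustration only, not code from the PR. The names `json_serializer`, `json_deserializer` and `HAS_CUSTOM_SERIALIZERS` are hypothetical, and since the new keywords only exist on interpreters that include this change, the sketch probes the signature of `shelve.open` and otherwise just round-trips the pair directly.

```python
import inspect
import json
import os
import shelve
import tempfile


def json_serializer(obj, protocol=None):
    """Return obj as UTF-8 JSON bytes; protocol is accepted but ignored."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")


def json_deserializer(data):
    """Rebuild the object from bytes produced by json_serializer."""
    return json.loads(bytes(data).decode("utf-8"))


# Assumption: feature detection via the signature, rather than a version check.
HAS_CUSTOM_SERIALIZERS = (
    "serializer" in inspect.signature(shelve.open).parameters
)

record = {"name": "spam", "tags": ["a", "b"]}

if HAS_CUSTOM_SERIALIZERS:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_shelf")
        # Both callables are passed together, as the documentation requires.
        with shelve.open(path, serializer=json_serializer,
                         deserializer=json_deserializer) as db:
            db["record"] = record
            restored = db["record"]
else:
    # Interpreters without the new keywords: exercise the pair directly.
    restored = json_deserializer(json_serializer(record, None))
```

Note that a JSON pair only round-trips JSON-compatible values, so it trades pickle's generality for a format that is readable outside Python.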
@@ -129,7 +149,8 @@ Restrictions
   explicitly.


-.. class:: Shelf(dict, protocol=None, writeback=False, keyencoding='utf-8')
+.. class:: Shelf(dict, protocol=None, writeback=False, \
+                 keyencoding='utf-8', *, serializer=None, deserializer=None)

    A subclass of :class:`collections.abc.MutableMapping` which stores pickled
    values in the *dict* object.
@@ -147,6 +168,9 @@ Restrictions
    The *keyencoding* parameter is the encoding used to encode keys before they
    are used with the underlying dict.

+   The *serializer* and *deserializer* parameters have the same interpretation
+   as in :func:`~shelve.open`.
+
    A :class:`Shelf` object can also be used as a context manager, in which
    case it will be automatically closed when the :keyword:`with` block ends.

@@ -161,8 +185,13 @@ Restrictions
       :const:`pickle.DEFAULT_PROTOCOL` is now used as the default pickle
       protocol.

+   .. versionchanged:: next
+      Added the *serializer* and *deserializer* parameters.

-.. class:: BsdDbShelf(dict, protocol=None, writeback=False, keyencoding='utf-8')
+
+.. class:: BsdDbShelf(dict, protocol=None, writeback=False, \
+                      keyencoding='utf-8', *, \
+                      serializer=None, deserializer=None)

    A subclass of :class:`Shelf` which exposes :meth:`!first`, :meth:`!next`,
    :meth:`!previous`, :meth:`!last` and :meth:`!set_location` methods.
@@ -172,18 +201,27 @@ Restrictions
    modules. The *dict* object passed to the constructor must support those
    methods. This is generally accomplished by calling one of
    :func:`!bsddb.hashopen`, :func:`!bsddb.btopen` or :func:`!bsddb.rnopen`. The

Contributor Author

By the way, we also need to update this sentence (from bsddb to berkeleydb). bsddb is deprecated according to https://www.jcea.es/programacion/pybsddb.htm.

Member

It looks like the change would be bigger: it’s a new module (although the docs are not very clear) with maybe a new API.

Updating or deprecating this should be discussed in its own ticket 🙂

Contributor Author

Yes, I agree. It would be better to open a new ticket to discuss this issue.

Contributor

... or consider instead opening a topic on Discourse; this may need a larger audience than what you'd get on the bug tracker.

-   optional *protocol*, *writeback*, and *keyencoding* parameters have the same
-   interpretation as for the :class:`Shelf` class.
+   optional *protocol*, *writeback*, *keyencoding*, *serializer* and *deserializer*
+   parameters have the same interpretation as in :func:`~shelve.open`.

+   .. versionchanged:: next
+      Added the *serializer* and *deserializer* parameters.
+

-.. class:: DbfilenameShelf(filename, flag='c', protocol=None, writeback=False)
+.. class:: DbfilenameShelf(filename, flag='c', protocol=None, \
+                           writeback=False, *, serializer=None, \
+                           deserializer=None)

    A subclass of :class:`Shelf` which accepts a *filename* instead of a dict-like
    object. The underlying file will be opened using :func:`dbm.open`. By
    default, the file will be created and opened for both read and write. The
-   optional *flag* parameter has the same interpretation as for the :func:`.open`
-   function. The optional *protocol* and *writeback* parameters have the same
-   interpretation as for the :class:`Shelf` class.
+   optional *flag* parameter has the same interpretation as for the
+   :func:`.open` function. The optional *protocol*, *writeback*, *serializer*
+   and *deserializer* parameters have the same interpretation as in
+   :func:`~shelve.open`.

+   .. versionchanged:: next
+      Added the *serializer* and *deserializer* parameters.
+

 .. _shelve-example:
@@ -225,6 +263,20 @@ object)::
    d.close() # close it


+Exceptions
+----------
+
+.. exception:: ShelveError
+
+   Exception raised when only one of the *deserializer* and *serializer*
+   arguments is passed to :func:`~shelve.open`, :class:`Shelf`,
+   :class:`BsdDbShelf` or :class:`DbfilenameShelf`.
+
+   The *deserializer* and *serializer* arguments must be given together.
+
+   .. versionadded:: next
+
+
 .. seealso::

    Module :mod:`dbm`
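
Reader's note on the `ShelveError` entry: a minimal sketch of how the documented failure mode might be observed, assuming an in-memory `Shelf` over a plain `dict` is an acceptable stand-in for a real database. The `outcome` variable is hypothetical. On interpreters that predate this change `shelve.ShelveError` does not exist, so the check is skipped.

```python
import pickle
import shelve

# ShelveError is introduced by this change; getattr returns None when the
# running interpreter does not have it yet.
ShelveError = getattr(shelve, "ShelveError", None)

outcome = "unsupported"  # stays this way when the new API is absent
if ShelveError is not None:
    try:
        # Only one half of the pair is supplied, which should be rejected.
        shelve.Shelf({}, serializer=pickle.dumps)
    except ShelveError as exc:
        outcome = f"rejected: {exc}"
    else:
        outcome = "accepted"

print(outcome)
```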
64 changes: 38 additions & 26 deletions Lib/shelve.py
@@ -56,12 +56,17 @@
 the persistent dictionary on disk, if feasible).
 """

-from pickle import DEFAULT_PROTOCOL, Pickler, Unpickler
+from pickle import DEFAULT_PROTOCOL, dumps, loads
 from io import BytesIO

 import collections.abc

-__all__ = ["Shelf", "BsdDbShelf", "DbfilenameShelf", "open"]
+__all__ = ["ShelveError", "Shelf", "BsdDbShelf", "DbfilenameShelf", "open"]
+
+
+class ShelveError(Exception):
+    pass
+

 class _ClosedDict(collections.abc.MutableMapping):
     'Marker for a closed dict. Access attempts raise a ValueError.'
@@ -82,7 +87,7 @@ class Shelf(collections.abc.MutableMapping):
     """

     def __init__(self, dict, protocol=None, writeback=False,
-                 keyencoding="utf-8"):
+                 keyencoding="utf-8", *, serializer=None, deserializer=None):
         self.dict = dict
         if protocol is None:
             protocol = DEFAULT_PROTOCOL
@@ -91,6 +96,16 @@ def __init__(self, dict, protocol=None, writeback=False,
         self.cache = {}
         self.keyencoding = keyencoding

+        if serializer is None and deserializer is None:
+            self.serializer = dumps
+            self.deserializer = loads
+        elif (serializer is None) ^ (deserializer is None):
+            raise ShelveError("serializer and deserializer must be "
+                              "defined together")
+        else:
+            self.serializer = serializer
+            self.deserializer = deserializer
+
     def __iter__(self):
         for k in self.dict.keys():
             yield k.decode(self.keyencoding)
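
Reader's note on the validation branch in this hunk: the both-or-neither rule can be isolated as a small standalone function. The sketch below mirrors the three branches of the hunk but is not the module's code. `PairingError`, `resolve_pair`, `identity_dump` and `identity_load` are hypothetical names introduced for illustration.

```python
import pickle


class PairingError(Exception):
    """Hypothetical stand-in for shelve.ShelveError."""


def resolve_pair(serializer=None, deserializer=None):
    """Return the (serializer, deserializer) pair a shelf would end up using."""
    if serializer is None and deserializer is None:
        # Neither supplied: fall back to pickle, as the hunk does.
        return pickle.dumps, pickle.loads
    if (serializer is None) ^ (deserializer is None):
        # XOR of the two "is None" tests is true when exactly one is missing.
        raise PairingError(
            "serializer and deserializer must be defined together")
    return serializer, deserializer


def identity_dump(obj, protocol=None):  # illustrative only
    return bytes(obj)


def identity_load(data):  # illustrative only
    return bytes(data)


default_pair = resolve_pair()
custom_pair = resolve_pair(identity_dump, identity_load)

try:
    resolve_pair(serializer=identity_dump)
    lone_serializer_rejected = False
except PairingError:
    lone_serializer_rejected = True
```

Because both operands of `^` are booleans, the XOR reads as "exactly one of the two is missing", which keeps the check to a single condition.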
@@ -110,19 +125,17 @@ def __getitem__(self, key):
         try:
             value = self.cache[key]
         except KeyError:
-            f = BytesIO(self.dict[key.encode(self.keyencoding)])
-            value = Unpickler(f).load()
+            f = self.dict[key.encode(self.keyencoding)]
+            value = self.deserializer(f)
             if self.writeback:
                 self.cache[key] = value
         return value

     def __setitem__(self, key, value):
         if self.writeback:
             self.cache[key] = value
-        f = BytesIO()
-        p = Pickler(f, self._protocol)
-        p.dump(value)
-        self.dict[key.encode(self.keyencoding)] = f.getvalue()
+        serialized_value = self.serializer(value, self._protocol)
+        self.dict[key.encode(self.keyencoding)] = serialized_value

     def __delitem__(self, key):
         del self.dict[key.encode(self.keyencoding)]
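
Reader's note on this hunk: it swaps the stream-based `Pickler`/`Unpickler` calls for the one-shot `pickle.dumps`/`pickle.loads` functions. The sketch below, with a hypothetical sample `value`, cross-checks that data written one way can be read back the other way, which is the property existing shelves rely on when the default pair is used.

```python
from io import BytesIO
import pickle

value = {"numbers": [1, 2, 3], "nested": {"ok": True}}
protocol = pickle.DEFAULT_PROTOCOL

# Stream style, as in the removed lines.
buffer = BytesIO()
pickle.Pickler(buffer, protocol).dump(value)
stream_bytes = buffer.getvalue()

# One-shot style, as in the added lines.
oneshot_bytes = pickle.dumps(value, protocol)

# Cross-check: each reader accepts what the other writer produced.
from_stream = pickle.loads(stream_bytes)
from_oneshot = pickle.Unpickler(BytesIO(oneshot_bytes)).load()
```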
@@ -191,33 +204,29 @@ class BsdDbShelf(Shelf):
     """

     def __init__(self, dict, protocol=None, writeback=False,
-                 keyencoding="utf-8"):
-        Shelf.__init__(self, dict, protocol, writeback, keyencoding)
+                 keyencoding="utf-8", *, serializer=None, deserializer=None):
+        Shelf.__init__(self, dict, protocol, writeback, keyencoding,
+                       serializer=serializer, deserializer=deserializer)

     def set_location(self, key):
         (key, value) = self.dict.set_location(key)
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def next(self):
         (key, value) = next(self.dict)
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def previous(self):
         (key, value) = self.dict.previous()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def first(self):
         (key, value) = self.dict.first()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))

     def last(self):
         (key, value) = self.dict.last()
-        f = BytesIO(value)
-        return (key.decode(self.keyencoding), Unpickler(f).load())
+        return (key.decode(self.keyencoding), self.deserializer(value))


 class DbfilenameShelf(Shelf):
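
Reader's note on the `BsdDbShelf` methods: each cursor method decodes the key and deserializes the value of the `(key, value)` byte pair returned by the underlying object. The sketch below uses a hypothetical in-memory `CursorStub` in place of a real bsddb-style database, and the default pickle pair so that it runs with or without this change.

```python
import pickle
import shelve


class CursorStub:
    """Hypothetical in-memory stand-in for a bsddb-style database object."""

    def __init__(self, items):
        # Keep (key, value) byte pairs in key order, like a sorted cursor.
        self._pairs = sorted(items.items())

    def first(self):
        return self._pairs[0]

    def last(self):
        return self._pairs[-1]


raw = {
    b"alpha": pickle.dumps([1, 2]),
    b"omega": pickle.dumps({"x": 1}),
}

shelf = shelve.BsdDbShelf(CursorStub(raw))
first_pair = shelf.first()  # key decoded to str, value deserialized
last_pair = shelf.last()
shelf.close()
```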
@@ -227,9 +236,11 @@ class DbfilenameShelf(Shelf):
     See the module's __doc__ string for an overview of the interface.
     """

-    def __init__(self, filename, flag='c', protocol=None, writeback=False):
+    def __init__(self, filename, flag='c', protocol=None, writeback=False, *,
+                 serializer=None, deserializer=None):
         import dbm
-        Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
+        Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback,
+                       serializer=serializer, deserializer=deserializer)

     def clear(self):
         """Remove all items from the shelf."""
@@ -238,8 +249,8 @@ def clear(self):
         self.cache.clear()
         self.dict.clear()

-
-def open(filename, flag='c', protocol=None, writeback=False):
+def open(filename, flag='c', protocol=None, writeback=False, *,
+         serializer=None, deserializer=None):
     """Open a persistent dictionary for reading and writing.

     The filename parameter is the base filename for the underlying
@@ -252,4 +263,5 @@ def open(filename, flag='c', protocol=None, writeback=False):
     See the module's __doc__ string for an overview of the interface.
     """

-    return DbfilenameShelf(filename, flag, protocol, writeback)
+    return DbfilenameShelf(filename, flag, protocol, writeback,
+                           serializer=serializer, deserializer=deserializer)