bpo-30588: document codecs.escape_decode #14747
I believe this is okay, but if @serhiy-storchaka and @asvetlov want to block it (as discussed in the bug), happy to let them overrule me :)
IMHO, if we document this function, the added description shouldn't describe what a generic
length consumed). This is useful for decoding ascii escape sequences mixed
with unicode characters.
What is "mixed with unicode characters" supposed to mean in this context? data is a bytes-like object, so it can't contain unicode runes. We should include an example of what this does that is different from using one of the text encoding codecs.
As Matthieu Dartiailh described on bugs.python.org, an example of ASCII escape sequences mixed with Unicode characters is 'Δ\nΔ'.
Here is the difference:
>>> codecs.unicode_escape_decode('Δ\nΔ')
('Î\x94\nÎ\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')
(b'\xce\x94\n\xce\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')[0].decode('utf-8')
'Δ\nΔ'
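To make the difference visible when the input holds a literal backslash followed by `n` rather than a real newline, here is a minimal sketch. It assumes CPython, where the undocumented `codecs.escape_decode` is available; the sample string and the variable names are made up for illustration.

```python
import codecs

# Hypothetical sample: UTF-8 bytes with a literal backslash + "n"
# between two non-ASCII characters (6 bytes in total).
raw = "Δ\\nΔ".encode("utf-8")

# escape_decode processes the backslash escape and passes every other
# byte through unchanged, so the result is still valid UTF-8.
decoded_bytes, consumed = codecs.escape_decode(raw)
text = decoded_bytes.decode("utf-8")

# unicode_escape_decode also processes the escape, but maps each
# remaining byte to a Latin-1 code point, which garbles the UTF-8 data.
garbled, _ = codecs.unicode_escape_decode(raw)

# A str argument is accepted as well; it is encoded to UTF-8 first.
from_str, consumed_str = codecs.escape_decode("Δ\\nΔ")

print(repr(decoded_bytes), consumed)
print(repr(text))
print(repr(garbled))
```

The second element of each returned tuple counts input bytes, not characters, which is why a `str` argument reports its UTF-8 encoded length.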
A real-world example. We can assume that many more homegrown 'solutions' exist.
Wouldn't it be great if kludgy, slow, error-prone workarounds people have come up with were replaced with something elegant and Python-worthy?
Please consider that this function is so rarely seen outside the Python developer world because it is kept almost a secret.
@carlbordum please address the review comments. Thanks!
@@ -242,6 +242,13 @@ wider range of codecs when working with binary files:
   :func:`iterencode`.


.. function:: escape_decode(data, errors=None)

   Decode the bytes-like object *data* and return a tuple (decoded object,
"bytes-like object" seems incorrect; it accepts strings too.
Closing as it's been a few years and the feedback hasn't been addressed.
I gave it a shot :)
https://bugs.python.org/issue30588