bpo-30588: document codecs.escape_decode #14747

Closed
wants to merge 1 commit into from

carlbordum
Contributor

@carlbordum carlbordum commented Jul 13, 2019

@mangrisano
Contributor

/cc @malemburg @doerwalter

@zooba
Member

zooba commented Jul 14, 2019

I believe this is okay, but if @serhiy-storchaka and @asvetlov want to block it (as discussed in the bug), happy to let them overrule me :)

@doerwalter
Contributor

IMHO, if we document this function, the added description shouldn't describe what a generic *_decode function does; it should describe what escape_decode specifically does. I.e., as it is now, the first sentence is redundant and the second is too vague.
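
As a rough illustration of what is specific to escape_decode, the sketch below feeds it bytes containing literal backslash sequences. It reflects CPython behavior as commonly observed rather than a documented contract, since the function is undocumented; the variable names are illustrative only.

```python
import codecs

# Bytes holding literal backslash sequences (hex, octal, tab, newline),
# as they might appear in a file that stores escaped text.
raw = b"\\x41\\101\\t\\n"

# escape_decode interprets bytes-literal style escapes and returns
# a (bytes, length consumed) tuple; the result is bytes, not str.
decoded, consumed = codecs.escape_decode(raw)

print(decoded)
print(consumed)
```

The point for the documentation is that the output stays a bytes object: only the backslash escapes are interpreted, and no text decoding takes place.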

Comment on lines +248 to +249
length consumed). This is useful for decoding ascii escape sequences mixed
with unicode characters.
Member

What is "mixed with unicode characters" supposed to mean in this context? data is a bytes-like object; it can't contain unicode runes. We should include an example of what this does that is different from using one of the text encoding codecs.

Contributor Author

As Matthieu Dartiailh described on bugs.python.org, an example of ASCII escape sequences mixed with Unicode characters is 'Δ\nΔ'.

Here is the difference:

>>> codecs.unicode_escape_decode('Δ\nΔ')
('Î\x94\nÎ\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')
(b'\xce\x94\n\xce\x94', 5)
>>> codecs.escape_decode('Δ\nΔ')[0].decode('utf-8')
'Δ\nΔ'
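
The contrast may be easier to see when the input holds a literal backslash followed by "n" rather than a real newline. The following is a minimal sketch, assuming CPython and the undocumented codecs.escape_decode; the variable names are hypothetical.

```python
import codecs

# UTF-8 bytes with a literal backslash + "n" between two deltas.
raw = "Δ\\nΔ".encode("utf-8")  # b'\xce\x94\\n\xce\x94'

# unicode_escape maps every non-escape byte to the Latin-1 character
# with the same value, so the UTF-8 pairs for the deltas become mojibake.
as_unicode_escape = raw.decode("unicode_escape")

# escape_decode only interprets the backslash escape and leaves the
# other bytes untouched, so a later UTF-8 decode recovers the deltas.
unescaped, consumed = codecs.escape_decode(raw)
as_text = unescaped.decode("utf-8")

print(repr(as_unicode_escape))
print(repr(as_text))
```

Because escape_decode works bytes-to-bytes, the choice of text encoding is deferred to the caller, which is what makes the UTF-8 round trip possible here.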

A real-world example. We can assume that many more homegrown 'solutions' exist.

Wouldn't it be great if kludgy, slow, error-prone workarounds people have come up with were replaced with something elegant and Python-worthy?

Please consider that this function is so rarely seen outside the Python developer world because it is kept almost a secret.

@csabella
Contributor

@carlbordum please address the review comments. Thanks!

@@ -242,6 +242,13 @@ wider range of codecs when working with binary files:
:func:`iterencode`.


.. function:: escape_decode(data, errors=None)

Decode the bytes-like object *data* and return a tuple (decoded object,
Member

"bytes-like object" seems incorrect; it accepts strings too.
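
A quick check of this point, as a sketch assuming current CPython behavior (the function is undocumented, so this is not a guaranteed interface): both a bytes object and a str are accepted, and the result is bytes in either case.

```python
import codecs

# The same escaped content, once as bytes and once as str.
from_bytes = codecs.escape_decode(b"tab\\there")
from_str = codecs.escape_decode("tab\\there")

# Both calls return a (bytes, length consumed) tuple.
print(from_bytes)
print(from_str)
```

In CPython a str argument appears to be encoded as UTF-8 before the escapes are processed, which is consistent with the byte counts shown elsewhere in this thread.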

@JelleZijlstra
Member

Closing as it's been a few years and the feedback hasn't been addressed.

Labels
awaiting review docs Documentation in the Doc dir skip news