bpo-29427: allow unpadded input and output in base64 module #7072
Closed
8 commits
d6d4d93 allow parsing and base64 (guillp)
dd69e85 updated base64 unittests for unpadded input/output (guillp)
698ec49 added NEWS.d entry for bpo-29427 (guillp)
8eb9823 fixed b64decode on non padded strings (guillp)
df1102c more base64 test cases for unpadded input (guillp)
1b307ea fixed indentation (guillp)
77618ad Update Misc/NEWS.d/next/Library/2018-05-23-18-02-03.bpo-29427.82cb18.rst (guillp)
df3d326 typos (guillp)
@@ -48,34 +48,43 @@ def _bytes_from_decode_data(s):
 
 # Base64 encoding/decoding uses binascii
 
-def b64encode(s, altchars=None):
+def b64encode(s, altchars=None, padded=True):
     """Encode the bytes-like object s using Base64 and return a bytes object.
 
     Optional altchars should be a byte string of length 2 which specifies an
     alternative alphabet for the '+' and '/' characters. This allows an
     application to e.g. generate url or filesystem safe Base64 strings.
+
+    If padded is True (the default), padding will be applied to the
+    result bytes. If padding is False, no padding is applied.
     """
     encoded = binascii.b2a_base64(s, newline=False)
     if altchars is not None:
         assert len(altchars) == 2, repr(altchars)
-        return encoded.translate(bytes.maketrans(b'+/', altchars))
+        encoded = encoded.translate(bytes.maketrans(b'+/', altchars))
+    if not padded:
+        encoded = encoded.rstrip(b'=')
     return encoded
 
 
-def b64decode(s, altchars=None, validate=False):
+def b64decode(s, altchars=None, validate=False, padded=True):
     """Decode the Base64 encoded bytes-like object or ASCII string s.
 
     Optional altchars must be a bytes-like object or ASCII string of length 2
     which specifies the alternative alphabet used instead of the '+' and '/'
     characters.
 
-    The result is returned as a bytes object. A binascii.Error is raised if
-    s is incorrectly padded.
+    The result is returned as a bytes object.
 
     If validate is False (the default), characters that are neither in the
     normal base-64 alphabet nor the alternative alphabet are discarded prior
     to the padding check. If validate is True, these non-alphabet characters
     in the input result in a binascii.Error.
+
+    If padded is True (the default), a binascii.Error is raised if s is
+    incorrectly padded. If padded is False and validate is True, a
+    binascii.Error will be raised if s contains padding. If both padded and
+    validate are False, any eventual padding will be ignored.
     """
     s = _bytes_from_decode_data(s)
     if altchars is not None:
@@ -84,6 +93,10 @@ def b64decode(s, altchars=None, validate=False):
         s = s.translate(bytes.maketrans(altchars, b'+/'))
     if validate and not re.match(b'^[A-Za-z0-9+/]*={0,2}$', s):
         raise binascii.Error('Non-base64 digit found')
+    if not padded:
+        if validate and not re.match(b'^[A-Za-z0-9+/]*$', s):
+            raise binascii.Error('Padding found in supposedly non-padded input')
+        s += b'=='
     return binascii.a2b_base64(s)
 
 
@@ -108,29 +121,33 @@ def standard_b64decode(s):
 _urlsafe_encode_translation = bytes.maketrans(b'+/', b'-_')
 _urlsafe_decode_translation = bytes.maketrans(b'-_', b'+/')
 
-def urlsafe_b64encode(s):
+def urlsafe_b64encode(s, validate=False, padded=True):
     """Encode bytes using the URL- and filesystem-safe Base64 alphabet.
 
     Argument s is a bytes-like object to encode. The result is returned as a
     bytes object. The alphabet uses '-' instead of '+' and '_' instead of
     '/'.
+
+    If padded is True (the default), the result is padded. If padded
+    is False, the result will be left unpadded.
     """
-    return b64encode(s).translate(_urlsafe_encode_translation)
+    return b64encode(s, padded=padded).translate(_urlsafe_encode_translation)
 
-def urlsafe_b64decode(s):
+def urlsafe_b64decode(s, padded=True):
     """Decode bytes using the URL- and filesystem-safe Base64 alphabet.
 
     Argument s is a bytes-like object or ASCII string to decode. The result
-    is returned as a bytes object. A binascii.Error is raised if the input
-    is incorrectly padded. Characters that are not in the URL-safe base-64
-    alphabet, and are not a plus '+' or slash '/', are discarded prior to the
-    padding check.
+    is returned as a bytes object. Characters that are not in the URL-safe
+    base-64 alphabet, and are not a plus '+' or slash '/', are discarded prior
+    to the padding check.
 
     The alphabet uses '-' instead of '+' and '_' instead of '/'.
+
+    Arguments padded and validate behave the same as in b64decode().
     """
     s = _bytes_from_decode_data(s)
     s = s.translate(_urlsafe_decode_translation)
-    return b64decode(s)
+    return b64decode(s, padded=padded)
 
 
 
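The behaviour described in the docstrings above can be approximated with calls the base64 module already provides. The following is only a sketch: the helper names `b64encode_unpadded` and `b64decode_unpadded` are hypothetical and not part of this change, and the decoder assumes the input is a bytes object in the standard alphabet with no whitespace or other non-alphabet characters, so that its length alone determines the missing padding.

```python
import base64
import binascii
import re

# Hypothetical helpers for illustration; they are not part of the base64 module.


def b64encode_unpadded(data, altchars=None):
    """Encode data, then strip the trailing '=' padding."""
    return base64.b64encode(data, altchars).rstrip(b"=")


def b64decode_unpadded(data, validate=False):
    """Decode standard-alphabet Base64 that is expected to carry no padding.

    Assumption: data contains no whitespace or other non-alphabet
    characters, so len(data) tells us how much padding is missing.
    """
    if validate and not re.fullmatch(rb"[A-Za-z0-9+/]*", data):
        raise binascii.Error("padding or non-alphabet character found")
    data = data.rstrip(b"=")         # validate=False: ignore any padding present
    data += b"=" * (-len(data) % 4)  # restore exactly the padding that is missing
    return base64.b64decode(data)


# Round trip for inputs that would need 2, 1 and 0 padding characters.
for raw in (b"a", b"ab", b"abc"):
    token = b64encode_unpadded(raw)
    assert b"=" not in token
    assert b64decode_unpadded(token) == raw
```

Unlike the diff, this sketch computes the exact amount of padding instead of always appending `b'=='`; either choice is a design decision, and the fixed suffix relies on the decoder tolerating surplus padding.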
3 changes: 3 additions & 0 deletions
Misc/NEWS.d/next/Library/2018-05-23-18-02-03.bpo-29427.82cb18.rst
@@ -0,0 +1,3 @@
+Allow :func:`~base64.b64encode` and :func:`~base64.b64decode` (as well as derived
+:func:`~base64.urlsafe_b64encode` and :func:`~base64.urlsafe_b64decode`) from
+:mod:`base64` module to produce or accept unpadded input or output.
I don't understand why two ==s is always the right padding. I may be missing something.
Probably depends on the base64 flavor of what is accepted as "right" padding, but as far as I understand the a2b_base64 implementation, extra padding characters are just ignored, so always appending == should be safe in this context.
I see, thanks. Maybe worth putting that in a comment in the code.
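To make the point about the fixed `==` suffix concrete, here is a minimal sketch. It assumes CPython's default, lenient parsing in `binascii.a2b_base64`, which is what the reply above describes; the two function names are hypothetical. It decodes unpadded inputs that would need two, one and zero padding characters, once by always appending `b'=='` as the diff does and once by computing the exact padding.

```python
import binascii


def decode_appending_two_pads(data):
    """Mirror the approach in the diff: always append b'==' before decoding."""
    return binascii.a2b_base64(data + b"==")


def decode_with_exact_padding(data):
    """Alternative: pad up to the next multiple of four characters."""
    return binascii.a2b_base64(data + b"=" * (-len(data) % 4))


# Unpadded forms of b"a", b"ab" and b"abc" need 2, 1 and 0 pad characters.
samples = {b"YQ": b"a", b"YWI": b"ab", b"YWJj": b"abc"}
for unpadded, expected in samples.items():
    assert decode_appending_two_pads(unpadded) == expected
    assert decode_with_exact_padding(unpadded) == expected
```

Computing the exact padding does not depend on surplus `=` characters being ignored, which may be preferable if the input is later handed to a stricter decoder.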
I am using the django.utils.http.urlsafe_base64_encode util:
https://docs.djangoproject.com/en/3.1/ref/utils/#django.utils.http.urlsafe_base64_encode