-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-29427: allow unpadded input and ouput in base64 module #7072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allow base64.b64decode() to accept unpadded input instead of throwing a binascii.Error. Padding check is switchable using the new `padded` boolean argument, and is enabled by default to maintain backward compatibility. In a symmetric fashion, allow base64.b64encode() to produce unpadded output, using the `padded` argument. Default output will include padding for backward compatibility.
Added some test cases for base64 module, to test optional padding in b64decode() and b64encode(), urlsafe_b64encode() and urlsafe_b64decode().
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. When your account is ready, please add a comment in this pull request Thanks again to your contribution and we look forward to looking at it! |
Good catch, this indeed breaks when the string to decode contains non-base64 characters, which would be ignored by binascii.a2b_decode(). If you agree, I'll push a fix including all that. BTW, I signed the CLA. |
Fixes b64decode on non padded input if that input contains invalid characters. Implements validate=True for non padded input.
Any news? |
@guillp there seem to be some whitespace issues (according to Travis). Could you fix those up? I'd love to have this feature! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that documentation should be updated.
Misc/NEWS.d/next/Library/2018-05-23-18-02-03.bpo-29427.82cb18.rst
Outdated
Show resolved
Hide resolved
What's the holdup? Can't the maintainer edit the PR if @guillp seems to have forgotten about it? |
updated commit description Co-Authored-By: Sergey Fedoseev <[email protected]>
Co-Authored-By: Sergey Fedoseev <[email protected]>
Very sorry about that. git push to github.com is not allowed from my workplace, so I add to commit manually from github.com UI which is a pain. Thanks to @sir-sigurd for the review. |
I hope this is not dead, it would've been useful for me. |
if not padded: | ||
if validate and not re.match(b'^[A-Za-z0-9+/]*$', s): | ||
raise binascii.Error('Padding found in supposedly non-padded input') | ||
s += b'==' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why two ==s is always the right padding. I may be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably depends on the base64 flavor of what is accepted as "right" padding, but as far as I understand the a2b_base64
implementation extra padding characters are just ignored, so always appending ==
should be safe in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks. Maybe worth putting that in a comment in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using util from django.utils.http.urlsafe_base64_encode
https://docs.djangoproject.com/en/3.1/ref/utils/#django.utils.http.urlsafe_base64_encode
@@ -108,29 +121,33 @@ def standard_b64decode(s): | |||
_urlsafe_encode_translation = bytes.maketrans(b'+/', b'-_') | |||
_urlsafe_decode_translation = bytes.maketrans(b'-_', b'+/') | |||
|
|||
def urlsafe_b64encode(s): | |||
def urlsafe_b64encode(s, validate=False, padded=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validate
arg isn't used. And if it's going to be added it should be added to urlsafe_b64decode
instead.
Is there anything that can be done to push this change forward? Should a new PR be submitted if the original contributor has lost interest? |
Allow base64.b64decode() to accept unpadded input instead of throwing a binascii.Error. Padding check is switchable using the new
padded
boolean argument, and is enabled by default to maintain backward compatibility.In a symmetric fashion, allow base64.b64encode() to produce unpadded output, using the
padded
argument. Default output will include padding for backward compatibility.Updated test_base64 unittests to include new tests for optional padding.
https://bugs.python.org/issue29427