-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Cast bytearray to string #3707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Cast bytearray to string #3707
Changes from all commits
Commits
Show all changes
12 commits
Select commit
Hold shift + click to select a range
772e22a
Add bytearray to string cast, testcase and rename load_bytes to load_raw
porrashuang 521e570
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] ac37829
Merge branch 'master' into bytearray_cast
porrashuang 4820aef
New bytearray test case and convert failure to pybind11_fail
porrashuang 5f06a7e
Merge https://github.com/kururu002/pybind11 into bytearray_cast
095836a
Merge with latest
porrashuang 5d0b3b7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 4e18f82
Fix merge comments
porrashuang a57a305
Fix merge comments
porrashuang c64beb2
Actually fix merge comments
porrashuang b8d3b6a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 19d0124
Assert early if AsString fails
porrashuang File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add tests with empty
bytearray
s?And with malformed utf-8, to prove that there really isn't any encoding (see e.g.
malformed_utf8
in test_pytypes.py).I looked at the Python C code underneath
PyByteArray_AsString
, it special-cases empty arrays. But in any case, tests for corner cases are best practice.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, an empty bytearray test case will be added.
For malformed_utf8, I'm not sure it's the same case as casting to python str. In test_pytypes, it's casting a single byte to str, and str is not fulfilled with a utf-8 resulting "b'\x80'". While passing b"\x80" to C++ string it would be treated as a byte and has size 1. Is this the case you're looking for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation/idea is very simple and high-level:
We know the current implementation does. We want to be sure that's not accidentally broken somehow (e.g. refactoring).
My suggestion was based on past experience, where that actually happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point, the test is added. I'm trying to make sure
assert m.string_length(bytearray(b"\x80")) == 1
is the behavior we're looking for 😊There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it just be better to parameterize the bytes_to_str tests and have them be rerun with all the bytes args wrapped as bytearrays(via pytest fixture for instance? or by having a function that either returns the bytes or the bytes wrapped in a bytearray depending on test parameters)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Skylion007, I also came up wrapping up the params for better coverage. Considering there're at least 3 dimensions (bytearray vs bytes, string_lenth vs strlen, bytearray with/without params), and a few corner cases(empty bytearray), I would suggest the current intuitive version should give us similar coverage with simplicity. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.