Skip to content

[BUG] bytes constructor takes size as size_t, not ssize_t #2690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
anntzer opened this issue Nov 23, 2020 · 8 comments
Closed

[BUG] bytes constructor takes size as size_t, not ssize_t #2690

anntzer opened this issue Nov 23, 2020 · 8 comments

Comments

@anntzer
Copy link
Contributor

anntzer commented Nov 23, 2020

Issue description

The py::bytes(char const*, size_t) constructor takes its input size as unsigned size_t, even though the underlying CPython API (https://docs.python.org/3/c-api/bytes.html#c.PyBytes_FromStringAndSize) uses signed ssize_t. See also #1599 for a similar case.

Reproducible example code

Use the following snippet into the setup provided in the pybind11/python_example repo:

#include <pybind11/pybind11.h>

namespace py = pybind11;

PYBIND11_MODULE(python_example, m) {
    ssize_t n = 3;
    m.attr("foo") = py::bytes{"foo", n};
}

Compilation (pybind11 2.6.0, gcc 10.2.0) yields

src/main.cpp: In function ‘void pybind11_init_python_example(pybind11::module_&)’:
src/main.cpp:7:38: warning: narrowing conversion of ‘n’ from ‘ssize_t’ {aka ‘long int’} to ‘pybind11::size_t’ {aka ‘long unsigned int’} [-Wnarrowing]
    7 |     m.attr("foo") = py::bytes{"foo", n};
@anntzer anntzer changed the title [BUG] array_t constructor takes shape as size_t, not ssize_t [BUG] bytes constructor takes shape as size_t, not ssize_t Nov 23, 2020
@anntzer anntzer changed the title [BUG] bytes constructor takes shape as size_t, not ssize_t [BUG] bytes constructor takes size as size_t, not ssize_t Nov 23, 2020
@YannickJadoul
Copy link
Collaborator

But you're passing the wrong type to pybind11? I don't see why 1) a length of a string should be possibly negative, and 2) why you would ant to pass a ssize_t here? As long as there is no warning internally (there isn't, pybind11 casts from size_t to ssize_t), wouldn't this cause only confusion?

@anntzer
Copy link
Contributor Author

anntzer commented Nov 23, 2020

pybind11 is just forwarding the arguments to PyBytes_FromStringAndSize, which takes a ssize_t (indeed, calling PyBytes_FromStringAndSize("foo", n); in the same example emits no warning). There are many places where you naturally get a ssize_t that you may want to forward to the bytes constructor (e.g. from the shape of a py::array_t).
See also https://www.python.org/dev/peps/pep-0353/#why-not-size-t, perhaps.

@YannickJadoul
Copy link
Collaborator

Yes, but pybind11 wraps the C API. Saying that PyBytes_FromStringAndSize is completely irrelevant and does not constitute an argument on why the pybind11 API should accept ssize_t?

There are many places where you naturally get a ssize_t that you may want to forward to the bytes constructor (e.g. from the shape of a py::array_t).

That ís an argument. I would kind of argue that's a wrong return type, but OK, nothing we can do about that.

See also https://www.python.org/dev/peps/pep-0353/#why-not-size-t, perhaps.

Well, fair enough, but this is not an index, but a size. So we don't need an index type, but a size type.

@anntzer
Copy link
Contributor Author

anntzer commented Nov 23, 2020

There's also the argument of consistency with py::array_t (whose constructor also takes signed sizes). You were apparently fine with that in #2293?

@YannickJadoul
Copy link
Collaborator

There's also the argument of consistency with py::array_t (whose constructor also takes signed sizes). You were apparently fine with that in #2293?

That was increasing consistency. Most things already were ssize_t in the array API (from ... I don't know, so while ago?), which makes slightly more sense because of negative indexing involved in the array interface.

To the contrary, constructing py::bytes from std::vector<char> and asking std::vector<T>::size() will give a size_t, and strlen() also return a size_t (which is actually a problem I regularly bump into with array as well, now that you mention it). So size_t is not a crazy type to take for the size of a bytes.

I'm not saying it might not be a good idea to add an overload, btw. I'm just not convinced by the argument that the API pybind11 wraps takes a different type, so pybind11 should have a different API (what's the point of wrapping an API, otherwise?), and definitely not by a warning about narrowing when you pass the wrong type to an API.

Guess an overload can't really hurt, though (can it?), if you feel like making a PR.

@anntzer
Copy link
Contributor Author

anntzer commented Nov 23, 2020

what's the point of wrapping an API, otherwise?

FWIW I mostly tend to view pybind11 as an (nice) RAII/autorefcounted version of the C-API. Changing ssize_t to size_t doesn't really help (me) on that point.

Guess an overload can't really hurt, though (can it?), if you feel like making a PR.

Sure, I'll push that on my todo list :)

Just one additional data point in favor of signed sizes: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1227r2.html.

@YannickJadoul
Copy link
Collaborator

Just one additional data point in favor of signed sizes: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1227r2.html.

Not sure that outweighs all of the already existing code and standard that uses/return size_t, so "just changing it" doesn't really seem feasible, to me.

@Skylion007
Copy link
Collaborator

Closed by new PR. #3219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants