python string wrapper? #218
Comments
I've just done a quick benchmark with
So conversion to/from Python strings appears to be within a small factor of optimal. Is it that you don't want to pay this conversion cost at all, or have limited memory and therefore want a lazy wrapper? I'm not against adding We could have a
I think the problem is that it doesn't scale well. I don't have an MWE, but I have observed that converting large sets of strings is much more expensive. I think the reasons for this are the burden on the garbage collector and the fact that the memory holding the strings is not, in general, contiguous as it is in your example with one large string. I'm aware that I'm not sure it's the right approach, it was just a thought.
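To make the contiguity point concrete, here is a minimal sketch of packing many small Python strings into one contiguous UTF-8 buffer plus an offsets table, so that the other side could in principle take the data in a single copy. The names `pack_strings` and `unpack_strings` are hypothetical helpers for illustration only. They are not part of PythonCall or any other library, and nothing here has been benchmarked.

```python
from itertools import accumulate


def pack_strings(strings):
    """Encode each string as UTF-8 and concatenate into one contiguous buffer.

    Returns (buffer, offsets), where string i occupies
    buffer[offsets[i]:offsets[i + 1]]. Hypothetical helper, not a library API.
    """
    encoded = [s.encode("utf-8") for s in strings]
    # Running byte totals: offsets[0] == 0 and offsets[-1] == len(buffer).
    offsets = [0, *accumulate(len(b) for b in encoded)]
    return b"".join(encoded), offsets


def unpack_strings(buffer, offsets):
    """Inverse of pack_strings: slice the buffer at the offsets and decode."""
    return [buffer[a:b].decode("utf-8") for a, b in zip(offsets, offsets[1:])]


strings = ["héllo", "", "wörld", "😀"]
buffer, offsets = pack_strings(strings)
roundtrip = unpack_strings(buffer, offsets)
```

Packing still costs one encode per string on the Python side, so this only helps if the per-object overhead of the transfer, rather than the encoding itself, is the bottleneck.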
The main use for a wrapper is to provide a zero-copy interface to a mutable object. A secondary use is to access only a small portion of a large container. I can't think of more uses than these two. If you don't have either of these uses (i.e. you are only reading, and will read most of the container) then usually you're better off eagerly converting the container instead. Strings are immutable, which leaves only the second use, i.e. reading a small portion of a large string. But then you may as well just take the relevant substrings on the Python side before converting. Maybe that's not always possible (e.g. this is happening inside a function which is acting generically on strings).
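A rough sketch of "take the substring on the Python side first", using a UTF-8 encode as a stand-in for the conversion step. That stand-in is an assumption: it only models the copy that a conversion has to make, and `utf8_payload` is a hypothetical name, not an existing function.

```python
def utf8_payload(text, start=None, stop=None):
    """Slice first, then encode: only the selected code points are copied.

    The UTF-8 encode stands in for "convert to the other language"
    (an assumption made for this sketch).
    """
    return text[start:stop].encode("utf-8")


big = "abcdefghij" * 100_000

whole = utf8_payload(big)         # copies the entire string
part = utf8_payload(big, 10, 30)  # copies only code points 10..29
```

The amount of data copied is proportional to the size of the slice, not to the size of the original string, which is the benefit a lazy wrapper would otherwise be providing here.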
https://discuss.python.org/t/pep-686-make-utf-8-mode-default/14435/43 I don't know when strings will internally be UTF-8 (as opposed to just for I/O), but I think they want to change that in a similar time frame.
That's interesting, if strings ever use UTF-8 internally then we could add a
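For context on why the internal representation matters: as far as I know, CPython (since PEP 393) stores each string in a fixed-width form of 1, 2 or 4 bytes per code point, chosen by the widest character in the string, rather than as UTF-8. The sketch below estimates that width with `sys.getsizeof`. It is CPython-specific, and `bytes_per_code_point` is a hypothetical helper written for this illustration.

```python
import sys


def bytes_per_code_point(ch, n=1000):
    """Estimate per-character storage for a string made only of `ch`.

    Compares the sizes of two strings that differ in length by one
    character. CPython-specific; other implementations may differ.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


ascii_width = bytes_per_code_point("a")            # ASCII only
bmp_width = bytes_per_code_point("\u0394")         # Greek capital delta
astral_width = bytes_per_code_point("\U0001F600")  # emoji outside the BMP

# UTF-8 is variable-width, so the same characters encode to different lengths.
utf8_lengths = [len(c.encode("utf-8")) for c in ("a", "\u0394", "\U0001F600")]
```

On CPython the three widths should come out increasing, whereas a Julia `String` is UTF-8, so a wrapper that exposes the Python buffer without copying would only line up if the two representations agreed.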
Converting strings from Python is of course really expensive because it involves a lot of copying and not even of contiguous blocks of data. Once you start getting above a few MB of data converting strings starts to look like a really bad option. The `Py` objects can do a lot, but they are not `AbstractString` so they don't really look like strings on the Julia end until you convert them.

Any interest in creating some kind of `PyStr` wrapper that provides an `AbstractString` interface for `Py`'s?