More efficient implementation of integers #548
As mentioned in python/cpython#101265, we need to prevent deallocation of small ints. To do that we need to mark small ints. We could also use this mark to avoid ... |
Here's a possible layout with mark bit:

```c
typedef struct {
    OBJECT_HEADER;
    intptr_t tagged_bits;
} PyIntObject;

typedef struct {
    PyIntObject header;
    digit digits[1];
} PyLongObject;
```

**Tagged bits**

[table of tagged_bits encodings (the C value of each case) not preserved]

This allows us to perform the common operations efficiently.
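Since the encoding table above did not survive, the following is a minimal sketch of what the accessor helpers could look like under one assumed encoding: the low two bits of tagged_bits hold a tag, 00 meaning "compact value stored in the upper bits", with the payload shifted left by two. All macro and function names are hypothetical, and the tag values are an assumption rather than the scheme in the issue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accessors for tagged_bits.  Assumed encoding: low two bits
 * are the tag, 00 => compact machine-word int, payload in the upper bits. */
#define TAG_MASK    ((intptr_t)3)
#define TAG_COMPACT ((intptr_t)0)
#define TAG_SHIFT   2

static inline bool
is_compact(intptr_t tagged_bits)
{
    return (tagged_bits & TAG_MASK) == TAG_COMPACT;
}

static inline intptr_t
compact_value(intptr_t tagged_bits)
{
    /* Arithmetic right shift on mainstream compilers keeps the sign. */
    return tagged_bits >> TAG_SHIFT;
}

static inline intptr_t
make_compact(intptr_t value)
{
    /* Shift via the unsigned type to avoid signed-shift pitfalls. */
    return (intptr_t)((uintptr_t)value << TAG_SHIFT) | TAG_COMPACT;
}
```

With an encoding of this shape, the "is this a machine-word int" test in the hot path is a single mask-and-compare, which is what the specialization sketch later in the thread relies on.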
|
One other operation that is a bit less common, but needs consideration, is normalization of "longs" to "ints". For efficiency, it is probably OK to allow some values to be supported in both formats, and use a simple approximation based on the number of digits. However, not all single-digit ints fit in the tagged bits.
In other words, it is probably best not to have a normal form for values near the threshold, to allow efficient implementation. |
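To make the "no strict normal form near the threshold" idea above concrete, here is a minimal sketch in which compaction is an optional, best-effort step: a caller might only attempt it for longs with at most one digit (the approximation by digit count), and any value that doesn't round-trip through the tag shift simply stays in long form. TAG_SHIFT, try_compact and the encoding are assumptions carried over from the earlier sketch, not the issue's design.

```c
#include <stdbool.h>
#include <stdint.h>

/* Best-effort compaction: packing into tagged form is allowed but never
 * required, so values near the threshold may exist in either format.
 * TAG_SHIFT matches the assumption in the earlier sketch (two tag bits). */
#define TAG_SHIFT 2

static bool
try_compact(intptr_t value, intptr_t *tagged_out)
{
    /* Shift via the unsigned type, then check the value survives the
       round trip; if not, leave it in "long" form. */
    intptr_t packed = (intptr_t)((uintptr_t)value << TAG_SHIFT);
    if ((packed >> TAG_SHIFT) != value) {
        return false;
    }
    *tagged_out = packed;   /* low tag bits 00 => compact */
    return true;
}
```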
Given that we don't need ridiculously large ints, we could drop a further bit to allow immortal "longs".

**Tagged bits**

[revised table of tagged_bits encodings not preserved]
|
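As a concrete illustration of the mark-bit idea (both for the small-int mark at the top of the issue and for immortal "longs" here), one extra bit of tagged_bits could flag immortality so deallocation can be skipped with a single test. The bit position and names below are assumptions, not the encoding chosen in the issue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit of tagged_bits flags an immortal int (small int
 * or immortal long), so the deallocation path can skip it cheaply.
 * The choice of bit 2 is an assumption for illustration only. */
#define IMMORTAL_BIT ((intptr_t)(1 << 2))

static inline bool
int_is_immortal(intptr_t tagged_bits)
{
    return (tagged_bits & IMMORTAL_BIT) != 0;
}
```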
Just a thought. Would it help to tweak the bits so that you can find the sign and/or immortality without having to check whether it is short or long? If you put the sign in the high bit for longs, you would get the sign check for free. |
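A minimal sketch of the suggestion above, assuming the sign occupies the most significant bit of tagged_bits in both the compact and the "long" encoding; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* If the sign lives in the top bit of tagged_bits for both representations,
 * one signed comparison answers "is this int negative?" without first
 * classifying the value as short or long. */
static inline bool
int_is_negative(intptr_t tagged_bits)
{
    return tagged_bits < 0;   /* MSB set <=> negative, for either form */
}
```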
Maybe it's also time to reconsider dropping 15- and 30-bit digits and going with 63-bit digits only instead? |
(This remark was the result of me not realizing that it was about "digits" and not "numbers". Feel free to ignore.) |
I'm not joking! I've spent quite a bit of time last year looking over ...

For instance, I don't think that 15-bit digits are that useful anymore, with 32-bit platforms becoming less and less relevant with every passing CPU generation; even with only a 30-bit digit implementation, the performance loss on those platforms would be mostly negligible compared to the cleanup of using a single digit width.

With time, moving from a 30-bit digit implementation to a 63-bit digit implementation should even improve performance a tad on 64-bit platforms -- by performing fewer iterations on most long operations and whatnot. It shouldn't increase memory usage significantly either; in most cases it should remain the same (e.g., a 1024-bit long would use 32 30-bit digits, and 16 63-bit digits, which would be 128 bytes either way).

If moving to a 63-bit digit, we would not have something akin to ...

Even if we decide that it's not yet time to get rid of 16- and 32-bit math in long integers, it might be an interesting exercise to at least remove the 16-bit math and keep only the 32-bit math. This would allow, for instance, using intrinsics such as ...

Of course, this goes a bit into non-portable territory, but implementations of these functions can be written in either assembly for some major CPU arches (e.g. aarch64), or portable C as a fallback. |
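The specific intrinsics mentioned above were lost from the text, so purely as an illustration of what full-word limb math buys: with 64-bit limbs, a carry chain can be expressed with compiler builtins such as GCC/Clang's __builtin_add_overflow rather than manual masking of 15- or 30-bit chunks. The function below is a generic sketch, not CPython code; the names and limb layout are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: add two n-limb magnitudes stored little-endian in
 * 64-bit words, returning the final carry.  A real 63-bit-digit CPython
 * would differ in layout and naming; this just shows how a compiler
 * builtin expresses the carry chain. */
static uint64_t
add_limbs(uint64_t *res, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t;
        int c1 = __builtin_add_overflow(a[i], b[i], &t);
        int c2 = __builtin_add_overflow(t, carry, &res[i]);
        carry = (uint64_t)(c1 | c2);   /* at most one carry can occur per step */
    }
    return carry;
}
```

Compilers can often turn this pattern into native add-with-carry instructions on x86-64 and aarch64, which is the kind of gain the comment alludes to.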
(Maybe @jneb thought you were proposing to limit int size to 63 bits? If someone proposed this I'd also assume they were joking. :-) |
That makes a whole lot more sense, and I agree :D |
**Speeding up arithmetic**

Python int arithmetic is very slow. We need to do a lot of work, checking the size and shape of ints before adding them, and then yet more work boxing them. We want to minimize the amount of work we do. Many arithmetic expressions consume temporary values which we can reuse, if we discard "small ints" and use information that the compiler can gather to determine whether an operation is one of the forms: ...

In the first case, the refcount of ...

The above is impossible with "small ints" because if the result is a small int, we need to use the immortal small object and free the temporary. We can still have small ints. They are useful in lots of places, but we need to get rid of the requirement that we must use them, that ...

**What the specialized form would look like**

We want the specialization of `BINARY_OP` to look something like this:

```c
inst(BINARY_OP_ADD_INT_REUSE_LEFT, (unused/1, left, right -- sum)) {
    assert(cframe.use_tracing == 0);
    DEOPT_IF(!PyLong_CheckExact(left), BINARY_OP);
    DEOPT_IF(Py_TYPE(right) != Py_TYPE(left), BINARY_OP);
    DEOPT_IF(Py_REFCNT(left) != 1, BINARY_OP);  /* Or two for the x += ... case */
    DEOPT_IF((left->tagged_bits | right->tagged_bits) & 3, BINARY_OP);  /* Both are mortal ints, not longs */
    DEOPT_IF(add_overflows(left->tagged_bits, right->tagged_bits), BINARY_OP);
    STAT_INC(BINARY_OP, hit);
    left->tagged_bits = (left->tagged_bits + right->tagged_bits) & ~(1<<2);
    _Py_DECREF_SPECIALIZED(right, (destructor)PyObject_Free);
    sum = left;  /* reuse the refcount-1 left operand as the result */
}
```

We attempted this before, but the code gets messy and very branchy handling all the special cases due to small ints and refcounts. We will want a bit of help from the compiler, to mark which binary ops are of the form ...

**Prototyping on floats**

We can implement the reference counting specializations first for floats. |
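The snippet above calls add_overflows without defining it; the issue doesn't spell out its implementation, so here is one portable way such a predicate could be written, purely as an illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Would a + b overflow the signed machine word?  Checked without actually
 * performing the addition, so there is no undefined behaviour on overflow.
 * Illustrative only; not taken from the issue. */
static inline bool
add_overflows(intptr_t a, intptr_t b)
{
    if (b > 0)
        return a > INTPTR_MAX - b;
    if (b < 0)
        return a < INTPTR_MIN - b;
    return false;
}
```

With GCC/Clang, __builtin_add_overflow could replace this, as in the earlier carry-chain sketch.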
CPython issue: python/cpython#101291 |
For a slightly better representation of longs you might consider this: https://ep2011.europython.eu/conference/talks/hacking-pylongobject-on-python-32.html

The above work can be combined with the introduction of "direct" ints which I implemented in WPython 1.1: https://code.google.com/archive/p/wpython2/downloads |
For the sake of newcomers (like me), what is the context of immortal ints? Are they known in advance (e.g. at compile time)? Incidentally, I was wondering whether immortality (for any object) could be stored outside of the integer bits, keeping them as close to the C implementation as possible. I see two ways to store immortality status outside of the object: ...

Perhaps the pointer parity trick could be used for the sign of long integers as well. Does it make sense? |
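The surviving text doesn't spell out the two ways, but one common reading of the "pointer parity trick" is to use the low bit of an aligned object pointer as a flag. The sketch below is that generic trick with hypothetical names; it is not how CPython stores immortality or sign.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: smuggle one flag (immortality, or a sign) into the
 * low bit of an aligned pointer.  Assumes objects are aligned to at least
 * 2 bytes, so that bit is otherwise always zero. */
#define FLAG_BIT ((uintptr_t)1)

static inline void *
ptr_with_flag(void *p, bool flag)
{
    return (void *)((uintptr_t)p | (flag ? FLAG_BIT : 0));
}

static inline bool
ptr_flag(const void *p)
{
    return ((uintptr_t)p & FLAG_BIT) != 0;
}

static inline void *
ptr_clear_flag(void *p)
{
    return (void *)((uintptr_t)p & ~FLAG_BIT);
}
```

A sign flag for longs could be smuggled the same way, at the cost of masking the bit off before every dereference.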
Moving this from the old discussion to here, as we aren't using discussions any more.