The context: when returning indices and/or counts of array elements, the unique_*() APIs may have to promote the data type of the returned array (currently the default integer type) depending on the situation, so they should have been using the "default array index data type" instead.
IIUC the term "default array index data type" is still referred to by some functions like argsort(). Copying @kgryte from #317 (comment):
Originally, when writing the unique specification, the output dtype was the "default index data type". I wonder if we need to revive that distinction. Namely, that an array library should have three default data types:
- a floating-point data type
- an integer data type
- an index data type
Here, having a default index data type makes sense, as counts should align accordingly (i.e., a count should never exceed the maximum array index).
Furthermore, while it may often be the case that indices will have the same dtype as the default integer dtype, this need not be the case. For example, indices may be int64, while the default integer dtype could be int32 due to better target hardware support (e.g., GPUs).
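As a concrete illustration of that split in NumPy (a sketch; NumPy's index type is np.intp, which may differ from the default integer dtype on some platforms):

```python
import numpy as np

x = np.asarray([3, 1, 3, 2])

# Index-returning APIs use the index type (np.intp in NumPy), which need
# not match the library's default integer dtype.
order = np.argsort(x)

# Counts are bounded by the array length, so they fit in the index type too.
values, counts = np.unique(x, return_counts=True)

print(order.dtype, counts.dtype)  # both np.intp
```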
Relevant part of NumPy docs here: "The native NumPy indexing type is intp and may differ from the default integer array type. intp is the smallest data type sufficient to safely index any array; for advanced indexing it may be faster than other types."
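A quick way to inspect what intp is on a given platform (output depends on the machine):

```python
import numpy as np

# np.intp is sized to the platform's pointer width, so it can safely
# index any array that fits in the address space.
print(np.dtype(np.intp).itemsize * 8)  # 64 on 64-bit platforms, 32 on 32-bit
print(np.iinfo(np.intp).max)           # largest representable index
```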
This has always been quite confusing. NumPy doesn't really explain it in its docs, and without making it a first-class concept and just saying "returns integer array" it will be unclear for example in what cases a function returns 32-bit integers rather than 64-bit.
I could be wrong, but the discussion around its behavior (specifically, when to promote) is still missing.
I would recommend that it not become a separate type, but rather that it's either int32 or int64, with the only difference being that the "indexing default" may be something like "int32 on 32-bit platforms, int64 otherwise".
If it's done like that, no separate casting rules are needed.
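Under that approach the index dtype would participate in ordinary integer promotion, as it already does in NumPy today (a sketch):

```python
import numpy as np

idx = np.argsort(np.asarray([2, 0, 1]))  # dtype is np.intp

# Mixing the index dtype with int64 follows the normal integer
# promotion table; no index-specific casting rule is required.
print((idx + np.int64(1)).dtype)
```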