I do not think these are urgent problems. However, the optimization may be useful in the future to speed up accumulation (reduce). I also just found an (ugly) workaround for the constant division problem, so I am writing this issue as a memorandum or reminder.
The figures below visualize the cases where the optimization is available.
The positive (greenish) area means that the conversion is overflow-safe (i.e. there is no need for range checking).
The negative (reddish) area means that the conversion is unsafe (i.e. it may throw an exception).
The deep-colored cells mean that the conversion does not need floating-point operations.
As mentioned above, the f1 == f2 lines are already supported.
You can get the result of other cases with the following script:
There is also the N0f8->N0f16 specialization. (I wonder why.)
I can't remember for sure, it seemed important at the time 😄. (Probably, performance.) But it's certainly more specific than needed. Julia's compiler was not what it is now, so the implementations in #138 probably would have resulted in a big branch.
And I agree, Normed->Normed conversions may be a good way to improve reduce. Kudos as always for looking into this!
As I suggested here, the current Normed->Normed conversions are inefficient in some cases.
(FixedPointNumbers.jl/src/normed.jl, lines 41 to 46 in 8d17739)
The current conversion method has two problems:
The former means that the method is not SIMD-suitable.
Regarding the latter, the conversion between types with the same f is already specialized.
(FixedPointNumbers.jl/src/normed.jl, line 13 in ee5bd54)
There is also the N0f8->N0f16 specialization. (I wonder why.)
(FixedPointNumbers.jl/src/normed.jl, line 47 in ee5bd54)
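For illustration, N0f8->N0f16 is one conversion that can be done with integer arithmetic alone, because 0xffff == 0x0101 * 0xff. The following is only a minimal sketch on raw values using plain Julia integers. The function names are hypothetical and this is not the package's actual implementation.

```julia
# Sketch only: a Normed value with f fraction bits is assumed to be stored
# as a raw integer i that represents i / (2^f - 1).

# Hypothetical helper: N0f8 -> N0f16 on raw values.
# 0xffff == 0x0101 * 0xff, so the scale factor is the integer 0x0101 and
# the product never exceeds 0xffff (no range check, no floating point).
n0f8_to_n0f16_raw(i::UInt8) = UInt16(i) * 0x0101

# Reference path for comparison: rescale through Float64 and round.
n0f8_to_n0f16_ref(i::UInt8) = round(UInt16, i / 0xff * 0xffff)

# Both paths agree for all 256 possible inputs.
@assert all(n0f8_to_n0f16_raw(i) == n0f8_to_n0f16_ref(i) for i in 0x00:0xff)
```

The same trick should apply whenever f1 divides f2, since 2^f1 - 1 then divides 2^f2 - 1 and the scale factor is an integer. Whether that matches the deep-colored cells in the figures exactly is not something this sketch verifies.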
Edit:
The safe areas are wrong. I forgot to take account of the "carry" or "overlapping". I will soon fix the figures and the script above.
Updated.