On any target which has native support for `Float16` (e.g. Apple Silicon), you can see that `power_by_squaring` is doing unnecessary promotion to higher precision (`Float64`, not just `Float32`):
julia> code_llvm((Float16,); debuginfo=:none) do (x) x ^ 2 end
; Function Signature: var"#9"(Float16)
define half @"julia_#9_2873"(half %"x::Float16") #0 {
top:
%0 = fpext half %"x::Float16" to double
%1 = call double @"j_#power_by_squaring#522_2882"(double %0, i16 signext 2)
%2 = fptrunc double %1 to float
%3 = fptrunc float %2 to half
ret half %3
}
julia> code_llvm((Float16,); debuginfo=:none) do (x) abs2(x) end
; Function Signature: var"#11"(Float16)
define half @"julia_#11_2885"(half %"x::Float16") #0 {
top:
%0 = fmul half %"x::Float16", %"x::Float16"
ret half %0
}
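A quick way to check that the widening buys nothing numerically for squaring is to compare `x^2` against `x * x` over every finite `Float16`. This is only a sketch: the names `finite_halves` and `mismatches` are made up for illustration, and the expectation of zero mismatches rests on the assumption that the product of two 11-bit significands is exact in `Float32` and `Float64`, so both paths round to `Float16` exactly once.

```julia
# All finite Float16 values, built from their 16-bit patterns.
# `finite_halves` and `mismatches` are illustrative names, not from the issue.
finite_halves = [x for x in (reinterpret(Float16, u) for u in 0x0000:0xffff) if isfinite(x)]

# Count inputs where the literal-power path and the direct multiply disagree.
mismatches = count(x -> x^2 != x * x, finite_halves)

println("finite inputs: ", length(finite_halves), ", mismatches: ", mismatches)
```

If the two paths agree everywhere, the remaining cost of the promotion is the extra conversions and the non-inlined call, not accuracy.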
Edit: @jakobnissen pointed out this is happening around
Lines 1343 to 1349 in 912460b
function ^(x::Float32, n::Integer)
    n == -2 && return (i=inv(x); i*i)
    n == 3 && return x*x*x #keep compatibility with literal_pow
    n < 0 && return Float32(Base.power_by_squaring(inv(Float64(x)),-n))
    Float32(Base.power_by_squaring(Float64(x),n))
end
@inline ^(x::Float16, y::Integer) = Float16(Float32(x) ^ y)
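To make the chain of conversions explicit, the two methods above can be collapsed into one function. This is a rough sketch, not the Base implementation: `pow_via_widening` is a hypothetical name, it only covers non-negative exponents, and it skips the `n == 3` special case. It relies on `Base.power_by_squaring`, which is an internal, unexported function.

```julia
# Hypothetical helper (not part of Base): spells out the widen/compute/narrow
# steps that the IR shows as fpext -> power_by_squaring -> fptrunc -> fptrunc.
function pow_via_widening(x::Float16, n::Integer)
    n >= 0 || throw(DomainError(n, "this sketch only covers non-negative exponents"))
    wide = Base.power_by_squaring(Float64(x), n)  # computed in Float64
    return Float16(Float32(wide))                 # narrowed to Float32, then Float16
end

pow_via_widening(Float16(1.5), 2)
```

For `n == 2` all of this work reduces to a single multiply, which is why the `fmul half` seen for `abs2` is the IR one would hope for.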
For comparison, `Float32` with a literal exponent compiles to a single multiply:
julia> code_llvm((Float32,); debuginfo=:none) do x x^2 end
; Function Signature: var"#59"(Float32)
define float @"julia_#59_13034"(float %"x::Float32") #0 {
top:
%0 = fmul float %"x::Float32", %"x::Float32"
ret float %0
}
@MasonProtter showed that's because of the `literal_pow` method at
Line 370 in 912460b
@inline literal_pow(::typeof(^), x::HWNumber, ::Val{2}) = x*x
which `Float16` never reaches, since it isn't in `HWReal`:
Lines 362 to 363 in 912460b
const HWReal = Union{Int8,Int16,Int32,Int64,UInt8,UInt16,UInt32,UInt64,Float32,Float64}
const HWNumber = Union{HWReal, Complex{<:HWReal}, Rational{<:HWReal}}
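The dispatch pattern can be illustrated outside Base without overloading anything that belongs to it. This is a minimal sketch of one possible direction, not necessarily what was done upstream: `WideHWReal` and `square_literal` are hypothetical names, and the union is simply the `HWReal` list quoted above with `Float16` added.

```julia
# Hypothetical names for illustration only; nothing here modifies Base.
# Same members as the HWReal union quoted above, plus Float16.
const WideHWReal = Union{Int8,Int16,Int32,Int64,UInt8,UInt16,UInt32,UInt64,
                         Float16,Float32,Float64}

# Fast path mirroring the Val{2} method: a plain multiply in the input type.
@inline square_literal(x::WideHWReal) = x * x

# Fallback for every other type: defer to the generic power.
@inline square_literal(x) = x^2

square_literal(Float16(1.5))
```

Running `code_llvm(square_literal, (Float16,); debuginfo=:none)` on a target with native `Float16` support is one way to check whether the result matches the `abs2` IR shown earlier.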