Skip to content

Avoid mutation on user x #280

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 29, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 12 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -282,7 +282,7 @@ H = numauto_color_hessian(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, colorvec, sparsity)
```

To avoid unnecessary allocations every time the Hessian is computed,
To avoid unnecessary allocations every time the Hessian is computed,
construct a `ForwardColorHesCache` beforehand:

```julia
Expand All @@ -292,7 +292,7 @@ numauto_color_hessian!(H, f, x, hescache)

By default, these methods use a mix of numerical and automatic differentiation,
namely by taking finite differences of gradients calculated via ForwardDiff.jl.
Alternatively, if you have your own custom gradient function `g!`, you can specify
Alternatively, if you have your own custom gradient function `g!`, you can specify
it as an argument to `ForwardColorHesCache`:

```julia
Expand All @@ -301,7 +301,6 @@ hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)
Note that any user-defined gradient needs to have the signature `g!(G, x)`,
i.e. updating the gradient `G` in place.


### Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-Vector and Hessian-Vector products is
Expand All @@ -322,7 +321,8 @@ auto_jacvec(f, x, v)

# If compute_f0 is false, then `f(cache1,x)` will be computed
num_jacvec!(dy,f,x,v,cache1 = similar(v),
cache2 = similar(v);
cache2 = similar(v),
cache3 = similar(v);
compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
```
Expand All @@ -333,14 +333,16 @@ For Hessians, the following are provided:
num_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))
cache3 = similar(v),
cache4 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(dy,f,x,v,
cache = ForwardDiff.GradientConfig(f,v),
cache1 = similar(v),
cache2 = similar(v))
cache2 = similar(v),
cache3 = similar(v))

numauto_hesvec(f,x,v)

Expand All @@ -358,6 +360,7 @@ respectively:

```julia
num_hesvecgrad!(dy,g,x,v,
cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))

Expand All @@ -384,7 +387,8 @@ using Zygote # Required

numback_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v))
cache2 = similar(v),
cache3 = similar(v))

numback_hesvec(f,x,v)

Expand All @@ -396,7 +400,7 @@ autoback_hesvec!(dy,f,x,v,
autoback_hesvec(f,x,v)
```

#### J*v and H*v Operators
#### `J*v` and `H*v` Operators

The following produce matrix-free operators which are used for calculating
Jacobian-vector and Hessian-vector products where the differentiation takes
Expand Down
22 changes: 13 additions & 9 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -244,7 +244,7 @@ forwarddiff_color_jacobian!(J::AbstractMatrix{<:Number},
`dx` is a pre-allocated output vector which is used to declare the output size,
and thus allows for specifying a non-square Jacobian.

Also, it is possible retrieve the function value via `value(jac_cache)` or
Also, it is possible retrieve the function value via `value(jac_cache)` or
`value!(result, jac_cache)`


Expand Down Expand Up @@ -276,7 +276,7 @@ H = numauto_color_hessian(f, x, colorvec, sparsity)
numauto_color_hessian!(H, f, x, colorvec, sparsity)
```

To avoid unnecessary allocations every time the Hessian is computed,
To avoid unnecessary allocations every time the Hessian is computed,
construct a `ForwardColorHesCache` beforehand:

```julia
Expand All @@ -286,7 +286,7 @@ numauto_color_hessian!(H, f, x, hescache)

By default, these methods use a mix of numerical and automatic differentiation,
namely by taking finite differences of gradients calculated via ForwardDiff.jl.
Alternatively, if you have your own custom gradient function `g!`, you can specify
Alternatively, if you have your own custom gradient function `g!`, you can specify
it as an argument to `ForwardColorHesCache`:

```julia
Expand All @@ -295,7 +295,6 @@ hescache = ForwardColorHesCache(f, x, colorvec, sparsity, g!)
Note that any user-defined gradient needs to have the signature `g!(G, x)`,
i.e. updating the gradient `G` in place.


### Jacobian-Vector and Hessian-Vector Products

Matrix-free implementations of Jacobian-Vector and Hessian-Vector products is
Expand All @@ -316,7 +315,8 @@ auto_jacvec(f, x, v)

# If compute_f0 is false, then `f(cache1,x)` will be computed
num_jacvec!(dy,f,x,v,cache1 = similar(v),
cache2 = similar(v);
cache2 = similar(v),
cache3 = similar(v);
compute_f0 = true)
num_jacvec(f,x,v,f0=nothing)
```
Expand All @@ -327,14 +327,16 @@ For Hessians, the following are provided:
num_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))
cache3 = similar(v),
cache4 = similar(v))

num_hesvec(f,x,v)

numauto_hesvec!(dy,f,x,v,
cache = ForwardDiff.GradientConfig(f,v),
cache1 = similar(v),
cache2 = similar(v))
cache2 = similar(v),
cache3 = similar(v))

numauto_hesvec(f,x,v)

Expand All @@ -352,6 +354,7 @@ respectively:

```julia
num_hesvecgrad!(dy,g,x,v,
cache1 = similar(v),
cache2 = similar(v),
cache3 = similar(v))

Expand All @@ -378,7 +381,8 @@ using Zygote # Required

numback_hesvec!(dy,f,x,v,
cache1 = similar(v),
cache2 = similar(v))
cache2 = similar(v),
cache3 = similar(v))

numback_hesvec(f,x,v)

Expand All @@ -390,7 +394,7 @@ autoback_hesvec!(dy,f,x,v,
autoback_hesvec(f,x,v)
```

#### J*v and H*v Operators
#### `J*v` and `H*v` Operators

The following produce matrix-free operators which are used for calculating
Jacobian-vector and Hessian-vector products where the differentiation takes
Expand Down
11 changes: 5 additions & 6 deletions ext/SparseDiffToolsZygoteExt.jl
Original file line number Diff line number Diff line change
Expand Up @@ -45,18 +45,17 @@ end

### Jac, Hes products

function numback_hesvec!(dy, f::F, x, v, cache1 = similar(v), cache2 = similar(v)) where {F}
function numback_hesvec!(dy, f::F, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v)) where {F}
g = let f = f
(dx, x) -> dx .= first(Zygote.gradient(f, x))
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
@. x += ϵ * v
g(cache1, x)
@. x -= 2ϵ * v
g(cache2, x)
@. x += ϵ * v
@. cache3 = x + ϵ * v
g(cache1, cache3)
@. cache3 = x - ϵ * v
g(cache2, cache3)
@. dy = (cache1 - cache2) / (2ϵ)
end

Expand Down
42 changes: 19 additions & 23 deletions src/differentiation/jaches_products.jl
Original file line number Diff line number Diff line change
Expand Up @@ -32,16 +32,15 @@ function auto_jacvec(f, x, v)
vec(partials.(vec(f(y)), 1))
end

function num_jacvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v);
function num_jacvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v);
compute_f0 = true)
vv = reshape(v, axes(x))
compute_f0 && (f(cache1, x))
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
@. x += ϵ * vv
f(cache2, x)
@. x -= ϵ * vv
@. cache3 = x + ϵ * vv
f(cache2, cache3)
vecdy = _vec(dy)
veccache1 = _vec(cache1)
veccache2 = _vec(cache2)
Expand All @@ -58,19 +57,18 @@ function num_jacvec(f, x, v, f0 = nothing)
end

function num_hesvec!(dy, f, x, v, cache1 = similar(v), cache2 = similar(v),
cache3 = similar(v))
cache3 = similar(v), cache4 = similar(v))
cache = FiniteDiff.GradientCache(v[1], cache1, Val{:central})
g = let f = f, cache = cache
(dx, x) -> FiniteDiff.finite_difference_gradient!(dx, f, x, cache)
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
@. x += ϵ * v
g(cache2, x)
@. x -= 2ϵ * v
g(cache3, x)
@. x += ϵ * v
@. cache4 = x + ϵ * v
g(cache2, cache4)
@. cache4 = x - ϵ * v
g(cache3, cache4)
@. dy = (cache2 - cache3) / (2ϵ)
end

Expand All @@ -87,18 +85,17 @@ function num_hesvec(f, x, v)
end

function numauto_hesvec!(dy, f, x, v, cache = ForwardDiff.GradientConfig(f, v),
cache1 = similar(v), cache2 = similar(v))
cache1 = similar(v), cache2 = similar(v), cache3 = similar(v))
g = let f = f, x = x, cache = cache
g = (dx, x) -> ForwardDiff.gradient!(dx, f, x, cache)
end
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
@. x += ϵ * v
g(cache1, x)
@. x -= 2ϵ * v
g(cache2, x)
@. x += ϵ * v
@. cache3 = x + ϵ * v
g(cache1, cache3)
@. cache3 = x - ϵ * v
g(cache2, cache3)
@. dy = (cache1 - cache2) / (2ϵ)
end

Expand Down Expand Up @@ -137,16 +134,15 @@ function autonum_hesvec(f, x, v)
partials.(g(Dual{DeivVecTag}.(x, v)), 1)
end

function num_hesvecgrad!(dy, g, x, v, cache2 = similar(v), cache3 = similar(v))
function num_hesvecgrad!(dy, g, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v))
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
@. x += ϵ * v
g(cache2, x)
@. x -= 2ϵ * v
g(cache3, x)
@. x += ϵ * v
@. dy = (cache2 - cache3) / (2ϵ)
@. cache3 = x + ϵ * v
g(cache1, cache3)
@. cache3 = x - ϵ * v
g(cache2, cache3)
@. dy = (cache1 - cache2) / (2ϵ)
end

function num_hesvecgrad(g, x, v)
Expand Down
16 changes: 9 additions & 7 deletions src/differentiation/vecjac_products.jl
Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
function num_vecjac!(du, f::F, x, v, cache1 = similar(v), cache2 = similar(v);
function num_vecjac!(du, f::F, x, v, cache1 = similar(v), cache2 = similar(v), cache3 = similar(v);
compute_f0 = true) where {F}
compute_f0 && (f(cache1, x))
T = eltype(x)
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
vv = reshape(v, size(cache1))
cache3 .= x
for i in 1:length(x)
x[i] += ϵ
f(cache2, x)
x[i] -= ϵ
cache3[i] += ϵ
f(cache2, cache3)
cache3[i] = x[i]
du[i] = (((cache2 .- cache1) ./ ϵ)' * vv)[1]
end
return du
Expand All @@ -21,10 +22,11 @@ function num_vecjac(f::F, x, v, f0 = nothing) where {F}
# Should it be min? max? mean?
ϵ = sqrt(eps(real(T))) * max(one(real(T)), abs(norm(x)))
du = similar(x)
cache = copy(x)
for i in 1:length(x)
x[i] += ϵ
cache[i] += ϵ
f0 = f(x)
x[i] -= ϵ
cache[i] = x[i]
du[i] = (((f0 .- _f0) ./ ϵ)' * vv)[1]
end
return vec(du)
Expand Down Expand Up @@ -91,7 +93,7 @@ function VecJac(f, u::AbstractArray, p = nothing, t = nothing; fu = nothing,
end

function _vecjac(f::F, fu, u, autodiff::AutoFiniteDiff) where {F}
cache = (similar(fu), similar(fu))
cache = (similar(fu), similar(fu), similar(fu))
pullback = nothing
return AutoDiffVJP(f, u, cache, autodiff, pullback)
end
Expand Down