diff --git a/README.md b/README.md
index fe8fb9f8..7a8645a7 100644
--- a/README.md
+++ b/README.md
@@ -21,8 +21,8 @@
 Optimisers.jl defines many standard gradient-based optimisation rules, and tools for
 applying them to deeply nested models.
 
-This is the future of training for [Flux.jl](https://github.com/FluxML/Flux.jl) neural networks,
-and the present for [Lux.jl](https://github.com/avik-pal/Lux.jl).
+This was written as a new training back-end for [Flux.jl](https://github.com/FluxML/Flux.jl) neural networks,
+and is also used by [Lux.jl](https://github.com/avik-pal/Lux.jl).
 But it can be used separately on any array, or anything else understood by [Functors.jl](https://github.com/FluxML/Functors.jl).
 
 ## Installation
@@ -38,17 +38,23 @@
 It is initialised by `setup`, and then at each step, `update` returns both the new
 state, and the model with its trainable parameters adjusted:
 
 ```julia
-state = Optimisers.setup(Optimisers.Adam(), model) # just once
+state_tree = Optimisers.setup(Optimisers.Adam(), model) # just once
 
 grad = Zygote.gradient(m -> loss(m(x), y), model)[1]
 
-state, model = Optimisers.update(state, model, grad) # at every step
+state_tree, model = Optimisers.update(state_tree, model, grad) # at every step
 ```
 
 For models with deeply nested layers containing the parameters (like [Flux.jl](https://github.com/FluxML/Flux.jl) models),
-this state is a similarly nested tree. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
+this `state_tree` is a similarly nested object. As is the gradient: if using Zygote, you must use the "explicit" style as shown,
 not the "implicit" one with `Params`.
 
+You can change the learning rate during training by mutating the state tree, using `adjust!`:
+
+```julia
+Optimisers.adjust!(state_tree, 0.01)
+```
+
 The function `destructure` collects all the trainable parameters into one vector,
 and returns this along with a function to re-build a similar model:
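
For review convenience, here is a short end-to-end sketch of the workflow the revised README describes (`setup` → `gradient` → `update`, then `adjust!` and `destructure`). The toy NamedTuple model, the data, and the `loss` definition below are assumptions made purely for illustration; they are not part of the README or this diff.

```julia
# Illustrative only: a toy NamedTuple "model", fake data, and a hand-written loss.
# None of this is in the README; it just exercises the API named in the diff above.
using Optimisers, Zygote

model = (weight = randn(2, 3), bias = zeros(2))           # any Functors-compatible nest of arrays
x, y = randn(3), randn(2)                                 # made-up data
loss(m, x, y) = sum(abs2, m.weight * x .+ m.bias .- y)    # simple squared error

state_tree = Optimisers.setup(Optimisers.Adam(), model)   # just once
grad = Zygote.gradient(m -> loss(m, x, y), model)[1]      # explicit-style gradient
state_tree, model = Optimisers.update(state_tree, model, grad)  # at every step

Optimisers.adjust!(state_tree, 0.01)                      # lower the learning rate mid-training

flat, re = Optimisers.destructure(model)                  # flat vector of trainable parameters
model2 = re(flat)                                         # rebuild a model of the same shape
```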