This repository was archived by the owner on Apr 23, 2025. It is now read-only.

Complex Loss Functions Inside TrainingLoop #720

@xanderdunn

Description

Thanks to @xihui-wu's talk earlier today, I learned about the TrainingLoop struct. I had essentially replicated this functionality in a messier way in my code, so I'm looking at it to see if I could replace my train loop with this cleaner implementation. I believe the only issue I might face is with respect to the loss function.

The current loss function takes only the model's output and the target as parameters: public typealias F = @differentiable (Output, @noDerivative Target) -> Tensor<Float> from here. This covers the vast majority of supervised training situations, but there are cases where we might want a more complicated loss function. For example, how might we mask the output and the target for each sample when computing the loss, as done in this paper:

at each time step the model tries to predict the full, uncorrupted input vectors xt; however, only the predictions on the masked values are considered in the Mean Squared Error loss.
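For concreteness, a masked MSE of this kind might look roughly like the following. This is only a sketch: it uses plain [Float] arrays in place of Tensor<Float>, maskedMSE is a hypothetical name, and a real implementation would need to be @differentiable:

```swift
// Masked mean squared error: only positions where mask[i] == true
// contribute to the loss, matching the paper's masked-prediction setup.
// Sketch only — a real version would operate on Tensor<Float>.
func maskedMSE(predicted: [Float], target: [Float], mask: [Bool]) -> Float {
    var sum: Float = 0
    var count = 0
    for i in predicted.indices where mask[i] {
        let diff = predicted[i] - target[i]
        sum += diff * diff
        count += 1
    }
    return count > 0 ? sum / Float(count) : 0
}
```

The crux of the question is that the mask varies per sample, but the F typealias above has no third parameter through which it could be passed.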

Another situation that comes to mind is a loss that requires some third, external set of values. Perhaps this is an RL agent whose current loss is a function of recent past losses. Another example is a risk-adjusted metric where the "risk" depends on some external value that is not static.
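For the external-value case specifically, one workaround that fits the existing two-parameter shape is a closure that captures mutable external state. A minimal sketch, with hypothetical names and plain Float in place of tensors:

```swift
// Hypothetical risk-adjusted loss: the penalty depends on an external,
// non-static risk value captured by the closure. Sketch only — the real
// TrainingLoop loss would need to stay @differentiable in its first argument.
final class RiskState {
    var risk: Float = 1.0  // updated elsewhere, e.g. by the environment
}

func makeRiskAdjustedLoss(state: RiskState) -> (Float, Float) -> Float {
    return { predicted, target in
        let err = predicted - target
        return err * err * state.risk  // squared error scaled by current risk
    }
}
```

Capturing works for state that changes between steps, but not for values like the mask above that must arrive alongside each individual batch.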

Is my reading of the code correct that these types of loss functions are not currently supported? If so, could the protocol reasonably be modified to optionally support such complex loss functions?

Many thanks!
