Sweep Range of L2RegularizerWeight in AveragedPerceptron #579

Merged · 3 commits merged into dotnet:master on Jul 25, 2018

Conversation

@SolyarA (Contributor) commented on Jul 24, 2018

Fixes #567

@dnfclas commented on Jul 24, 2018

CLA assistant check
All CLA requirements met.

@TomFinley (Contributor) left a comment

Thank you @SolyarA ... this looks OK to me. @justinormont, is it along the lines of what you wanted in #567?

@justinormont (Contributor) commented

@TomFinley, @Zruty0: what are the implications of missing the range of 0.4 to 0.5 in L2RegularizerWeight for AveragedPerceptron?

A better fix for this would be to add a parameter to the sweep range noting whether each range boundary is inclusive or exclusive.
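
One hypothetical shape for that (names invented for illustration; this is not the actual ML.NET sweeper API):

```csharp
// Hypothetical sketch: a float sweep range that records whether each
// boundary is inclusive, so an interval like [0, 0.5) can be expressed directly.
public sealed class FloatParamRange
{
    public float Min;
    public float Max;
    public bool MinInclusive = true; // false would mean (Min, ...
    public bool MaxInclusive = true; // false would mean ..., Max)
}

// L2RegularizerWeight is valid on [0, 0.5), so the sweeper could be told:
var l2Range = new FloatParamRange { Min = 0f, Max = 0.5f, MaxInclusive = false };
```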

@Zruty0 (Contributor) left a comment

:shipit:

@Zruty0 (Contributor) commented on Jul 25, 2018

I cannot say what the implications are; I don't have an intuition for how the optimizer would behave close to the 0.5 boundary.

As for inclusive vs. exclusive boundaries: I believe they were originally included in the code (as Min/Max and also Inf/Lim, or something along those lines), but I assume it was deemed unnecessary complexity and removed? I can't find any trace of it anymore.



@TomFinley (Contributor) commented

I'm not sure "close" to 0.5 is actually a completely sensible value. See here:

```csharp
// Weights are scaled down by 2 * L2 regularization on each update step,
// so 0.5 would scale all weights to 0, which is not sensible.
Contracts.CheckUserArg(0 <= args.L2RegularizerWeight && args.L2RegularizerWeight < 0.5,
    nameof(args.L2RegularizerWeight), "must be in range [0, 0.5)");
```

and here:

```csharp
WeightsScale *= 1 - 2 * Args.L2RegularizerWeight; // L2 regularization.
```

Indeed, I feel like this is all somewhat haphazard, and whoever introduced this sweep range made the mistake of confusing the sweep range with defining the valid values... which is not the point at all. Anyway, I'm inclined to just accept this, @justinormont, if that is all right. It seems, though, that if we are going to have continuous values, the notion of inclusive vs. exclusive bounds needs to be accounted for somehow; I'm not sure why such a concept would be removed. 😦
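
As an illustration of why values just below 0.5 are also suspect (a standalone sketch, not code from the trainer; it just iterates the update rule quoted above):

```csharp
using System;

// Sketch: cumulative effect of WeightsScale *= 1 - 2 * L2RegularizerWeight
// over repeated update steps, for values near the 0.5 boundary.
float[] l2Values = { 0.1f, 0.4f, 0.49f, 0.4999f };
foreach (float l2 in l2Values)
{
    float scale = 1f;
    for (int step = 0; step < 10; step++)
        scale *= 1 - 2 * l2; // per-update shrinkage
    Console.WriteLine($"L2={l2}: weights scale after 10 updates = {scale:E2}");
}
// At L2 = 0.5 a single step zeroes the weights; just below 0.5 they
// collapse toward zero almost as quickly.
```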

@TomFinley merged commit 7fea0af into dotnet:master on Jul 25, 2018
@justinormont (Contributor) commented

Thanks for pushing it in. And thanks, @SolyarA, for your PR.

@TomFinley: Agreed; we will want to reduce the ranges of the sweep params from the valid ranges to the useful ranges. This will speed up hyperparameter optimization. The only reason I see to keep the ranges as wide as the valid ranges is if we can find examples where the extreme values led to good scores. We have further ideas on how to focus the sweeper's energy on useful ranges of hyperparameters, so perhaps the work of figuring out the useful ranges won't be needed.
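
A minimal sketch of the idea (plain uniform sampling, not the ML.NET sweeper; the 0.4 upper bound is the one discussed in this PR):

```csharp
using System;

// Sketch: with a fixed trial budget, narrowing the sweep range
// concentrates samples where good values are expected to live.
var rng = new Random(42);
float Sample(float min, float max) => min + (float)rng.NextDouble() * (max - min);

for (int trial = 0; trial < 5; trial++)
{
    float l2 = Sample(0f, 0.4f); // narrowed upper bound per this PR
    Console.WriteLine($"trial {trial}: L2RegularizerWeight = {l2:F3}");
}
```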

eerhardt pushed a commit to eerhardt/machinelearning that referenced this pull request Jul 27, 2018
* Changed range of L2RegularizerWeight parameter in AveragedPerceptron
codemzs pushed a commit to codemzs/machinelearning that referenced this pull request Aug 1, 2018
* Changed range of L2RegularizerWeight parameter in AveragedPerceptron
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022