Skip to content

Cleanup calibrators #2601

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 19, 2019
Merged

Conversation

Ivanidzo4ka
Copy link
Contributor

Fixes #2589

@codecov
Copy link

codecov bot commented Feb 18, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@ba75a47). Click here to learn what that means.
The diff coverage is 70.32%.

@@            Coverage Diff            @@
##             master    #2601   +/-   ##
=========================================
  Coverage          ?    71.5%           
=========================================
  Files             ?      800           
  Lines             ?   142059           
  Branches          ?    16148           
=========================================
  Hits              ?   101580           
  Misses            ?    36008           
  Partials          ?     4471
Flag Coverage Δ
#Debug 71.5% <70.32%> (?)
#production 67.79% <70.32%> (?)
#test 85.56% <ø> (?)

@@ -1180,7 +1180,7 @@ private void SaveCore(ModelSaveContext ctx)
ctx.Writer.Write(sizeof(float));
ctx.Writer.Write(BinSize);
ctx.Writer.Write(Min);
ctx.Writer.WriteSingleArray(BinProbs);
ctx.Writer.WriteSingleArray(BinProbs as float[]);
Copy link
Contributor

@TomFinley TomFinley Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as float[] [](start = 48, length = 11)

Let's not please... let's do a backing private field of float[] that we actually operate over. This will have a couple of advantages, most notably not having this bad code smell, but also incidentally making access to this structure much faster since it won't have to work through the virtual method table on every single usage.

Edit: Looked at the rest of the PR, everything else looks great except for this #Resolved

Copy link
Contributor Author

@Ivanidzo4ka Ivanidzo4ka Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make completely clear.
We will have:

public IReadOnlyList<float> BinProbs => _binProbs;
private readonly float[] _binProbs;

And for loading and saving we will use _binProbs
Does it sound good? #Resolved

Copy link
Contributor

@TomFinley TomFinley Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And other operators too, yes. The cast is ugly, but working through the interface on something meant to process every single example will unnecessarily work through the virtual table.

Similar to usual reasons of, say, declaring something as IList if you know it is List, etc. etc. etc. #Resolved

@@ -1205,11 +1207,6 @@ internal static int GetBinIdx(float output, float min, float binSize, int numBin
return binIdx;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. This code makes my eyes bleed but if no one has complained after all these years we can probably let it stand for a bit. :)

Copy link
Contributor

@TomFinley TomFinley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Ivanidzo4ka !!

@@ -1218,8 +1215,91 @@ public string GetSummary()
[BestFriend]
internal abstract class CalibratorTrainerBase : ICalibratorTrainer
{
public sealed class DataStore : IEnumerable<DataStore.DataItem>
{
public readonly struct DataItem
Copy link
Contributor

@rogancarr rogancarr Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public readonly struct DataItem [](start = 12, length = 31)

Really vague name for a class visible throughout the assembly. #Resolved

Copy link
Contributor

@TomFinley TomFinley Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I see it is a nested class inside the internal CalibratorTrainerBase? So someone would need to have referenced CalibratorTrainerBase to see it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A) it's part of internal class.
B) I'm making it private protected.
C) I'm just moving this code around to hide it, I don't have much intent in improving it's right now


In reply to: 258183106 [](ancestors = 258183106)

@@ -1218,8 +1215,91 @@ public string GetSummary()
[BestFriend]
internal abstract class CalibratorTrainerBase : ICalibratorTrainer
{
public sealed class DataStore : IEnumerable<DataStore.DataItem>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public sealed class DataStore [](start = 8, length = 29)

Really vague name. Can we think of something more precise?

private bool _dataSorted;

public DataStore()
: this(1000000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1000000 [](start = 23, length = 7)

Any reason?

{
}

public DataStore(int capacity)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public DataStore(int capacity) [](start = 12, length = 30)

Maybe instead of two constructors, we just do int capacity=defaultValue?


_capacity = capacity;
_data = new DataItem[Math.Min(4, capacity)];
// REVIEW: Horrifying. At a point when we have the IHost stuff plumbed through
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Horrifying [](start = 27, length = 10)

The horror


public void AddToStore(float score, bool isPositive, float weight)
{
// Can't calibrate NaN scores.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Can't calibrate NaN scores. [](start = 15, length = 31)

Isn't it BAD to get NaN scores?

@@ -1485,15 +1565,21 @@ private static VersionInfo GetVersionInfo()

private readonly IHost _host;

/// <summary>
/// Slope value for this calibrator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slope value for this calibrator. [](start = 12, length = 32)

Maybe use the phrases bias and weight instead of Slope and Offset? This is more consistent with our public API.

@@ -1556,14 +1642,15 @@ private void SaveCore(ModelSaveContext ctx)
}
}

/// <summary> Given a classifier output, produce the probability.</summary>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ <summary [](start = 10, length = 10)

break pls

/// <item><description><see cref="Values"/>[0], if x &lt; <see cref="Mins"/>[0]</description></item>
/// <item><description><see cref="Values"/>[n], if x &gt; <see cref="Maxes"/>[n]</description></item>
///</list>
/// </remarks>
public sealed class PavCalibrator : ICalibrator, ICanSaveInBinaryFormat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too hard, can you provide a link to any resources about this kind of calibration? You could also file an issue and assign it to the docs folks.

@@ -1851,6 +1947,7 @@ private void SaveCore(ModelSaveContext ctx)
_host.CheckDecode(valuePrev <= 1);
}

/// <summary> Given a classifier output, produce the probability.</summary>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 12, length = 9)

break

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fits the line, why it need break?


In reply to: 258187676 [](ancestors = 258187676)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the style was to break the xml tags onto their own lines, but I could be mistaken.


In reply to: 258188999 [](ancestors = 258188999,258187676)

Copy link
Contributor

@rogancarr rogancarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@rogancarr
Copy link
Contributor

Just a few nits, although I realize it's nits on copy/pasted code, so feel free to defer those to future maintenance fixes.

@Ivanidzo4ka Ivanidzo4ka merged commit 25404b8 into dotnet:master Feb 19, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants