Adding sample for LightGbm ranking #2729

Merged
merged 16 commits into from
Feb 26, 2019

Conversation

najeeb-kazmi
Member

Replacing PR #2704 and #2650 as I messed up commit history there.

Fixes #2530
Fixes #776

  • Adds a sample for LightGbm ranking.
  • Cleans up namespaces in Microsoft.ML.Samples project.
  • Addresses feedback from previous PRs

@najeeb-kazmi changed the title from "2530" to "Adding sample for LightGbm ranking" on Feb 26, 2019
@codecov
codecov bot commented Feb 26, 2019

Codecov Report

Merging #2729 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #2729      +/-   ##
==========================================
- Coverage   71.67%   71.65%   -0.02%     
==========================================
  Files         808      808              
  Lines      142261   142296      +35     
  Branches    16138    16141       +3     
==========================================
- Hits       101960   101959       -1     
- Misses      35861    35898      +37     
+ Partials     4440     4439       -1
Flag Coverage Δ
#Debug 71.65% <0%> (-0.02%) ⬇️
#production 67.89% <0%> (-0.02%) ⬇️
#test 85.86% <ø> (-0.02%) ⬇️
Impacted Files Coverage Δ
...osoft.ML.Data/Evaluators/Metrics/RankingMetrics.cs 90.9% <ø> (ø) ⬆️
src/Microsoft.ML.SamplesUtils/ConsoleUtils.cs 0% <0%> (ø) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 25.19% <0%> (-1.59%) ⬇️
...soft.ML.TestFramework/DataPipe/TestDataPipeBase.cs 73.76% <0%> (-0.33%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 83.26% <0%> (+0.4%) ⬆️

@codecov
codecov bot commented Feb 26, 2019

Codecov Report

Merging #2729 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #2729      +/-   ##
==========================================
- Coverage   71.66%   71.65%   -0.02%     
==========================================
  Files         808      808              
  Lines      142254   142281      +27     
  Branches    16119    16122       +3     
==========================================
- Hits       101951   101950       -1     
- Misses      35866    35893      +27     
- Partials     4437     4438       +1
Flag Coverage Δ
#Debug 71.65% <0%> (-0.02%) ⬇️
#production 67.89% <0%> (-0.02%) ⬇️
#test 85.86% <ø> (ø) ⬆️
Impacted Files Coverage Δ
...osoft.ML.Data/Evaluators/Metrics/RankingMetrics.cs 90.9% <ø> (ø) ⬆️
src/Microsoft.ML.SamplesUtils/ConsoleUtils.cs 0% <0%> (ø) ⬆️
...c/Microsoft.ML.SamplesUtils/SamplesDatasetUtils.cs 25.19% <0%> (-1.59%) ⬇️
...StandardLearners/Standard/LinearModelParameters.cs 60.63% <0%> (-0.27%) ⬇️

@@ -1,8 +1,8 @@
 using Microsoft.ML.Transforms.Categorical;
 
-namespace Microsoft.ML.Samples.Dynamic
+namespace Microsoft.ML.Samples.Dynamic.Trainers.BinaryClassification
Member

I don't think it is necessary to organize namespaces at this level of granularity.

Member Author

This change was suggested by @shmoradims in an earlier version of this PR. Can we get an agreement on the namespaces?

We have trainers with the same name under different catalogs (binary, multi, regression, etc). We either have to use namespaces or class names to differentiate them. E.g:

Microsoft.ML.Samples.Dynamic.LightGbmBinaryClassification vs
Microsoft.ML.Samples.Dynamic.Trainers.BinaryClassification.LightGbm

We also have LightGbm in multiclass and ranking.

I prefer the latter option because class names are identical to the extension method APIs, and namespaces mirror the catalog path.
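
The trade-off described above can be illustrated with a small, self-contained sketch. It does not call ML.NET: the class bodies are placeholders, and the `Trainers.Ranking` namespace, the `Example` methods, and the `NamespaceDemo` wrapper are assumptions made for illustration only. The point is that two classes can share the name `LightGbm` when the namespace carries the catalog path.

```csharp
using System;

// Hypothetical stand-ins for sample classes. The bodies are placeholders
// and do not use ML.NET; only the naming scheme is being illustrated.
namespace Microsoft.ML.Samples.Dynamic.Trainers.BinaryClassification
{
    public static class LightGbm
    {
        public static string Example() => "binary classification sample";
    }
}

// Assumed namespace for the ranking sample, following the same pattern.
namespace Microsoft.ML.Samples.Dynamic.Trainers.Ranking
{
    public static class LightGbm
    {
        public static string Example() => "ranking sample";
    }
}

namespace NamespaceDemo
{
    // Namespace aliases keep call sites short while the two classes stay distinct.
    using BinarySamples = Microsoft.ML.Samples.Dynamic.Trainers.BinaryClassification;
    using RankingSamples = Microsoft.ML.Samples.Dynamic.Trainers.Ranking;

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(BinarySamples.LightGbm.Example());
            Console.WriteLine(RankingSamples.LightGbm.Example());
        }
    }
}
```

With the alternative scheme, the same distinction would have to be encoded in the class names themselves (for example `LightGbmBinaryClassification`), all living in one flat namespace.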



// NOTE:
//
// This sample is currently broken due to a bug in setting the GroupId column in LightGbm when using Options.
Contributor

This has a tendency to get lost.
Can you sign off on this one: #2742?

Member Author

@shmoradims left a comment

:shipit:

Member

@singlis left a comment

:shipit:

@najeeb-kazmi merged commit ff6d16d into dotnet:master on Feb 26, 2019
@najeeb-kazmi deleted the 2530 branch on January 30, 2020 01:18
@ghost locked as resolved and limited conversation to collaborators on Mar 24, 2022
5 participants