Skip to content

Adding a sample for LightGbm Ranking #2650

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 27 commits into from

Conversation

najeeb-kazmi
Copy link
Member

Fixes #2530
Fixes #776

Adding a sample for LightGbm ranking.

@codecov
Copy link

codecov bot commented Feb 20, 2019

Codecov Report

Merging #2650 into master will increase coverage by 0.02%.
The diff coverage is 78.08%.

@@            Coverage Diff             @@
##           master    #2650      +/-   ##
==========================================
+ Coverage   71.54%   71.56%   +0.02%     
==========================================
  Files         800      805       +5     
  Lines      141848   142039     +191     
  Branches    16119    16122       +3     
==========================================
+ Hits       101482   101652     +170     
- Misses      35916    35951      +35     
+ Partials     4450     4436      -14
Flag Coverage Δ
#Debug 71.56% <78.08%> (+0.02%) ⬆️
#production 67.85% <73.59%> (+0.02%) ⬆️
#test 85.73% <92.62%> (ø) ⬆️
Impacted Files Coverage Δ
...L.Transforms/Text/WordHashBagProducingTransform.cs 51.7% <ø> (ø) ⬆️
...osoft.ML.Transforms/Text/TokenizingByCharacters.cs 95.32% <ø> (ø) ⬆️
...rc/Microsoft.ML.Data/EntryPoints/EntryPointNode.cs 68.58% <ø> (-0.27%) ⬇️
src/Microsoft.ML.Core/Data/IHostEnvironment.cs 95.12% <ø> (ø) ⬆️
...c/Microsoft.ML.Transforms/MissingValueReplacing.cs 77.29% <ø> (ø) ⬆️
src/Microsoft.ML.ImageAnalytics/ImageResizer.cs 84.64% <ø> (ø) ⬆️
src/Microsoft.ML.Core/Data/ModelSaving.cs 93.33% <ø> (ø) ⬆️
...soft.ML.Transforms/Text/WrappedTextTransformers.cs 92.64% <ø> (ø) ⬆️
...soft.ML.Transforms/Text/WordEmbeddingsExtractor.cs 86.54% <ø> (ø) ⬆️
src/Microsoft.ML.Data/Transforms/Hashing.cs 91.88% <ø> (ø) ⬆️
... and 194 more

@shmoradims
Copy link

                                FeatureColumn = "Features",

you don't need these. LoadFeaturizedAdultDataset already copies "IsOver50K" to "Label" (default value) so you don't need to specify this.


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/LightGBMBinaryClassificationWithOptions.cs:25 in b572614. [](commit_id = b572614, deletion_comment = False)

@shmoradims
Copy link

                                }

please use a few more options. suggestions: EvalMetric, MaxBin,


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/LightGBMBinaryClassificationWithOptions.cs:30 in b572614. [](commit_id = b572614, deletion_comment = False)

@shmoradims
Copy link

        var metrics = mlContext.BinaryClassification.Evaluate(dataWithPredictions, "IsOver50K");

no need


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/LightGBMBinaryClassificationWithOptions.cs:39 in b572614. [](commit_id = b572614, deletion_comment = False)

@shmoradims
Copy link

        // Positive Recall: 0.67

please use the same format as in

// Expected output:
// results


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/LightGBMBinaryClassificationWithOptions.cs:49 in b572614. [](commit_id = b572614, deletion_comment = False)

@shmoradims
Copy link

class LightGbmBinaryClassificationWithOptions

please also add a sample for the api without options.


Refers to: docs/samples/Microsoft.ML.Samples/Dynamic/Trainers/BinaryClassification/LightGBMBinaryClassificationWithOptions.cs:6 in b572614. [](commit_id = b572614, deletion_comment = False)


// Output:
// DCG @N: 1.38, 3.11, 4.94
// NDCG @N: 7.13, 10.12, 12.62

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7.13, 10.12, 12.62 [](start = 24, length = 18)

These numbers look very low to me. i was expecting the values to be 60+.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looks low to me as well. Do we have a known best solution on this dataset?

We don't need a TryConvert method, we already have a TryParse, which can be used instead.
elbruno and others added 6 commits February 20, 2019 13:37
typo in comment: shoudl changed to should
…et#2651)

* Creation of DataViewSchemaAnnotationsExtensions and movement of public facing metadata methods.
* Internalize MetadataUtils and MetadataBuilder extensions, and revert members to public.
* Internalize MetadataBuilderExtensions.
* Internalization and some commenting for transforms assembly.
* Internalized CheckInputColumn method.
* Internalization of OneToOne and ManyToOne Column classes

* PR feedback.
TomFinley and others added 4 commits February 20, 2019 15:14
* Internalize ModelLoadContext
* Internalize VersionInfo
* Internalize members of ModelSaveContext while keeping the class itself public
* Internalize Repository and subclasses
…alization. (dotnet#2667)

* Microsoft.ML.Internal.Internallearn namespace requires certain internalization.

* PR feedback.

* PR feedback.
* 1st pass. builds work locally

* fixes inside Ensemble, FastTree, StandardLearners, Sweeper

* fixes to Data and Sweeper assemblies
* Reorder MatrixFactorizationTrainer parameters

* nit

* PR feedback
yaeldMS and others added 5 commits February 21, 2019 10:03
* Convert VectorToImageTransform to estimator/transformer

* Rename Schema, Row, etc.

* Code review comments, and fix baseline of EntryPointCatalog test

* Code review comments

* Fix EntryPointCatalog baseline
* Increase build timeout for code coverage CI.

* PR feedback.
* Adding functional tests for explainability
* Stop using System.ComponentModel.Composition

Replace our MEF usage, which is only used by custom mapping transforms, with the ComponentCatalog class.

Fix dotnet#1595
Fix dotnet#2422

* Rename new class to CustomMappingFactory.
wschin and others added 5 commits February 22, 2019 11:22
* Use AllowSparse=false as default in TextLoader

* Update entry point catelog

* Make quote- default

* TextLoader uses TextLoader's default settings

* Address comments

* tab to \t

* Revert a weird change

* Address comments

* Reorder arguments

* Polish cookbook

* Reorder arguments in static TextLoader

* Also change argument order in F#
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.