Skip to content

Port SymSGD #556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 104 commits into from
Closed

Port SymSGD #556

wants to merge 104 commits into from

Conversation

codemzs
Copy link
Member

@codemzs codemzs commented Jul 19, 2018

This change adds parallel SGD trainer.
fixes #623

#include "../Stdafx.h"
#include "mkl.h"
#ifndef COMPILER_GCC
#pragma comment(lib, "../../../Libraries/MKL/Win/Microsoft.ML.MklImports.lib")
Copy link
Contributor

@TomFinley TomFinley Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Libraries/MKL/Win/Microsoft.ML.MklImports.lib [](start = 31, length = 45)

Hi @codemzs thanks for looking at this. We need to either shift this thing to not use MKL, or port the libraries. However, considering that this involves the static linking of the library, and all other usages are using it as a dynamic library, this becomes especially awkward here. #Resolved

Copy link
Contributor

@TomFinley TomFinley Jul 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if temporarily (or not so temporarily) we can shift to using regular ops rather than calling MKL, based on some compile time flag (as we do elsewhere). If on, then it will compile with MKL support. If off, it does it manually. Since these are ops like SAXPYI and whatnot, it's not like they would be incredibly hard to support. #Resolved

Copy link
Member Author

@codemzs codemzs Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @TomFinley , There is a PR out that creates the MKL nuget. I have tested the native symSGD code by linking against the MKL binaries assuming they are present in the packages directory. #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Once that PR is in, assuming it works, you would add the nuget dependency in which project?


In reply to: 205300173 [](ancestors = 205300173)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Microsoft.ML.Tests and Microsoft.ML.Predictor.Test. Anything that uses sym sgd would need the reference to the nuget so that binaries are copied over to the project folder in bin directory.


In reply to: 205467167 [](ancestors = 205467167,205300173)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have MKL nuget now.


In reply to: 203788370 [](ancestors = 203788370)

@codemzs codemzs requested a review from eerhardt July 26, 2018 00:27
link_directories(${CMAKE_SOURCE_DIR}/../../packages/MlNetMklDeps/runtimes/win-x64/native)
else()
list(APPEND SOURCES ${VERSION_FILE_PATH})
if(APPLE)
Copy link
Contributor

@TomFinley TomFinley Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 0, length = 1)

Consistent indentation please. #Resolved

class SymSGD {
private:
int _numFreqFeat;
// Local models that is learned
Copy link
Contributor

@TomFinley TomFinley Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 0, length = 1)

Please shift these to use spaces for indentation. #Resolved

@@ -27,6 +27,7 @@
<NativeAssemblyReference Include="FastTreeNative" />
<NativeAssemblyReference Include="CpuMathNative" />
<NativeAssemblyReference Include="FactorizationMachineNative" />
<NativeAssemblyReference Include="SymSgdNative" />
Copy link
Contributor

@TomFinley TomFinley Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 0, length = 1)

Yet another place where I see tabs... #Resolved

@Ivanidzo4ka
Copy link
Contributor

You probably want new version of SymSGD with respectful changes in ITrainer. Which can be found in PR towards internal repo

@codemzs
Copy link
Member Author

codemzs commented Jul 26, 2018

@Ivanidzo4ka This PR consumes the latest ITrainer. The code would have not compiled and tests would have not run otherwise, right? Is there anything specific that you are referring to?


In reply to: 408173274 [](ancestors = 408173274)

@Ivanidzo4ka
Copy link
Contributor

Ivanidzo4ka commented Jul 26, 2018

From what I discover yesterday, you can't just take this code, and update internal repository. SymSgd internally is in separate package, and visibility of LinearTrainerBase don't allow it to be referenced from other projects.
Also from discussion with @TomFinley SymSGD in general shouldn't be based on LinearTrainerBase, but just TrainerBase.


In reply to: 408181017 [](ancestors = 408181017,408173274)

@@ -90,6 +90,8 @@
RelativePath="Microsoft.ML\runtimes\$(PackageRid)\native" />
<NativePackageAsset Include="$(NativeAssetsBuiltPath)\$(NativeLibPrefix)FactorizationMachineNative$(NativeLibExtension)"
RelativePath="Microsoft.ML\runtimes\$(PackageRid)\native" />
<NativePackageAsset Include="$(NativeAssetsBuiltPath)\$(NativeLibPrefix)SymSgdNative$(NativeLibExtension)"
RelativePath="Microsoft.ML\runtimes\$(PackageRid)\native" />
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  				   [](start = 86, length = 8)

something, something, tabs #Resolved

// See the LICENSE file in the project root for more information.

#pragma once
#define MAX(__X__, __Y__) (((__X__) < (__Y__)) ? (__Y__) : (__X__))
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX never get used, do we need it? #Resolved

@@ -0,0 +1,127 @@
#pragma once
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

header? #Resolved

@codemzs
Copy link
Member Author

codemzs commented Jul 26, 2018

I don't follow, can you please comment on SymSgdClassificationTrainer.cs and clarify?


In reply to: 408182382 [](ancestors = 408182382,408181017,408173274)

@codemzs codemzs changed the title WIP Port SymSGD Port SymSGD Jul 31, 2018
@@ -7,6 +7,7 @@
</PropertyGroup>

<ItemGroup>
<PackageReference Include="MlNetMklDeps" Version="0.0.0.1" />
Copy link
Member

@sfilipi sfilipi Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](start = 2, length = 63)

note about having to split SymSGD this as a separate project, rather than adding the dependency here, based on @eerhardt feedback on PR #594

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. We shouldn't be referencing MlNetMklDeps from the Microsoft.ML nuget package.

RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
{
return;
}
using (var env = new TlcEnvironment())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe log an issue about re-enabling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we disabling existing tests when we are adding new functionality? That seems wrong.

RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
{
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, might be worth logging an issue.


/// <summary>
/// Train a symbolic SGD.
/// </summary>
Copy link
Member

@sfilipi sfilipi Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how much documentation we have about SymSGD. If any, might be nice to include it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

)

if(WIN32)
find_library(MKL_LIBRARY Microsoft.ML.MklImports HINTS ${CMAKE_SOURCE_DIR}/../../packages/mlnetmkldeps/0.0.0.1/runtimes/win-x64/native)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like we discussed offline, we shouldn't be hard-coding these paths and version numbers into our CMake scripts.

find_library(MKL_LIBRARY Microsoft.ML.MklImports.dylib HINTS "${CMAKE_SOURCE_DIR}/../../packages/mlnetmkldeps/0.0.0.1/runtimes/osx-x64/native")
else()
message("Linking SymSgdNative with MKL on linux.")
link_directories(${CMAKE_SOURCE_DIR}/../../packages/mlnetmkldeps/0.0.0.1/runtimes/linux-x64/native)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need both this and the HINTS below? Isn't one of them enough?

@@ -106,6 +106,15 @@ if exist "%__IntermediatesDir%\INSTALL.vcxproj" goto BuildNativeProj
goto :Failure

:BuildNativeProj
echo Copying MKL library in bin folder. This is a temporary fix.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -93,8 +93,37 @@ __cmake_defines="${__cmake_defines} -DVERSION_FILE_PATH:STRING=${__versionSource

cd "$__IntermediatesDir"

#codemzs: temporary fix until mkl nuget binaries are properly renamed so that they can be consumed by CMAKE.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also remove this. And the one below.

@@ -21,5 +21,10 @@
<NativeAssemblyReference Include="CpuMathNative" />
<NativeAssemblyReference Include="FastTreeNative" />
<NativeAssemblyReference Include="LdaNative" />
<NativeAssemblyReference Include="SymSgdNative" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) whitespace is off.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

</ItemGroup>

<ItemGroup>
<PackageReference Include="MlNetMklDeps" Version="0.0.0.1" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.0.0.1 - this version number should go in 1 place - in the https://github.com/dotnet/machinelearning/blob/master/build/Dependencies.props file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

eerhardt and others added 4 commits August 1, 2018 04:02
* Allow CpuMath to reference C# Hardware Intrinsics APIs.

Need to multi-target CpuMath for netstandard and netcoreapp3.0.  Also, since we are going to move CpuMath into its own NuGet package, remove the dependency from CpuMath to the ML.Core project.

Add a build parameter to enable building against .NET Core 3.0's Runtime Intrinsics APIs.

Fix dotnet#534

* Respond to PR feedback.
* failing test case for multiclass

* Refactored PipelineSweeperSupportedMetrics Class; added unit test for MultiClassClassification; refactored out unit tests for the PipelineSweeper

* take care of review comments; display transforms/learners + metrics in pipeline

* taking care of PR comments + refactor PipelineSweeperRunSummary

* taking care of review comments
* remove domain from onnx operators for non-ML types.

* Make ONNX compatible with Windows RS5 and add more tests.

* PR feedback.

* PR feedback.

* fix build.
* Remove Windows and Linux configurations from netci.groovy

* Add end of line to yml files

* Add badges and change leg name to Linux

* Not merge test results

* Add searchFolder to publish test results task
codemzs and others added 23 commits August 1, 2018 04:02
add proper fields to trainer info for FT, FM, and OnlineLearner regarding incremental training and validation datasets
* Fix warning for L2 in SDCA

* String interpolation

* Word-smithing the warning message
Introduce word embedding transform
Add images support based on System.Drawing
…#611)

* moving Ols to a separate project

* Revert "moving Ols to a separate project"

This reverts commit 9b7eab3.

* separating OLS in its own project

* adding nupkgproj files to create a package for AdditionalLearners
adding AdditionalLearners to the core.tests project
fixing the core_ep and core_manifest

* CSharpApi should not get generated every time.

* Addressing Ivan's comments

* referencing package version 0.0.0.4 of MlNetMklDeps that contains new names for the mkl library.

* Correcting the error message.

* renaming AdditionalLearners to HalLearners < - Hardware Accelerated Learners
removign unsafe from the Hal csproj.
removing the orphaned member section from the doc.xml

* referencing package 0.0.0.5 and updating the name.

* regenerating the CsharpApi and the eplist post merge

* regenerate the CSharpApi and the eplists post merge. Fix the namespace post merge.

* typo

* one shall not space

* spacing
…to symsgd

# Conflicts:
#	src/Microsoft.ML.HalLearners/Microsoft.ML.HalLearners.csproj
#	src/Microsoft.ML.HalLearners/doc.xml
#	src/Microsoft.ML/CSharpApi.cs
#	test/BaselineOutput/Common/EntryPoints/core_manifest.json
@codemzs codemzs mentioned this pull request Aug 1, 2018
@eerhardt
Copy link
Member

eerhardt commented Aug 1, 2018

This can be closed. We merged the other PR

@eerhardt eerhardt closed this Aug 1, 2018
@codemzs codemzs deleted the symsgd branch August 3, 2018 06:03
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port SymSGD