Commit 24d939a

Merge pull request #3 from dotnet/master
Update to origin
2 parents 1421743 + 2107b82 commit 24d939a

674 files changed, +122293 -36067 lines changed

.vsts-dotnet-ci.yml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+phases:
+- template: /build/ci/phase-template.yml
+  parameters:
+    name: Linux
+    buildScript: ./build.sh
+    dockerImage: microsoft/dotnet-buildtools-prereqs:centos-7-b46d863-20180719033416
+
+- template: /build/ci/phase-template.yml
+  parameters:
+    name: Windows_NT
+    buildScript: build.cmd
+    queue:
+      name: Hosted VS2017
+      demands:
+      - agent.os -equals Windows_NT

Directory.Build.props

Lines changed: 2 additions & 0 deletions
@@ -14,13 +14,15 @@
   <PropertyGroup>
     <RestoreSources>
       https://api.nuget.org/v3/index.json;
+      https://dotnet.myget.org/F/dotnet-core/api/v3/index.json;
     </RestoreSources>
   </PropertyGroup>

   <!-- Common repo directories -->
   <PropertyGroup>
     <RepoRoot>$(MSBuildThisFileDirectory)</RepoRoot>
     <SourceDir>$(RepoRoot)src/</SourceDir>
+    <PkgDir>$(RepoRoot)pkg/</PkgDir>

     <!-- Output directories -->
     <BinDir Condition="'$(BinDir)'==''">$(RepoRoot)bin/</BinDir>

Directory.Build.targets

Lines changed: 28 additions & 0 deletions
@@ -5,5 +5,33 @@
            Text="The tools directory [$(ToolsDir)] does not exist. Please run build in the root of the repo to ensure the tools are installed before attempting to build an individual project." />
   </Target>

+  <Target Name="CopyNativeAssemblies"
+          BeforeTargets="PrepareForRun">
+
+    <PropertyGroup>
+      <LibPrefix Condition="'$(OS)' != 'Windows_NT'">lib</LibPrefix>
+      <LibExtension Condition="'$(OS)' == 'Windows_NT'">.dll</LibExtension>
+      <LibExtension Condition="'$(OS)' != 'Windows_NT'">.so</LibExtension>
+      <LibExtension Condition="$([MSBuild]::IsOSPlatform('osx'))">.dylib</LibExtension>
+    </PropertyGroup>
+
+    <ItemGroup>
+      <NativeAssemblyReference>
+        <FullAssemblyPath>$(NativeOutputPath)$(LibPrefix)%(NativeAssemblyReference.Identity)$(LibExtension)</FullAssemblyPath>
+      </NativeAssemblyReference>
+    </ItemGroup>
+
+    <Copy SourceFiles = "@(NativeAssemblyReference->'%(FullAssemblyPath)')"
+          DestinationFolder="$(OutputPath)"
+          OverwriteReadOnlyFiles="$(OverwriteReadOnlyFiles)"
+          Retries="$(CopyRetryCount)"
+          RetryDelayMilliseconds="$(CopyRetryDelayMilliseconds)"
+          UseHardlinksIfPossible="$(CreateHardLinksForPublishFilesIfPossible)"
+          UseSymboliclinksIfPossible="$(CreateSymbolicLinksForPublishFilesIfPossible)">
+      <Output TaskParameter="DestinationFiles" ItemName="FileWrites"/>
+    </Copy>
+
+  </Target>
+
   <Import Project="$(ToolsDir)/versioning.targets" Condition="Exists('$(ToolsDir)/versioning.targets')" />
 </Project>

Microsoft.ML.sln

Lines changed: 172 additions & 48 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 6 additions & 3 deletions
@@ -18,7 +18,7 @@ Along with these ML capabilities this first release of ML.NET also brings the fi

 ML.NET runs on Windows, Linux, and macOS - any platform where 64 bit [.NET Core](https://github.com/dotnet/core) or later is available.

-The current release is 0.2. Check out the [release notes](docs/release-notes/0.2/release-0.2.md).
+The current release is 0.3. Check out the [release notes](docs/release-notes/0.3/release-0.3.md).

 First ensure you have installed [.NET Core 2.0](https://www.microsoft.com/net/learn/get-started) or later. ML.NET also works on the .NET Framework. Note that ML.NET currently must run in a 64 bit process.

@@ -44,9 +44,9 @@ To build ML.NET from source please visit our [developers guide](docs/project-doc

 |    | x64 Debug | x64 Release |
 |:---|----------------:|------------------:|
-|**Linux**|[![x64-debug](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/linux_debug/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/linux_debug/lastCompletedBuild)|[![x64-release](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/linux_release/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/linux_release/lastCompletedBuild)|
+|**Linux**|[![x64-debug](https://dotnet.visualstudio.com/public/_apis/build/status/104?branch=master)](https://dotnet.visualstudio.com/DotNet-Public/_build/latest?definitionId=104&branch=master)|[![x64-release](https://dotnet.visualstudio.com/public/_apis/build/status/104?branch=master)](https://dotnet.visualstudio.com/DotNet-Public/_build/latest?definitionId=104&branch=master)|
 |**macOS**|[![x64-debug](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/osx10.13_debug/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/osx10.13_debug/lastCompletedBuild)|[![x64-release](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/osx10.13_release/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/osx10.13_release/lastCompletedBuild)|
-|**Windows**|[![x64-debug](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/windows_nt_debug/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/windows_nt_debug/lastCompletedBuild)|[![x64-release](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/windows_nt_release/badge/icon)](https://ci2.dot.net/job/dotnet_machinelearning/job/master/job/windows_nt_release/lastCompletedBuild)|
+|**Windows**|[![x64-debug](https://dotnet.visualstudio.com/public/_apis/build/status/104?branch=master)](https://dotnet.visualstudio.com/DotNet-Public/_build/latest?definitionId=104&branch=master)|[![x64-release](https://dotnet.visualstudio.com/public/_apis/build/status/104?branch=master)](https://dotnet.visualstudio.com/DotNet-Public/_build/latest?definitionId=104&branch=master)|

 ## Contributing

@@ -84,6 +84,9 @@ SentimentPrediction prediction = model.Predict(data);

 Console.WriteLine("prediction: " + prediction.Sentiment);
 ```
+## Samples
+
+We have a [repo of samples](https://github.com/dotnet/machinelearning-samples) that you can look at.

 ## License


build/BranchInfo.props

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 <Project>
   <PropertyGroup>
     <MajorVersion>0</MajorVersion>
-    <MinorVersion>3</MinorVersion>
+    <MinorVersion>4</MinorVersion>
     <PatchVersion>0</PatchVersion>
     <PreReleaseLabel>preview</PreReleaseLabel>
   </PropertyGroup>

build/Dependencies.props

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
     <SystemThreadingTasksDataflowPackageVersion>4.8.0</SystemThreadingTasksDataflowPackageVersion>
     <SystemCodeDomPackageVersion>4.4.0</SystemCodeDomPackageVersion>
     <SystemReflectionEmitLightweightPackageVersion>4.3.0</SystemReflectionEmitLightweightPackageVersion>
-    <SystemValueTupleVersion>4.4.0</SystemValueTupleVersion>
     <PublishSymbolsPackageVersion>1.0.0-beta-62824-02</PublishSymbolsPackageVersion>
+    <LightGBMPackageVersion>2.1.2.2</LightGBMPackageVersion>
   </PropertyGroup>
 </Project>

build/ci/phase-template.yml

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+parameters:
+  name: ''
+  buildScript: ''
+  dockerImage: ''
+  queue: {}
+
+phases:
+  - phase: ${{ parameters.name }}
+    variables:
+      _buildScript: ${{ parameters.buildScript }}
+      _phaseName: ${{ parameters.name }}
+      # if dockerImage is not equal to '' then run under docker container
+      ${{ if ne(parameters.dockerImage, '') }}:
+        _PREVIEW_VSTS_DOCKER_IMAGE: ${{ parameters.dockerImage }}
+    queue:
+      parallel: 2
+      matrix:
+        Build_Debug:
+          _configuration: Debug
+        Build_Release:
+          _configuration: Release
+      ${{ insert }}: ${{ parameters.queue }}
+    steps:
+    - script: $(_buildScript) -$(_configuration) -runtests
+      displayName: Build and Test
+    - task: PublishTestResults@2
+      displayName: Publish Test Results
+      condition: succeededOrFailed()
+      inputs:
+        testRunner: 'vSTest'
+        searchFolder: '$(System.DefaultWorkingDirectory)/bin'
+        testResultsFiles: '**/*.trx'
+        testRunTitle: Machinelearning_Tests_$(_phaseName)_$(_configuration)_$(Build.BuildNumber)
+        configuration: $(_configuration)
+        mergeTestResults: true
+    - script: $(_buildScript) -buildPackages
+      displayName: Build Packages

docs/building/unix-instructions.md

Lines changed: 3 additions & 1 deletion
@@ -42,9 +42,11 @@ macOS 10.12 or higher is needed to build dotnet/machinelearning.

 On macOS a few components are needed which are not provided by a default developer setup:
 * cmake 3.10.3
+* gcc
 * All the requirements necessary to run .NET Core 2.0 applications. To view macOS prerequisites click [here](https://docs.microsoft.com/en-us/dotnet/core/macos-prerequisites?tabs=netcore2x).

-One way of obtaining CMake is via [Homebrew](http://brew.sh):
+One way of obtaining CMake and gcc is via [Homebrew](http://brew.sh):
 ```sh
 $ brew install cmake
+$ brew install gcc
 ```

docs/code/EntryPoints.md

Lines changed: 231 additions & 0 deletions
@@ -0,0 +1,231 @@
+# Entry Points And Helper Classes
+
+## Overview
+
+Entry points are a way to interface with ML.NET components, by specifying an execution graph of connected inputs and outputs of those components.
+Both the manifest describing available components and their inputs/outputs, and an "experiment" graph description, are expressed in JSON.
+The recommended way of interacting with ML.NET from other, non-.NET programming languages is by composing and exchanging pipelines or experiment graphs.
+
+Throughout the documentation, we also refer to entry points as 'entry point nodes', because they are the nodes of the graph representing the experiment.
+The graph 'variables', the various values of the experiment graph JSON properties, serve to describe the relationship between the entry point nodes.
+The 'variables' are therefore the edges of the DAG (Directed Acyclic Graph).
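
To illustrate how variables act as edges, here is a minimal sketch that walks a graph description and links each `$variable` consumer to the node that produces it. It is not ML.NET code: the top-level `Nodes` key, the `Example.Consumer` node and the variable names are assumptions made up for the illustration.

```javascript
// Hypothetical two-node experiment graph. The "Nodes" key, the
// "Example.Consumer" node and all variable names are illustrative assumptions.
const graph = {
  Nodes: [
    {
      Name: "Transforms.ColumnTypeConverter",
      Inputs: { Data: "$data0", ResultType: "R4" },
      Outputs: { OutputData: "$Convert_Output", Model: "$Convert_TransformModel" },
    },
    {
      Name: "Example.Consumer",
      Inputs: { Data: "$Convert_Output" },
      Outputs: { OutputData: "$final" },
    },
  ],
};

// Recursively collect variable names ("$name" or "$name[index]") from a value.
function collectVariables(value, found = []) {
  if (typeof value === "string") {
    if (value.startsWith("$")) {
      // Drop the leading "$" and any trailing "[index]" part.
      found.push(value.slice(1).replace(/\[.*\]$/, ""));
    }
  } else if (Array.isArray(value)) {
    value.forEach((item) => collectVariables(item, found));
  } else if (value !== null && typeof value === "object") {
    Object.values(value).forEach((item) => collectVariables(item, found));
  }
  return found;
}

// An edge exists wherever one node's input names a variable that another
// node lists among its outputs.
function findEdges(graph) {
  const producers = new Map();
  graph.Nodes.forEach((node, index) => {
    for (const name of collectVariables(node.Outputs)) producers.set(name, index);
  });
  const edges = [];
  graph.Nodes.forEach((node, index) => {
    for (const name of collectVariables(node.Inputs)) {
      if (producers.has(name)) {
        edges.push({ from: producers.get(name), to: index, variable: name });
      }
    }
  });
  return edges;
}

const edges = findEdges(graph);
console.log(edges);
```

Variables with no producing node, such as `$data0` here, are treated as values supplied from outside the graph and yield no edge.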
+
+All ML.NET entry points are described by their manifest. The manifest is another JSON object that documents and describes the structure of an entry point.
+Manifests are referenced to understand what an entry point does, and how it should be constructed, in a graph.
+
+This document briefly describes the structure of the entry points, the structure of an entry point manifest, and mentions the ML.NET classes that help construct an entry point graph.
+
+## EntryPoint manifest - the definition of an entry point
+
+The components manifest is built by scanning the ML.NET assemblies through reflection and searching for types having the `SignatureEntryPointModule` signature in their `LoadableClass` assembly attribute definition.
+An example of an entry point manifest object, specifically for the `ColumnTypeConverter` transform, is:
+
+```javascript
+{
+  "Name": "Transforms.ColumnTypeConverter",
+  "Desc": "Converts a column to a different type, using standard conversions.",
+  "FriendlyName": "Convert Transform",
+  "ShortName": "Convert",
+  "Inputs": [
+    { "Name": "Column",
+      "Type": {
+        "Kind": "Array",
+        "ItemType": {
+          "Kind": "Struct",
+          "Fields": [
+            {
+              "Name": "ResultType",
+              "Type": {
+                "Kind": "Enum",
+                "Values": [ "I1","I2","U2","I4","U4","I8","U8","R4","Num","R8","TX","Text","TXT","BL","Bool","TimeSpan","TS","DT","DateTime","DZ","DateTimeZone","UG","U16" ]
+              },
+              "Desc": "The result type",
+              "Aliases": [ "type" ],
+              "Required": false,
+              "SortOrder": 150,
+              "IsNullable": true,
+              "Default": null
+            },
+            { "Name": "Range",
+              "Type": "String",
+              "Desc": "For a key column, this defines the range of values",
+              "Aliases": [ "key" ],
+              "Required": false,
+              "SortOrder": 150,
+              "IsNullable": false,
+              "Default": null
+            },
+            { "Name": "Name",
+              "Type": "String",
+              "Desc": "Name of the new column",
+              "Aliases": [ "name" ],
+              "Required": false,
+              "SortOrder": 150,
+              "IsNullable": false,
+              "Default": null
+            },
+            { "Name": "Source",
+              "Type": "String",
+              "Desc": "Name of the source column",
+              "Aliases": [ "src" ],
+              "Required": false,
+              "SortOrder": 150,
+              "IsNullable": false,
+              "Default": null
+            }
+          ]
+        }
+      },
+      "Desc": "New column definition(s) (optional form: name:type:src)",
+      "Aliases": [ "col" ],
+      "Required": true,
+      "SortOrder": 1,
+      "IsNullable": false
+    },
+    { "Name": "Data",
+      "Type": "DataView",
+      "Desc": "Input dataset",
+      "Required": true,
+      "SortOrder": 2,
+      "IsNullable": false
+    },
+    { "Name": "ResultType",
+      "Type": {
+        "Kind": "Enum",
+        "Values": [ "I1","I2","U2","I4","U4","I8","U8","R4","Num","R8","TX","Text","TXT","BL","Bool","TimeSpan","TS","DT","DateTime","DZ","DateTimeZone","UG","U16" ]
+      },
+      "Desc": "The result type",
+      "Aliases": [ "type" ],
+      "Required": false,
+      "SortOrder": 2,
+      "IsNullable": true,
+      "Default": null
+    },
+    { "Name": "Range",
+      "Type": "String",
+      "Desc": "For a key column, this defines the range of values",
+      "Aliases": [ "key" ],
+      "Required": false,
+      "SortOrder": 150,
+      "IsNullable": false,
+      "Default": null
+    }
+  ],
+  "Outputs": [
+    {
+      "Name": "OutputData",
+      "Type": "DataView",
+      "Desc": "Transformed dataset"
+    },
+    {
+      "Name": "Model",
+      "Type": "TransformModel",
+      "Desc": "Transform model"
+    }
+  ],
+  "InputKind": ["ITransformInput" ],
+  "OutputKind": [ "ITransformOutput" ]
+}
+```
+
+The respective entry point, constructed based on this manifest, would be:
+
+```javascript
+{
+  "Name": "Transforms.ColumnTypeConverter",
+  "Inputs": {
+    "Column": [{
+      "Name": "Features",
+      "Source": "Features"
+    }],
+    "Data": "$data0",
+    "ResultType": "R4"
+  },
+  "Outputs": {
+    "OutputData": "$Convert_Output",
+    "Model": "$Convert_TransformModel"
+  }
+}
+```
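
The relationship between the manifest and the node can be sketched as a small consistency check. This is only an illustration under stated assumptions, not the validation ML.NET performs: `checkNode` is a hypothetical helper, and the manifest is trimmed to the `Name` and `Required` fields taken from the example above.

```javascript
// Trimmed copy of the manifest above: only the fields this check needs.
const manifest = {
  Name: "Transforms.ColumnTypeConverter",
  Inputs: [
    { Name: "Column", Required: true },
    { Name: "Data", Required: true },
    { Name: "ResultType", Required: false },
    { Name: "Range", Required: false },
  ],
  Outputs: [{ Name: "OutputData" }, { Name: "Model" }],
};

// The entry point node from the example above.
const node = {
  Name: "Transforms.ColumnTypeConverter",
  Inputs: {
    Column: [{ Name: "Features", Source: "Features" }],
    Data: "$data0",
    ResultType: "R4",
  },
  Outputs: {
    OutputData: "$Convert_Output",
    Model: "$Convert_TransformModel",
  },
};

// Hypothetical helper: returns a list of human-readable problems, empty if
// the node is consistent with the manifest.
function checkNode(manifest, node) {
  const problems = [];
  if (node.Name !== manifest.Name) {
    problems.push(`node name ${node.Name} does not match manifest ${manifest.Name}`);
  }
  const knownInputs = new Set(manifest.Inputs.map((input) => input.Name));
  const knownOutputs = new Set(manifest.Outputs.map((output) => output.Name));
  for (const input of manifest.Inputs) {
    if (input.Required && !(input.Name in node.Inputs)) {
      problems.push(`missing required input: ${input.Name}`);
    }
  }
  for (const name of Object.keys(node.Inputs)) {
    if (!knownInputs.has(name)) problems.push(`unknown input: ${name}`);
  }
  for (const name of Object.keys(node.Outputs)) {
    if (!knownOutputs.has(name)) problems.push(`unknown output: ${name}`);
  }
  return problems;
}

console.log(checkNode(manifest, node));
```

Optional inputs such as `Range` may be omitted from the node, which is why only inputs marked `Required` are reported as missing.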
+
+## `EntryPointGraph`
+
+This class encapsulates the list of nodes (`EntryPointNode`) and edges
+(`EntryPointVariable` inside a `RunContext`) of the graph.
+
+## `EntryPointNode`
+
+This class represents a node in the graph, and wraps an entry point call. It
+has methods for creating and running entry points. It also has a reference to
+the `RunContext` to allow it to get and set values from `EntryPointVariable`s.
+
+To express the inputs that are set through variables, a set of dictionaries
+are used. The `InputBindingMap` maps an input parameter name to a list of
+`ParameterBinding`s. The `InputMap` maps a `ParameterBinding` to a
+`VariableBinding`. For example, if the JSON looks like this:
+
+```javascript
+'foo': '$bar'
+```
+
+the `InputBindingMap` will have one entry that maps the string "foo" to a list
+that has only one element, a `SimpleParameterBinding` with the name "foo" and
+the `InputMap` will map the `SimpleParameterBinding` to a
+`SimpleVariableBinding` with the name "bar". For a more complicated example,
+let's say we have this JSON:
+
+```javascript
+'foo': [ '$bar[3]', '$baz']
+```
+
+the `InputBindingMap` will have one entry that maps the string "foo" to a list
+that has two elements, an `ArrayIndexParameterBinding` with the name "foo" and
+index 0 and another one with index 1. The `InputMap` will map the first
+`ArrayIndexParameterBinding` to an `ArrayIndexVariableBinding` with name "bar"
+and index 3 and the second `ArrayIndexParameterBinding` to a
+`SimpleVariableBinding` with the name "baz".
+
+For outputs, a node assumes that an output is mapped to a variable, so the
+`OutputMap` is a simple dictionary from string to string.
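
The two maps described above can be mimicked with plain objects. The sketch below is not the ML.NET implementation: `parseVariable` and `bindInput` are hypothetical helpers, binding classes are represented by a `kind` string, the accepted variable-name syntax is an assumption, and dictionary-key bindings are left out.

```javascript
// Parse "$name" or "$name[index]" into a variable binding description.
// The identifier pattern is an assumption made for this sketch.
function parseVariable(text) {
  const match = /^\$([A-Za-z_]\w*)(?:\[(\d+)\])?$/.exec(text);
  if (!match) throw new Error(`not a variable reference: ${text}`);
  return match[2] === undefined
    ? { kind: "SimpleVariableBinding", variable: match[1] }
    : { kind: "ArrayIndexVariableBinding", variable: match[1], index: Number(match[2]) };
}

// Build both maps for one input: parameter name -> list of parameter
// bindings, and parameter binding -> variable binding.
function bindInput(name, value) {
  const inputBindingMap = new Map();
  const inputMap = new Map();
  if (Array.isArray(value)) {
    const bindings = value.map((item, index) => {
      const parameter = { kind: "ArrayIndexParameterBinding", parameter: name, index };
      inputMap.set(parameter, parseVariable(item));
      return parameter;
    });
    inputBindingMap.set(name, bindings);
  } else {
    const parameter = { kind: "SimpleParameterBinding", parameter: name };
    inputMap.set(parameter, parseVariable(value));
    inputBindingMap.set(name, [parameter]);
  }
  return { inputBindingMap, inputMap };
}

// The second example from the text: 'foo': [ '$bar[3]', '$baz' ]
const { inputBindingMap, inputMap } = bindInput("foo", ["$bar[3]", "$baz"]);
for (const parameter of inputBindingMap.get("foo")) {
  console.log(parameter, "->", inputMap.get(parameter));
}
```

Keying `inputMap` by the parameter-binding object itself mirrors the description, where each `ParameterBinding` maps to exactly one `VariableBinding`.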
+
+## `EntryPointVariable`
+
+This class represents an edge in the entry point graph. It has a name, a type
+and a value. Variables can be simple, arrays and/or dictionaries. Currently,
+only data views, file handles, predictor models and transform models are
+allowed as element types for a variable.
+
+## `RunContext`
+
+This class is just a container for all the variables in a graph.
+
+## `VariableBinding` and Derived Classes
+
+The abstract base class represents a "pointer to a (part of a) variable". It
+is used in conjunction with `ParameterBinding`s to specify inputs to an entry
+point node. The `SimpleVariableBinding` is a pointer to an entire variable,
+the `ArrayIndexVariableBinding` is a pointer to a specific index in an array
+variable, and the `DictionaryKeyVariableBinding` is a pointer to a specific
+key in a dictionary variable.
+
+## `ParameterBinding` and Derived Classes
+
+The abstract base class represents a "pointer to a (part of a) parameter". It
+parallels the `VariableBinding` hierarchy and it is used to specify the inputs
+to an entry point node. The `SimpleParameterBinding` is a pointer to a
+non-array, non-dictionary parameter, the `ArrayIndexParameterBinding` is a
+pointer to a specific index of an array parameter and the
+`DictionaryKeyParameterBinding` is a pointer to a specific key of a dictionary
+parameter.
+
+## How to create an entry point for an existing ML.NET component
+
+The steps to create an entry point for an existing ML.NET component are:
+1. Add the `SignatureEntryPointModule` signature to the `LoadableClass` assembly attribute.
+2. Create a public static method that:
+    a. Takes as input, among others, an object representing the arguments of the component you want to expose.
+    b. Initializes and runs the component, returning one of the nested classes of `Microsoft.ML.Runtime.EntryPoints.CommonOutputs`.
+    c. Is annotated with the `TlcModule.EntryPoint` attribute.
+
+Based on the type of entry point being created, there are further conventions on the name of the method; for example, the Trainers entry points are typically called 'TrainMultiClass', 'TrainBinary', etc., based on the task.
+Look at [OnlineGradientDescent](../../src/Microsoft.ML.StandardLearners/Standard/Online/OnlineGradientDescent.cs) for an example of a component and its entry point.
