Does ML.NET support Chinese?

For example, the Chinese string `长春市长春药店` can be split into terms in several ways.

Bigram algorithm: it's simple and fast.
```
长春
春市
市长
长春
春药
药店
```
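To make the bigram idea concrete, here is a minimal Python sketch (not ML.NET code) that slides a two-character window over the string. The function name `char_bigrams` is made up for illustration.

```python
def char_bigrams(text: str) -> list[str]:
    """Return every pair of adjacent characters in `text`, in order."""
    # A string of n characters yields n - 1 overlapping bigrams.
    return [text[i:i + 2] for i in range(len(text) - 1)]


if __name__ == "__main__":
    for gram in char_bigrams("长春市长春药店"):
        print(gram)
```

Run on `长春市长春药店`, this should give the same six terms listed above, including the repeated `长春`.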

Standard algorithm:
```
长春市
长春
药店
```
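I'm not sure which algorithm a "standard" segmenter actually uses. One common dictionary-based approach is forward maximum matching, sketched below in Python as an assumption rather than as what any particular library does. The function name `forward_max_match` and the three-word toy dictionary are hypothetical and chosen only for this example.

```python
def forward_max_match(text: str, dictionary: set[str]) -> list[str]:
    """Greedy left-to-right segmentation: take the longest dictionary word at each position."""
    max_len = max(map(len, dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing in the dictionary matches.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    toy_dictionary = {"长春市", "长春", "药店"}  # hypothetical word list
    print(forward_max_match("长春市长春药店", toy_dictionary))
```

With this toy dictionary the result matches the three terms above. A real segmenter would need a much larger word list and some way to resolve ambiguous splits such as `市长`.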

I noticed that ML.NET includes an `NGramNgramExtractor` class that supports the N-gram algorithm. Does it support Chinese? `Transforms.TextTransformLanguage` includes only `English`, `French`, `German`, `Dutch`, `Italian`, `Spanish`, and `Japanese`.

If not, how can I implement custom text segmentation for other languages? I hope a future version will support custom text extraction.

Thanks.


Does ML.NET support Chinese? #325
