Does ML.NET support Chinese? #325

Closed
@zhengchun

For example, the Chinese string 长春市长春药店 (roughly "Changchun Pharmacy, Changchun City") can be split into tokens in several ways.

Bigram algorithm: it's simple and fast.

长春
春市
市长
长春
春药
药店
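
The bigram split above can be reproduced with a sliding window over the characters. This is a minimal sketch in plain Python, not ML.NET code; the function name `char_ngrams` is hypothetical and only illustrates the idea.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return all overlapping character n-grams of `text`, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n one character at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


bigrams = char_ngrams("长春市长春药店")
print(bigrams)  # ['长春', '春市', '市长', '长春', '春药', '药店']
```

No dictionary is needed, which is why this approach is simple and fast, but it also produces tokens such as 春市 and 市长 that cross word boundaries.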

Standard algorithm.

长春市
长春
药店
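
The issue does not say which method "standard algorithm" refers to. One common dictionary-based approach that yields exactly this split is forward maximum matching, sketched below under that assumption. The function name `forward_max_match` and the three-word toy dictionary are hypothetical; a real segmenter would use a much larger lexicon.

```python
def forward_max_match(text: str, dictionary: set[str]) -> list[str]:
    """Greedy left-to-right segmentation: at each position take the
    longest dictionary word, falling back to a single character."""
    max_len = max(map(len, dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy dictionary (an assumption for illustration only).
toy_dictionary = {"长春市", "长春", "药店"}
print(forward_max_match("长春市长春药店", toy_dictionary))
# ['长春市', '长春', '药店']
```

Unlike the bigram split, the result depends entirely on the dictionary: with an empty dictionary the function degrades to one token per character.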

I noticed that ML.NET includes an NGramNgramExtractor class that supports the N-gram algorithm. Does it support Chinese? Transforms.TextTransformLanguage includes English, French, German, Dutch, Italian, Spanish, and Japanese.

If not, how can I implement custom text segmentation for other languages? I hope a future version will support custom text extraction.

Thanks.

Metadata

Labels: API (issues pertaining to the friendly API), enhancement (new feature or request), question (further information is requested)
