[API Proposal]: ReadOnlySpan<char> overloads for StringNormalizationExtensions #87757

@udaken

Edited by @tarekgh

Background and motivation

To check whether the text in a ReadOnlySpan<char> is Unicode normalized, we currently have to convert it to a string first.
ReadOnlySpan<char> overloads are needed to avoid that allocation.
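
For illustration, the workaround described above looks roughly like the following sketch. The class and method names (`SpanNormalizationWorkaround`, `IsNormalizedViaString`) are hypothetical and exist only for this example; the only framework calls used are the `string(ReadOnlySpan<char>)` constructor and `String.IsNormalized`.

```csharp
using System;
using System.Text;

// Hypothetical helper showing today's allocation-based workaround.
public static class SpanNormalizationWorkaround
{
    // Checks normalization by first copying the span into a new string.
    public static bool IsNormalizedViaString(
        ReadOnlySpan<char> text,
        NormalizationForm form = NormalizationForm.FormC)
    {
        // Heap allocation proportional to text.Length; this is the cost
        // the proposed span overloads are meant to remove.
        string copy = new string(text);
        return copy.IsNormalized(form);
    }
}
```

Calling `SpanNormalizationWorkaround.IsNormalizedViaString(hugeArray.AsSpan(start, length))` copies the whole slice just to answer a yes/no question.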

API Proposal

namespace System;

// Does not throw, even when the input contains an invalid character sequence.

public static class StringNormalizationExtensions {
+    public static bool IsNormalized(this ReadOnlySpan<char> source, NormalizationForm normalizationForm = NormalizationForm.FormC); // returns false for an invalid sequence
+    public static OperationStatus TryNormalize(this ReadOnlySpan<char> source, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
}
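
To make the intended non-throwing contract concrete, here is a minimal sketch that emulates it on top of the existing `String.Normalize`. It is an assumption-laden stand-in, not the proposed implementation: `NormalizationSketch` and `TryNormalizeSketch` are hypothetical names, the mapping of failures to `OperationStatus.InvalidData` and `OperationStatus.DestinationTooSmall` is my reading of the proposal, and the sketch still allocates.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical stand-in for the proposed API; not the runtime's implementation.
public static class NormalizationSketch
{
    public static OperationStatus TryNormalizeSketch(
        ReadOnlySpan<char> source,
        Span<char> destination,
        out int charsWritten,
        NormalizationForm normalizationForm = NormalizationForm.FormC)
    {
        charsWritten = 0;
        string normalized;
        try
        {
            // Allocates; a real span-based overload would avoid this.
            normalized = new string(source).Normalize(normalizationForm);
        }
        catch (ArgumentException)
        {
            // String.Normalize throws ArgumentException for invalid input,
            // which this sketch reports as a status instead of an exception.
            return OperationStatus.InvalidData;
        }

        if (normalized.Length > destination.Length)
            return OperationStatus.DestinationTooSmall;

        normalized.AsSpan().CopyTo(destination);
        charsWritten = normalized.Length;
        return OperationStatus.Done;
    }
}
```

A caller would check the returned status: retry with a larger buffer on `DestinationTooSmall`, and reject the input on `InvalidData`.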

Original Proposal - Alternative Proposal

namespace System;

// Throws on an invalid sequence, as String.Normalize does.

public static class StringNormalizationExtensions {
+    public static bool IsNormalized(this ReadOnlySpan<char> strInput, NormalizationForm normalizationForm = NormalizationForm.FormC);
+    public static bool TryNormalize(this ReadOnlySpan<char> strInput, Span<char> destination, out int charsWritten, NormalizationForm normalizationForm = NormalizationForm.FormC);
}

API Usage

ReadOnlySpan<char> partOfHugeArray = hugeArray.AsSpan(start, length);
if (!partOfHugeArray.IsNormalized())
    throw new Exception("Text must be Unicode normalized.");
// ...

Alternative Designs

No response

Risks

No response

Labels: api-approved, area-System.Globalization, in-pr
