x/text/unicode/bidi: Perhaps incorrect implementation of algorithm #71809

Open
@pgundlach

Go version

go version go1.24.0 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='cc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='c++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/var/folders/md/l2nnr5490tq114003qtxfnk40000gn/T//gocache'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/patrick/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/md/l2nnr5490tq114003qtxfnk40000gn/T/go-build220397128=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/patrick/prog/go/segmentize/go.mod'
GOMODCACHE='/Users/patrick/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/patrick/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.24.0/libexec'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/patrick/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.24.0/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

package main

import (
	"fmt"
	"log"

	"golang.org/x/text/unicode/bidi"
)

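func dothings() error {
	// An Arabic letter (class AL), a space (WS), and a Latin letter (L).
	text := "ع a"
	p := bidi.Paragraph{}
	// Use left-to-right as the default paragraph direction.
	_, err := p.SetString(text, bidi.DefaultDirection(bidi.LeftToRight))
	if err != nil {
		return err
	}

	// Order runs the bidi algorithm and returns the resulting runs.
	order, err := p.Order()
	if err != nil {
		return err
	}
	for v := range order.NumRuns() {
		thisrun := order.Run(v)
		// Direction prints as 0 for LeftToRight and 1 for RightToLeft.
		fmt.Println(thisrun.Direction())
		fmt.Printf("~~> thisrun.String() %#v\n", thisrun.String())
	}
	return nil
}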

func main() {
	if err := dothings(); err != nil {
		log.Fatal(err)
	}
}

What did you see happen?

1
~~> thisrun.String() "ع "
0
~~> thisrun.String() "a"

What did you expect to see?

1
~~> thisrun.String() "ع"
0
~~> thisrun.String() " a"

Notice the space now belongs to the second output.

See https://util.unicode.org/UnicodeJsps/bidi.jsp?a=ع+a&p=LTR and https://util.unicode.org/UnicodeJsps/bidic.jsp?s=ع+a&b=0&u=140&d=2

The output of the first page is:

Memory Position    0      1         2
Character          ‎ع‎      (space)   a
Bidi Class         AL     WS        L
Rules Applied      W3R    N2L
Resulting Level    L1     L0        L0
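
To make the two rules named above concrete, here is a minimal sketch of how W3 and N2 from UAX #9 lead to those levels. It is not the x/text implementation: `Class` and `resolveLevels` are hypothetical names, and the code assumes a paragraph at embedding level 0 that contains only L, R, AL, and WS characters and no explicit formatting characters.

```go
package main

import "fmt"

// Class is a tiny, hypothetical subset of the Unicode bidi classes,
// just enough for the example in this issue.
type Class int

const (
	L  Class = iota // strong left-to-right
	R               // strong right-to-left
	AL              // Arabic letter
	WS              // whitespace (a neutral)
)

// resolveLevels is a simplified sketch, not part of x/text. It assumes
// paragraph embedding level 0 and only the four classes declared above.
func resolveLevels(classes []Class) []int {
	c := append([]Class(nil), classes...)

	// W3: change every AL to R.
	for i := range c {
		if c[i] == AL {
			c[i] = R
		}
	}

	// N1/N2: resolve each sequence of neutrals.
	for i := 0; i < len(c); {
		if c[i] != WS {
			i++
			continue
		}
		j := i
		for j < len(c) && c[j] == WS {
			j++
		}
		// At level 0, the text boundaries (sos/eos) count as L.
		before, after := L, L
		if i > 0 {
			before = c[i-1]
		}
		if j < len(c) {
			after = c[j]
		}
		dir := L // N2: fall back to the embedding direction
		if before == after {
			dir = before // N1: same strong direction on both sides
		}
		for k := i; k < j; k++ {
			c[k] = dir
		}
		i = j
	}

	// I1: at an even level, R goes up one level and L stays.
	levels := make([]int, len(c))
	for i, cl := range c {
		if cl == R {
			levels[i] = 1
		}
	}
	return levels
}

func main() {
	fmt.Println(resolveLevels([]Class{AL, WS, L})) // prints [1 0 0]
}
```

The space sits between an R (from W3) and an L, so N1 does not apply and N2 gives it the embedding direction L, which is why it ends up at level 0 together with the letter a.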

The resulting levels are 1, 0, and 0, so the first run contains only the Arabic letter and the second run contains both the space character and the letter a.
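
The step from levels to runs can be sketched as follows. This is only an illustration of the expected grouping, not how x/text builds its runs; `levelRun` and `splitByLevel` are hypothetical names, and the levels are passed in by hand as one entry per rune.

```go
package main

import "fmt"

// levelRun is a hypothetical type for this sketch: a maximal stretch of
// text whose characters all share one resolved level.
type levelRun struct {
	Level int
	Text  string
}

// splitByLevel groups the runes of text into runs of equal level.
// levels must hold one entry per rune (not per byte).
func splitByLevel(text string, levels []int) []levelRun {
	var runs []levelRun
	i := 0
	for _, r := range text {
		if n := len(runs); n > 0 && runs[n-1].Level == levels[i] {
			// Same level as the previous rune: extend the current run.
			runs[n-1].Text += string(r)
		} else {
			// Level changed (or first rune): start a new run.
			runs = append(runs, levelRun{Level: levels[i], Text: string(r)})
		}
		i++
	}
	return runs
}

func main() {
	for _, run := range splitByLevel("ع a", []int{1, 0, 0}) {
		fmt.Printf("level %d: %q\n", run.Level, run.Text)
	}
}
```

With levels 1, 0, 0 this yields a level-1 run holding the Arabic letter and a level-0 run holding " a", which matches the expected output above.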
