BoundsError when joining AnnotatedStrings with distinct label orderings #54860

Closed
@caleb-allen

I think I've found a bug that occurs when joining AnnotatedStrings whose annotations were not constructed the way StyledStrings constructs them, specifically when the ordering of annotation labels differs between the strings.

With simple annotations on two AnnotatedString instances, join works as expected:

julia> import Base: AnnotatedString, annotatedstring, annotations, annotate!

julia> a = AnnotatedString("the quick fox ", [(1:14, :FOO => "bar")])
"the quick fox "

julia> b = AnnotatedString("jumped over the lazy dog", [(1:24, :FOO => "bar")])
"jumped over the lazy dog"

julia> annotations(a * b) # concat only, without joining the annotations
2-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (1:14, :FOO => "bar")
 (15:38, :FOO => "bar")

julia> annotations(join([a, b]))
1-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (1:38, :FOO => "bar")

However, joining the string a above with an annotated string whose labels were inserted in a different order throws a BoundsError:

julia> c = AnnotatedString("jumped over the lazy dog", [(1:5, :BAZ => "bar"), (1:24, :FOO => "bar")])
"jumped over the lazy dog"

julia> annotations(a * c)
3-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (1:14, :FOO => "bar")
 (15:19, :BAZ => "bar")
 (15:38, :FOO => "bar")

julia> annotations(join([a, c]))
ERROR: BoundsError: attempt to access 1-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}} at index [0]
Stacktrace:
  [1] throw_boundserror(A::Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}, I::Tuple{Int64})
    @ Base ./essentials.jl:14
  [2] getindex
    @ ./essentials.jl:892 [inlined]
  [3] _insert_annotations!(io::Base.AnnotatedIOBuffer, annotations::Vector{Tuple{UnitRange{…}, Pair{…}}}, offset::Int64)
    @ Base ./strings/annotated.jl:600
  [4] _insert_annotations!
    @ ./strings/annotated.jl:591 [inlined]
  [5] write
    @ ./strings/annotated.jl:499 [inlined]
  [6] print
    @ ~/.julia/juliaup/julia-1.11.0-beta2+0.x64.linux.gnu/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:255 [inlined]
  [7] join(io::Base.AnnotatedIOBuffer, iterator::Vector{AnnotatedString{String}}, delim::String)
    @ Base ./strings/io.jl:352
  [8] join
    @ ./strings/io.jl:349 [inlined]
  [9] _join_preserve_annotations(::Vector{AnnotatedString{String}})
    @ Base ./strings/io.jl:359
 [10] join(iterator::Vector{AnnotatedString{String}})
    @ Base ./strings/io.jl:366
 [11] top-level scope
    @ REPL[30]:1
Some type information was truncated. Use `show(err)` to see complete types.

The BoundsError does not appear to occur if the annotation being joined is listed first on both strings (:FOO first in each):

julia> d = AnnotatedString("jumped over the lazy dog", [(1:24, :FOO => "bar"), (1:5, :BAZ => "bar")])
"jumped over the lazy dog"

julia> join([a, d])
"the quick fox jumped over the lazy dog"

julia> join([a, d]) |> annotations
2-element Vector{Tuple{UnitRange{Int64}, Pair{Symbol, Any}}}:
 (1:38, :FOO => "bar")
 (15:19, :BAZ => "bar")
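
For reference, the order-independent result one might expect from join([a, c]) can be sketched with a small standalone function. This is only an illustration: `merge_adjacent` is a hypothetical helper, not part of Base or StyledStrings, and it works on plain vectors of (range, label => value) tuples in the shape printed above rather than on AnnotatedString itself. It makes no claim about how Base implements or should implement the merge.

```julia
# Hypothetical helper (not a Base API): merge annotations whose label/value
# pair matches an earlier annotation that ends exactly where this one begins,
# regardless of the order in which the annotations are listed.
const Annot = Tuple{UnitRange{Int}, Pair{Symbol, Any}}

function merge_adjacent(anns::Vector{Annot})
    merged = Annot[]
    for (region, ann) in anns
        # Look for an already-collected annotation with the same pair
        # whose range ends immediately before `region` starts.
        idx = findlast(m -> last(m) == ann && last(first(m)) + 1 == first(region), merged)
        if idx === nothing
            push!(merged, (region, ann))
        else
            # Extend the earlier annotation to cover the new region too.
            merged[idx] = (first(merged[idx][1]):last(region), ann)
        end
    end
    return merged
end

# The annotations reported for `a * c` above, where :BAZ precedes :FOO.
concatenated = Annot[(1:14, :FOO => "bar"), (15:19, :BAZ => "bar"), (15:38, :FOO => "bar")]
merged = merge_adjacent(concatenated)
```

With this input, `merged` holds `(1:38, :FOO => "bar")` followed by `(15:19, :BAZ => "bar")`, which matches the annotations shown for join([a, d]).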

This may be related to #54561, as the stack trace shows join dispatching into StyledStrings.

This bug is present on Julia 1.11.0-beta2, installed via juliaup:

julia> versioninfo()
Julia Version 1.11.0-beta2
Commit edb3c92d6a6 (2024-05-29 09:37 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 8 default, 0 interactive, 4 GC (on 8 virtual cores)
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_PKG_USE_CLI_GIT = true
  JULIA_TEST_FAILFAST = true
  JULIA_PKG_PRESERVE_TIERED_INSTALLED = true
