Performance regression for BitMatrix multiplication in 1.11.2 #56954

@jonocarroll

I believe there might be a significant performance regression between 1.11.1 and 1.11.2. I ran into it after upgrading and have narrowed it down to matrix multiplication of two large BitMatrix objects.

I found the following (after running the multiplication a couple of times already):

1.11.0:

a = BitMatrix(undef, (3000, 3000));
@time a * a
  0.014537 seconds (3 allocations: 68.665 MiB, 13.85% gc time)

1.11.1:

a = BitMatrix(undef, (3000, 3000));
@time a * a
  0.018001 seconds (3 allocations: 68.665 MiB, 19.69% gc time)

1.11.2:

a = BitMatrix(undef, (3000, 3000));
@time a * a
 11.244051 seconds (3 allocations: 68.665 MiB, 0.02% gc time)

The gc percentage drops sharply, but only because the total runtime grows by a factor of roughly 600 (0.018 s to 11.2 s); the absolute gc time stays at a few milliseconds.
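Since `BitMatrix(undef, ...)` leaves the contents unspecified, a deterministic input may make the comparison across versions easier to repeat. The following is a minimal sketch, not part of the original measurements: `time_bitmul` is a hypothetical helper name, and the checkerboard fill pattern is an arbitrary assumption.

```julia
# Hypothetical helper: build a deterministic n×n BitMatrix and time one product.
function time_bitmul(n::Integer)
    # Checkerboard pattern; broadcasting a Bool-valued function yields a BitMatrix.
    a = isodd.((1:n) .+ (1:n)')
    a * a                      # warm-up run so compilation is not timed
    return @elapsed a * a      # wall-clock seconds for one multiplication
end

# n = 3000 matches the size used in the timings above.
println("Julia ", VERSION, ": ", time_bitmul(3000), " s")
```

Running the same script under each Julia version should show whether the slowdown depends on the matrix contents or only on the version.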

I also profiled a fuller example (where I first encountered this) and found a large increase in the samples attributed to a setindex! call:

1.11.0:

   ╎    ╎    ╎    ╎    ╎    ╎   10  …c/matmul.jl:114; *
   ╎    ╎    ╎    ╎    ╎    ╎    1   …c/matmul.jl:117; matprod_dest
   ╎    ╎    ╎    ╎    ╎    ╎     1   …bitarray.jl:375; similar
   ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   …ase/boot.jl:599; Array
   ╎    ╎    ╎    ╎    ╎    ╎    ╎  1   …ase/boot.jl:592; Array
   ╎    ╎    ╎    ╎    ╎    ╎    ╎   1   …ase/boot.jl:582; Array
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    1   …ase/boot.jl:535; new_as_memoryref
  1╎    ╎    ╎    ╎    ╎    ╎    ╎     1   …ase/boot.jl:516; GenericMemory
   ╎    ╎    ╎    ╎    ╎    ╎    9   …c/matmul.jl:253; mul!
   ╎    ╎    ╎    ╎    ╎    ╎     9   …c/matmul.jl:285; mul!
   ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   …c/matmul.jl:287; _mul!
   ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   …c/matmul.jl:868; generic_matmatmul!
   ╎    ╎    ╎    ╎    ╎    ╎    ╎   2   …c/matmul.jl:892; _generic_matmatmul!(…
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    2   …actarray.jl:1312; getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎     2   …actarray.jl:1341; _getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 2   …bitarray.jl:682; getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  2   …bitarray.jl:676; unsafe_bitgetind…
  2╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   2   …sentials.jl:916; getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎   5   …c/matmul.jl:893; _generic_matmatmul!(…
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    5   …simdloop.jl:77; macro expansion
   ╎    ╎    ╎    ╎    ╎    ╎    ╎     5   …c/matmul.jl:894; macro expansion
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1   …actarray.jl:1312; getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  1   …actarray.jl:1341; _getindex
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   1   …actarray.jl:1347; _to_linear_ind…
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    1   …actarray.jl:3048; _sub2ind
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     1   …actarray.jl:98; axes
   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +1 1   …bitarray.jl:105; size
  1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +2 1   …ase/Base.jl:49; getproperty
  4╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 4   …se/array.jl:983; setindex!
   ╎    ╎    ╎    ╎    ╎    ╎    ╎   2   …c/matmul.jl:896; _generic_matmatmul!(…	
  2╎    ╎    ╎    ╎    ╎    ╎    ╎    2   …se/range.jl:908; iterate

1.11.2:

     ╎    ╎    ╎    ╎    ╎    ╎   15355 …c/matmul.jl:114; *
     ╎    ╎    ╎    ╎    ╎    ╎    1     …c/matmul.jl:117; matprod_dest
     ╎    ╎    ╎    ╎    ╎    ╎     1     …bitarray.jl:375; similar
     ╎    ╎    ╎    ╎    ╎    ╎    ╎ 1     …ase/boot.jl:599; Array
     ╎    ╎    ╎    ╎    ╎    ╎    ╎  1     …ase/boot.jl:592; Array
     ╎    ╎    ╎    ╎    ╎    ╎    ╎   1     …ase/boot.jl:582; Array
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    1     …ase/boot.jl:535; new_as_memoryref
    1╎    ╎    ╎    ╎    ╎    ╎    ╎     1     …ase/boot.jl:516; GenericMemory
     ╎    ╎    ╎    ╎    ╎    ╎    15354 …c/matmul.jl:253; mul!
     ╎    ╎    ╎    ╎    ╎    ╎     15354 …c/matmul.jl:285; mul!
     ╎    ╎    ╎    ╎    ╎    ╎    ╎ 15354 …c/matmul.jl:287; _mul!
     ╎    ╎    ╎    ╎    ╎    ╎    ╎  15354 …c/matmul.jl:868; generic_matmatmul!
     ╎    ╎    ╎    ╎    ╎    ╎    ╎   15280 …c/matmul.jl:895; _generic_matmatm…
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    15280 …simdloop.jl:77; macro expansion
     ╎    ╎    ╎    ╎    ╎    ╎    ╎     15280 …c/matmul.jl:896; macro expansion
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 535   …actarray.jl:1312; getindex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  535   …actarray.jl:1341; _getindex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   404   …actarray.jl:1347; _to_linear…
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    404   …actarray.jl:3048; _sub2ind
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     404   …actarray.jl:98; axes
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +1 404   …bitarray.jl:105; size
  404╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  +2 404   …ase/Base.jl:49; getproperty
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   131   …bitarray.jl:682; getindex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    131   …bitarray.jl:676; unsafe_bit…
  131╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     131   …sentials.jl:917; getindex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 502   …se/array.jl:930; getindex
  502╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  502   …sentials.jl:917; getindex
14243╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 14243 …se/array.jl:994; setindex!
     ╎    ╎    ╎    ╎    ╎    ╎    ╎   74    …c/matmul.jl:898; _generic_matmatm…
   74╎    ╎    ╎    ╎    ╎    ╎    ╎    74    …se/range.jl:908; iterate
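For anyone trying to reproduce the profiles above, a tree like these can be collected with the `Profile` standard library. This is a minimal sketch on the bare multiplication rather than the fuller example that was actually profiled; the matrix size and the `mincount` threshold are arbitrary assumptions.

```julia
using Profile

# Deterministic BitMatrix, smaller than the 3000×3000 case to keep the run short.
a = isodd.((1:1000) .+ (1:1000)')
a * a                            # warm-up so compilation does not dominate the samples

Profile.clear()                  # discard samples from any earlier profiling run
@profile a * a                   # sample the multiplication
Profile.print(; mincount = 5)    # tree view, hiding frames with fewer than 5 samples
```

On a version where the multiplication is fast, the run may finish before many samples are taken; a larger matrix or repeating the call in a loop would give a fuller tree.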

The difference doesn't seem to appear for a similarly sized Matrix{Int} (here multiplied by its adjoint, a * a'):

1.11.1:

a = Matrix{Int}(undef, 3000, 3000);
@time a * a'
  7.678765 seconds (4 allocations: 68.665 MiB, 0.45% gc time)

1.11.2:

a = Matrix{Int}(undef, 3000, 3000);
@time a * a'
  7.780284 seconds (4 allocations: 68.665 MiB, 0.49% gc time)
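The BitMatrix timings use `a * a` while the Matrix{Int} timings use `a * a'`, so a like-for-like comparison on identical values may help isolate the regression. The sketch below is one possible way to do that; `timed_mul` is a hypothetical helper, and the size 500 is an assumption chosen only to keep the run short.

```julia
# Hypothetical helper: time `x * y` once, after a warm-up call.
function timed_mul(x, y)
    x * y                        # warm-up, excludes compilation time
    return @elapsed x * y
end

n = 500                          # smaller than the 3000 used above
bits = isodd.((1:n) .+ (1:n)')   # BitMatrix with a deterministic pattern
ints = Matrix{Int}(bits)         # the same values stored as Matrix{Int}

@assert bits * bits == ints * ints   # both element types give the same product

println("BitMatrix   a * a : ", timed_mul(bits, bits), " s")
println("Matrix{Int} a * a : ", timed_mul(ints, ints), " s")
println("Matrix{Int} a * a': ", timed_mul(ints, ints'), " s")
```

If only the first line changes between 1.11.1 and 1.11.2, that would support the slowdown being specific to BitMatrix inputs.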

Hopefully someone can reproduce this.

My system:

Platform Info:
  OS: macOS (arm64-apple-darwin24.0.0)
  CPU: 11 × Apple M3 Pro
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, apple-m2)
