Skip to content

Package CI performance regression #53657

@LilithHafner

Description

@LilithHafner
Member

This NFC PR to Chairmarks apparently made CI on nightly substantially (>2x) slower. I suspect the regression is due to a change in Julia.

Fast (https://github.com/LilithHafner/Chairmarks.jl/actions/runs/8147371039/job/22267929268)

Julia Version 1.12.0-DEV.124
Commit 0311aa4caf (2024-03-04 16:34 UTC)

Slow (https://github.com/LilithHafner/Chairmarks.jl/actions/runs/8174911720/job/22350814940)

Julia Version 1.12.0-DEV.138
Commit 90d84d4ab00 (2024-03-06 00:32 UTC)

Diff
https://github.com/JuliaLang/julia/compare/0311aa4caf..90d84d4ab00

See also: #53439

Activity

Zentrik

Zentrik commented on Mar 9, 2024

@Zentrik
Member

Bisected to fb71a5d

Bisect Script
using Downloads, CodecZlib, Tar, LibGit2

start_sha = "0311aa4caf461d05fb408de67dafc582e2ada9a0"
end_sha = "90d84d4ab00a34cd9c75b06b77925aaf8afbabfd"

bisect_command = raw"""
using Pkg, Test
Pkg.add("Chairmarks")
t = @timed try
    Pkg.test("Chairmarks")
catch
end
t.time
"""

function run_commit(file, commit, os, arch)
    version_full = read(`contrib/commit-name.sh $commit`, String)
    major_minor_version = split(version_full, '.')[1:2] |> x->join(x, '.')

    binary_url = "https://julialangnightlies-s3.julialang.org/bin/$os/$arch/$major_minor_version/julia-$(commit[1:10])-$os-$arch.tar.gz"
    download_path = Downloads.download(binary_url)

    result = mktempdir() do binary_dir
        # extract
        open(download_path) do io
            stream = GzipDecompressorStream(io)
            Tar.extract(stream, binary_dir)
        end

        read(Cmd(
            `$binary_dir/julia-$(commit[1:10])/bin/julia --project=$binary_dir -E include\(\"$(file[1])\"\)`,
            ignorestatus=true
            ), String
        )
    end

    parse(Float64, result)
end

function bisect_perf(bisect_command, start_sha, end_sha, factor=1.5, os="linux", arch="x86_64")
    dir = mktempdir()
    cd(dir)
    repo = LibGit2.clone("https://github.com/JuliaLang/julia.git", dir)

    # Get the start and end commit objects
    start_commit = LibGit2.GitCommit(repo, start_sha)
    end_commit = LibGit2.GitCommit(repo, end_sha)

    # Get the commit range
    commit_range::Vector{String} = LibGit2.with(LibGit2.GitRevWalker(repo)) do walker
        LibGit2.map((oid, repo)->string(oid), walker; range=string(LibGit2.GitHash(start_commit))*".."*string(LibGit2.GitHash(end_commit)), by=LibGit2.Consts.SORT_REVERSE)
    end
    pushfirst!(commit_range, start_sha)

    # Test script, makes it easy to run bisect command
    file = mktemp()
    write(file[1], bisect_command)

    original_time = run_commit(file, start_sha, os, arch)
    println("Starting commit took $original_time")

    left = 1
    right = length(commit_range)
    i = 0
    while left < right
        i = (left + right) ÷ 2
        commit = commit_range[i]

        result = run_commit(file, commit, os, arch)
        
        println("Commit $commit " * (result <= factor * original_time ? "succeeded" : "failed") * " in $result")

        if result <= factor * original_time
            left = i + 1
        else
            right = i
        end
    end

    return left, commit_range[left]
end

bisect_perf(bisect_command, start_sha, end_sha)
LilithHafner

LilithHafner commented on Mar 9, 2024

@LilithHafner
MemberAuthor

Thanks for the bisection! I don't suppose I could tempt you into figuring out which of the PRs to Pkg is responsible? There are only 4 candidates (listed here) and I suspect that JuliaLang/Pkg.jl#3820 is the culprit but it would be nice to know for sure.

IanButterworth

IanButterworth commented on Mar 9, 2024

@IanButterworth
SponsorMember

What is the latest master version tested here? There's been some instability

LilithHafner

LilithHafner commented on Mar 9, 2024

@LilithHafner
MemberAuthor

As of today (LilithHafner/Chairmarks.jl#77)

Julia Version 1.12.0-DEV.154
Commit 53048b2ab1e (2024-03-09 10:44 UTC)

Chairmarks still takes about 2x longer to test on nightly than on 1.10.

IanButterworth

IanButterworth commented on Mar 9, 2024

@IanButterworth
SponsorMember

Would you mind looking into the log line times to see where the main regressions are? Currently I'm looking into Base precompile sysimage/pkgimage coverage regression

KristofferC

KristofferC commented on Mar 9, 2024

@KristofferC
Member

Pkg is not loaded into the process doing the testing and

Test Summary: |   Pass  Broken   Total     Time
Chairmarks    | 632151       7  632158  2m10.5s

shows that it is the testing itself that is slower?

So the only impact I can see from Pkg directly is the precompile files it generates. And the commits in that list touch that. So maybe it has something to do with that. Or something completely different.

LilithHafner

LilithHafner commented on Mar 9, 2024

@LilithHafner
MemberAuthor

In the OP

  • julia-buildpkg went from 5s to 42s
  • cache went from 5s to 17s
  • julia-processcoverage went from 4s to 47s
  • julia-runtest went from 100s to 187s (+87s)
  • testing itself (a subset of julia-runtest) went from 89s to 131s (+42s)

It looks like a combination of most parts of the pipeline getting slower.

Zentrik

Zentrik commented on Mar 9, 2024

@Zentrik
Member

I only bisected the testing itself, so it's possible that the other slowdowns are from some other commit.

I don't suppose I could tempt you into figuring out which of the PRs to Pkg is responsible?

I think to test the different commits I'm supposed to just do add https://github.com/JuliaLang/Pkg.jl/commit/e7d740ac82fa4c289bb58f9acb6e60eebb6d8a8e.
If that's true, the timings I was getting were far too noisy to work out what was going on.

LilithHafner

LilithHafner commented on Mar 9, 2024

@LilithHafner
MemberAuthor

To be honest, I don't know what the best way to bisect this within Pkg.jl is. Rebuilding Julia from source using a different Pkg version would probably do it, but I'm guessing that's a bit involved. It might be best to hand if over to @KristofferC who authored all 4 of those commits.

giordano

giordano commented on Mar 9, 2024

@giordano
Member
IanButterworth

IanButterworth commented on Mar 9, 2024

@IanButterworth
SponsorMember

You don't need to change the UUID anymore, just set --project to the Pkg repo on startup

KristofferC

KristofferC commented on Mar 11, 2024

@KristofferC
Member

The only thing that seemed to have gotten slower here is the "Persistent task" test in Aqua.jl https://github.com/JuliaTesting/Aqua.jl/blob/b51f80035166f716b12a896587a84c89601c86dc/src/persistent_tasks.jl#L2 where it takes 40s on fb71a5d vs 3.3s on 427da5c.

Specifically, the Pkg.precompile in https://github.com/JuliaTesting/Aqua.jl/blob/b51f80035166f716b12a896587a84c89601c86dc/src/persistent_tasks.jl#L103 went from

  1.895197 seconds (3.28 M allocations: 201.884 MiB, 6.62% gc time, 66.50% compilation time: 6% of which was recompilation)

to

 18.610014 seconds (56.49 M allocations: 2.832 GiB, 2.69% gc time, 96.87% compilation time: 1% of which was recompilation)

So from what I can see, the only difference is:

~/JuliaTests/TaskProj
❯ ~/juliaslow/julia  --startup-file=no --project -e 'using Pkg; @time Pkg.precompile()'
  0.579358 seconds (1.93 M allocations: 130.249 MiB, 19.62% gc time, 61.13% compilation time: 10% of which was recompilation)

~/JuliaTests/TaskProj
❯ ~/juliafast/julia  --startup-file=no --project -e 'using Pkg; @time Pkg.precompile()'         
 13.331489 seconds (8.72 M allocations: 461.582 MiB, 0.37% gc time, 98.81% compilation time: <1% of which was recompilation)

which says to me we have to precompile the Base.precompilepkgs function (which I think @IanButterworth started doing in #53682)..

KristofferC

KristofferC commented on Mar 11, 2024

@KristofferC
Member

It might be best to hand if over to @KristofferC who authored all 4 of those commits.

I don't think that is the best attitude towards someone wanting to learn how to do something. My analysis above required no advanced techniques, it was just basic stuff like adding @time and going down deeper into the functions.

KristofferC

KristofferC commented on Mar 11, 2024

@KristofferC
Member

The Aqua slowdown is already fixed on master by JuliaLang/Pkg.jl#3835.

Removing the Aqua test I get on master

Test Summary: |   Pass  Broken   Total   Time
Chairmarks    | 604534       5  604539  29.5s
     Testing Chairmarks tests passed 

and on 427da5c

Test Summary: |   Pass  Broken   Total   Time
Chairmarks    | 535213       5  535218  29.6s
     Testing Chairmarks tests passed 

so the change to Pkg did not affect this so I am "handing this back" now.

IanButterworth

IanButterworth commented on Mar 11, 2024

@IanButterworth
SponsorMember

#53657 (comment) stated the issue remained on 53048b2 which is after JuliaLang/Pkg.jl#3835 was included in master (in a998082)

Maybe there's a more general 1.12 slowdown though, as that comment compared to 1.10, not earlier 1.12 runs

KristofferC

KristofferC commented on Mar 11, 2024

@KristofferC
Member

Locally, for @time Pkg.test() I get a total time of 46s on master vs 44s on 427da5c and 39s on 1.10.

LilithHafner

LilithHafner commented on Mar 11, 2024

@LilithHafner
MemberAuthor

I'm guessing this regression has something to do with precomilation or caching as it is present in CI but not locally.

IanButterworth

IanButterworth commented on Mar 11, 2024

@IanButterworth
SponsorMember

How about local builds vs. juliaup's nightly build? (wondering if it's JULIA_CPU_TARGET not being set properly)

KristofferC

KristofferC commented on Mar 11, 2024

@KristofferC
Member

I'm guessing this regression has something to do with precomilation or caching as it is present in CI but not locally.

Probably not since older cache files are invalid for new julia versions.

KristofferC

KristofferC commented on Apr 16, 2024

@KristofferC
Member

I think the cause of the major issue was the missing Pkg precompilation, the rest is likely a dup of "general" slowdowns as reported in #53511 and #53570 so I am not sure another issue on that is worth having.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    performanceMust go fasterregressionRegression in behavior compared to a previous version

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @giordano@KristofferC@IanButterworth@Zentrik@LilithHafner

        Issue actions

          Package CI performance regression · Issue #53657 · JuliaLang/julia