Fix construction of conflict sets once and for all (WIP) #3357

edsko · 2016-04-17T15:36:02Z

At @kosmikus ' request I have started a new solver branch in the main repository so that we can push solver fixes to this branch and then once we are happy with them, push them to master.

I realized that my understanding of conflict sets and goal reason chains was flawed. I think it's not just me though; they badly need explaining and documenting. To this end I started writing a blog post about the relation between conflict sets, pre assignments, constrained instances, and goal reason chains; a first draft can be found at http://www.well-typed.com/blog/preview/backjumping/ .

On this branch so far we first have @grayjay 's test case for backjumping after the SIR check. We previously fixed this in #3221, but I now realize the fix there was incorrect. (In my defence, I did mention I was not very convinced.) I hope it is correct now (and if so, this PR will obsolete #3221). In addition we have a fix to the cycle check detection (#3170). That check too was wrong, and for similar reasons. The correct solution is also similar in both cases. Instead of collecting which packages are involved in the conflict, we instead collect Goals (packages + goal reason chains). Then to construct the conflict set we construct conflict sets for each of these individual goals and then merge them. This is just a generalization of the process that happens in during validation (described in the aforelinked blog post). For the cycle check this is a tad awkward since we start with the final reverse dependency map which does not include goal reason chains, but we can just record these separately.

Sadly I am running out of time now. I was hoping to get more done today but I cannot. However, I think what needs to happen in the linker is similar (#3327) : instead of trying to construct a separate lgBlame set, we instead should collect Goals as members of the link group, and use those to construct the conflict set. separate lgBlame is not necessary (I hope); though perhaps @grayjay will correct me on that. Note that #3268 is (mostly) independent from this, as that one is less about constructing conflict set as it is about constructing the right link groups in the first place. That said, that PR should probably also be rebased to this branch (or at least, my version of it at https://github.com/edsko/cabal/tree/pr/FixLinkDeps ; still on my TODO list).

It is somewhat frustrating that this is so difficult, but this is a subtle topic and I think fixing these issues and documenting and writing this up is rather important.

@grayjay can I ask you for a favour? Can you rebase all your evil solver unit tests that are still missing from master and add them to this solver branch, even if they are currently failing? That would be very helpful.

@kosmikus

This commit is a pure refactoring, no semantic changes. Submitting as separate PR at @kosmikus request. Tihs moves `ConflictSet` and `Var` into their own modules.

@kosmikus

In the original implemented we recorded only the two conflicting packages; in the "fix" in #3221 we include the GRC for one of the two packages, but not the other. This is also wrong, although slightly less so and @kosmikus tells me that it's actually rather difficult to find an example that exposes the remaining inaccuracy. I'll have to take him at his word on that :) In this commit we use the GRC for both goals instead.

The cycle check was not constructing the correct conflict sets; when we found a cycle, we constructed a conflict set consisting of all choices that we made so far, as long as those choices directly involved a package in the cycle. But sometimes the dependencies are more subtle than that. Now instead we construct conflict sets from all packages _and their goal reason chains_ that were involved in the cycle and then union all those.

grayjay · 2016-04-18T00:28:47Z

Will this branch be merged into 1.24, if there's enough time?

@edsko I'll cherry-pick my tests and your comments on the tests. I can also rebase pr/FixLinkDeps.

Instead of collecting which packages are involved in the conflict, we instead collect Goals (packages + goal reason chains).

Does this change apply to all conflict sets? Will the new ConflictSet type only allow insertion of Goals?

separate lgBlame is not necessary (I hope); though perhaps @grayjay will correct me on that.

Here's the reason that I thought we needed lgBlame. I think it is possible for packages to become linked for reasons other than the actual linking choice or the introduction of the packages' goals. lgBlame stores those other variables so that they can be added to the conflict set if the link group ever leads to a failure.

Say A depends on B and C, and introduces both goals. B also depends on C, but only when flagB is true. For some dependency problem, the solver has to choose 0.B and 1.B. If it links 0.B and 1.B and tries +flagB, it will have to link 0.C and 1.C. lgBlame for 0.C and 1.C's link group will contain 0.flagB. If linking 1.C leads to a failure, the solver should try -flagB. Without lgBlame, the conflict set would only include 0.C, 1.C, 0.A, and 1.A (as well as any packages that introduced 0.A or 1.A).

edsko · 2016-04-18T03:11:39Z

Will this branch be merged into 1.24, if there's enough time?

I really hope so.

@edsko I'll cherry-pick my tests and your comments on the tests. I can also rebase pr/FixLinkDeps.

If you can rebase pr/FixLinkDeps and push it to this branch, that would be awesome. I am incredibly strapped for time at the moment.

Instead of collecting which packages are involved in the conflict, we instead collect Goals (packages + goal reason chains).

Does this change apply to all conflict sets? Will the new ConflictSet type only allow insertion of Goals?

I think it would make sense to say that ConflictSets can only be constructed from two operations: from a Goal, or by merging other ConflictSets. If that is not the case, then the exceptions would be quite illustrative. But perhaps @kosmikus can comment here as well.

separate lgBlame is not necessary (I hope); though perhaps @grayjay will correct me on that.

Here's the reason that I thought we needed lgBlame. I think it is possible for packages to become linked for reasons other than the actual linking choice or the introduction of the packages' goals. lgBlame stores those other variables so that they can be added to the conflict set if the link group ever leads to a failure.

Say A depends on B and C, and introduces both goals. B also depends on C, but only when flagB is true. For some dependency problem, the solver has to choose 0.B and 1.B. If it links 0.B and 1.B and tries +flagB, it will have to link 0.C and 1.C. lgBlame for 0.C and 1.C's link group will contain 0.flagB. If linking 1.C leads to a failure, the solver should try -flagB. Without lgBlame, the conflict set would only include 0.C, 1.C, 0.A, and 1.A (as well as any packages that introduced 0.A or 1.A).

Yes, it is important that flagB will be included in the conflict set here. However, in your example, the dependency on C will have a goal reason chain that includes flagB; so, we if base this on Goals (packages + goal reason chains), I think this will be okay? I could be wrong though.

kosmikus · 2016-04-18T18:37:32Z

Instead of collecting which packages are involved in the conflict, we instead collect Goals (packages + goal reason chains).

Does this change apply to all conflict sets? Will the new ConflictSet type only allow insertion of Goals?

I think it would make sense to say that ConflictSets can only be constructed from two operations: > from a Goal, or by merging other ConflictSets. If that is not the case, then the exceptions would be quite illustrative. But perhaps @kosmikus can comment here as well.

Yes, I think that makes sense. If we need other ways to create ConflictSets for whatever reason, it still makes sense to explicitly add them to the ConflictSet module now that the type is abstract.

separate lgBlame is not necessary (I hope); though perhaps @grayjay will correct me on that.

Here's the reason that I thought we needed lgBlame. I think it is possible for packages to become linked for reasons other than the actual linking choice or the introduction of the packages' goals. lgBlame stores those other variables so that they can be added to the conflict set if the link group ever leads to a failure.

Say A depends on B and C, and introduces both goals. B also depends on C, but only when flagB is true. For some dependency problem, the solver has to choose 0.B and 1.B. If it links 0.B and 1.B and tries +flagB, it will have to link 0.C and 1.C. lgBlame for 0.C and 1.C's link group will contain 0.flagB. If linking 1.C leads to a failure, the solver should try -flagB. Without lgBlame, the conflict set would only include 0.C, 1.C, 0.A, and 1.A (as well as any packages that introduced 0.A or 1.A).

Yes, it is important that flagB will be included in the conflict set here. However, in your example, the dependency on C will have a goal reason chain that includes flagB; so, we if base this on Goals (packages + goal reason chains), I think this will be okay? I could be wrong though.

Hmm, I'm not sure here, because I'm still not completely confident about some aspects of linking. Normally, if A already depends on C, then it's perfectly possible for C to get GRC A only. If it's later discovered that B also depends on C under certain circumstances, then that GRC would not be updated. In the unlinked scenario, this is ok. If we figure out that there's a conflict involving C, then we do not have a chance to avoid the C goal if we not at least change the choice for A.

With linking, the situation is a bit more complicated. If we link 0.B and 1.B (but not 0.A and 1.A if they both exist), then as @grayjay says, it depends on the choice of flagB if we a forced to link 0.C and 1.C or not. Once again, without looking at the code, the intuition here is the constraint that 0.C and 1.C have to be linked is introduced by the choice of flagB, so written as a constraint, it'd be something like

1.B:flagB => 0.C ~ 1.C

So indeed, it's clear that if a conflict is arising in the context of (not) linking 1.C to 0.C, then 1.B:flagB should definitely be a goal in the conflict set. The question is how we get there. For normal goals, we just have a set of active constraints and if we find a conflict between two constraints, we create a conflict set from the two goals that introduced these constraints. I'm not quite sure how this translates to link groups.

kosmikus · 2016-04-18T18:41:09Z

@edsko Everything that is in this PR up until now looks ok to me.

23Skidoo · 2016-04-18T18:59:47Z

@edsko

Will this branch be merged into 1.24, if there's enough time?

I really hope so.

Since these patches only touch cabal-install code, they can always go into a patch-level release of cabal-install.

kosmikus · 2016-04-18T20:18:21Z

@23Skidoo That's true and good to know, but we are in the somewhat uncomfortable situation that we have merged certain new features (such as in particular the setup/component dependencies) that are currently quite fragile due to subtle bugs. Conflict-set related bugs mean that the solver is incomplete, i.e., it does not find solutions where there are any, and can be very difficult to work around.

On the one hand, it's fortunate that this only affects new features that aren't in wide use yet. On the other hand, it'd be nice not to spoil the experience that users have with these new features by shipping them in a state that is known to be broken.

I'm sorry that I have started looking at these so late. But unfortunately, the issues involved are quite subtle, and solving them properly requires a lot of thinking time. I think that after last week's discussions, @edsko and I are on a good track now, and I'm optimistic that with the help of @grayjay, the three of us can come up with a robust solution soon.

23Skidoo · 2016-04-18T20:44:21Z

@kosmikus @hvr told me that there will be a fourth RC before the GHC 8 release, so we probably can (yet again) postpone the 1.24 release for a short while.

kosmikus · 2016-04-18T20:55:51Z

@23Skidoo :) Ok, I'll try to keep the momentum up this time.

hvr · 2016-04-18T21:29:37Z

@23Skidoo Fwiw, RC4 is officially mentioned in the RC3 announcement: https://mail.haskell.org/pipermail/ghc-devs/2016-April/011862.html :-)

Solver test cases

Rebase fixLinkDeps onto solver branch

kosmikus · 2016-04-22T20:54:37Z

@grayjay Ok, thanks. @edsko, in case you agree, just hit the button :)

23Skidoo · 2016-04-22T21:14:08Z

@kosmikus Will there be any other solver fixes to merge into 1.24?

kosmikus · 2016-04-22T21:28:35Z

@23Skidoo If I'm keeping track correctly, then once this PR is merged, the following PRs can be closed because they have been integrated into / made obsolete by this one:

@edsko, @grayjay: Can you verify this list? Am I missing any PRs? Am I listing PRs that you think aren't fixed by this?

The only other possible candidate for 1.24 that I'm currently aware of and that isn't covered by this PR is #3290. If @edsko agrees that this one (#3357) is ready to merge, I'm planning to talk to him about what he thinks about #3290. I don't fully understand it yet (there are a number of changes outside of the solver, in code I'm less familiar with), and @ezyang seems to claim that it's still buggy ...

ezyang · 2016-04-22T22:02:15Z

I restarted the build to see if it's a transient failure. But looking at the logs, it just seems like Travis got slower recently.

grayjay · 2016-04-23T00:21:45Z

@kosmikus The list of PRs looks correct. I can close mine.

I looked at http://scribu.github.io/travis-stats/#haskell/cabal, and one thing that has increased build times on master is the solver quickcheck tests, since the solver needs to be compiled again. I guess that's another reason to split the solver into a library. Another option is to add the tests to unit-tests. The only problem is that the solver tests can take minutes to find bugs. The default number of tests doesn't take long to run, it just isn't very thorough.

edsko · 2016-04-23T03:39:38Z

Merged with master 😁 Let me just for the record extend a huge kudos once more to @grayjay -- you're a star!

@23Skidoo I have closed all the outstanding PRs that were subsumed by this branch. As @kosmikus says, the only one remaining is #3232/#3290, and I'm inclined to propose that we do not try to merge that before release. However, Andres mentioned that @hvr or @dcoutts said that that PR is actually needed for the nix-local-build stuff to be useable. I don't fully understand why that would the case, but if true, then it's a higher priority than I thought.

grayjay · 2016-04-23T03:48:22Z

@edsko Thanks!

kosmikus · 2016-04-23T05:02:56Z

Thanks to both @grayjay and @edsko for making this happen. Sorry again for letting the original PRs linger for so long, and thanks for nevertheless investing so much time now to make this happen pre-1.24 after all.

Regarding #3290: I think I had misunderstood which bug was responsible. Had a discussion with @dcoutts yesterday, and we verified that the main problem he was thinking about was a conflict set problem that is fixed by the solver branch. I would nevertheless still at least try to understand the current status of #3290 and the trade-offs involved before I make a final decision on whether to try to merge it pre-1.24 or not.

23Skidoo · 2016-04-23T14:35:49Z

@kosmikus @edsko @grayjay Thanks, everyone!

I'll wait for a verdict on #3290 from @kosmikus before releasing 1.24 final.

23Skidoo · 2016-04-23T21:50:23Z

@kosmikus

I've now backported the solver fixes from master to 1.24 (see #3373).

grayjay · 2016-04-23T22:23:52Z

@kosmikus @23Skidoo Thanks!

kosmikus · 2016-04-25T22:19:28Z

@23Skidoo I unfortunately haven't managed to talk to @edsko over the weekend or today. But having looked a bit at #3290 myself, I think I agree with @edsko that we should do that post-1.24. It all still looks somewhat confusing to me, and while it's without doubt nice to have, it doesn't seem to be a big blocker either.

23Skidoo · 2016-04-26T00:05:18Z

@kosmikus OK. Thanks for looking into this.

edsko · 2016-05-07T07:09:26Z

@grayjay I've emailed you with a link to a draft blog post about backjumping and conflict sets; I used the email address listed on your github profile. Let me know if this isn't the right address.

edsko and others added 4 commits April 17, 2016 10:54

Make ConflictSet abstract

1a1f1f9

This commit is a pure refactoring, no semantic changes. Submitting as separate PR at @kosmikus request. Tihs moves `ConflictSet` and `Var` into their own modules.

Add test case for backjumping after enforcing SIR

2cee0fe

23Skidoo added the cabal-install: solver label Apr 17, 2016

edsko and others added 12 commits April 17, 2016 21:42

Explain test in more detail

8ddf587

Add failing test case

91e2be1

Explain test case in a bit more detail

6a4aa92

Add test case for inconsistent dependency bug

98c7fef

Use tasty-expected-failure for failing unit test

8c22df4

Document test case

81b1f1a

Test that setup dependencies of linked packages are linked correctly

2f4c440

Add failing test case for conflict set bug

0fe6258

Sort solver unit tests

6835115

Add issue numbers to solver unit tests

2cfe4e3

Fix link deps

6617ca6

Expect test to pass

bafcd96

This was referenced Apr 18, 2016

Solver test cases #3359

Merged

Rebase fixLinkDeps onto solver branch #3360

Merged

edsko added 2 commits April 19, 2016 19:08

Merge pull request #3359 from grayjay/solver-test-cases

9bf8e81

Solver test cases

Merge pull request #3360 from grayjay/rebase-FixLinkDeps

6949def

Rebase fixLinkDeps onto solver branch

23Skidoo assigned edsko and unassigned kosmikus Apr 22, 2016

This was referenced Apr 23, 2016

Use larger conflict set when solver chooses wrong version for package in link group #3327

Closed

Correctly link setup dependencies of linked packages #3268

Closed

This was referenced Apr 23, 2016

Fix bug in link validation #3237

Closed

Fix construction of conflict sets during link validation #3234

Closed

Fix construction of conflict sets during check for single instance restriction (maybe) #3221

Closed

edsko merged commit a5f0301 into master Apr 23, 2016

grayjay mentioned this pull request Apr 23, 2016

Add quickcheck test to compare dependency solving with and without backjumping #3369

Merged

This was referenced Apr 23, 2016

Solver backjumps too far when linked packages have different flag assignments #2834

Closed

Conflict set is too small when solver enforces the single instance restriction #2843

Closed

23Skidoo deleted the solver branch April 23, 2016 14:15

23Skidoo mentioned this pull request Apr 23, 2016

Port solver fixes from master into 1.24 #3373

Merged

grayjay mentioned this pull request Apr 24, 2016

[nix-local-build] unbounded-delays Setup on GHC 8.0rc2 causing problems for nix-local-build #3248

Closed

grayjay mentioned this pull request May 6, 2016

Allow "cycles" through test suite dependencies #3402

Closed

grayjay mentioned this pull request Sep 5, 2016

Sandbox targets can prevent reinstallation of packages with setup dependencies #3203

Closed

Fix construction of conflict sets once and for all (WIP) #3357

Fix construction of conflict sets once and for all (WIP) #3357

Uh oh!

Conversation

edsko commented Apr 17, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

grayjay commented Apr 18, 2016

Uh oh!

edsko commented Apr 18, 2016

Uh oh!

kosmikus commented Apr 18, 2016

Uh oh!

kosmikus commented Apr 18, 2016

Uh oh!

23Skidoo commented Apr 18, 2016

Uh oh!

kosmikus commented Apr 18, 2016

Uh oh!

23Skidoo commented Apr 18, 2016

Uh oh!

kosmikus commented Apr 18, 2016

Uh oh!

hvr commented Apr 18, 2016

Uh oh!

kosmikus commented Apr 22, 2016

Uh oh!

23Skidoo commented Apr 22, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kosmikus commented Apr 22, 2016 • edited by edsko Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ezyang commented Apr 22, 2016

Uh oh!

grayjay commented Apr 23, 2016

Uh oh!

edsko commented Apr 23, 2016

Uh oh!

grayjay commented Apr 23, 2016

Uh oh!

kosmikus commented Apr 23, 2016

Uh oh!

23Skidoo commented Apr 23, 2016

Uh oh!

23Skidoo commented Apr 23, 2016

Uh oh!

grayjay commented Apr 23, 2016

Uh oh!

kosmikus commented Apr 25, 2016

Uh oh!

23Skidoo commented Apr 26, 2016

Uh oh!

edsko commented May 7, 2016

Uh oh!

Uh oh!

edsko commented Apr 17, 2016 •

edited

Loading

23Skidoo commented Apr 22, 2016 •

edited

Loading

kosmikus commented Apr 22, 2016 •

edited by edsko

Loading