Count conflicts to speed up solver #3513

kosmikus · 2016-06-30T11:12:16Z

This change introduces a new flag --count-conflicts (enabled by default, can be disabled by saying --no-count-conflicts).

The idea is simple: while solving, we maintain a table that counts how often each goal (roughly: each package) has been involved in a conflict; whenever we make choices about what goal to solve next, we consult the table and prefer goals that have been involved in conflicts most often. Why is that good? Because it means the solver will try hard to spend time solving "difficult goals" that are most likely to cut off large amounts of the search space.

All tests for this have been immensely promising. Unlike the already existing --reorder-goals, using --count-conflicts has hardly any overhead. While there are some packages where --count-conflicts helps more and some where it helps less, it barely ever slows down solving. It barely ever affects the install plan being found (although in theory, it can, because changing the goal order can always lead to other install plans). It quite often leads to dramatic speedups though.

The universe package mentioned by @phadej in #3453 is one such example, were solver time goes from 160 to 2 seconds.

All in all, I'm confident enough about this to immediately make it the default.

The original problem in #3453 is unfortunately not solved by this, but I believe --count-conflicts is still an important step to solving that scenario too. The remaining problem is that some things interact badly with --count-conflicts:

Some of the heuristics in the solver also interact with the goal search order. In particular, there is a heuristic that "defers" solving certain flags, which means they're removed from the goal set. Once they're removed, they can no longer be preferred, even if they're at the core of a complex conflict. The solution is to make deferral less absolute, which is a next step.
Using --reorder-goals together with --count-conflicts is not recommended. Both strategies try to have the final word on the goal order, and they don't combine well. So one should use one or the other. I haven't seen many (any?) cases so far where --reorder-goals was better than --count-conflicts.

kosmikus · 2016-06-30T11:12:33Z

/cc @grayjay @edsko @dcoutts @23Skidoo

phadej · 2016-06-30T12:29:57Z

Seems that tests need updating.

kosmikus · 2016-07-02T13:54:22Z

Rebased.

grayjay · 2016-07-03T03:20:29Z

I think this is a great improvement, and I agree that it should be the default.

I was wondering if there might be a better way to handle the ConflictMap state. I spent some time trying to understand the performance implications of adding a state monad to tree traversal, because I made a similar change in #2917. It seems like a space leak can easily develop when there is a pair holding a large data structure (solver log) and the state calculated from traversing that data structure. (The program can retain the pair and the whole data structure when it references the final state.)

I experimented with instead putting the state at the end of the log: grayjay@e7b2c3c. I think that the space behavior is more predictable, but it's not as clean. The state monad may be the best solution. I just wanted to put another idea out there.

I also looked at the change in the output of the solver unit tests, to check for tests that I missed in #3510. The only test that showed a change in goal order was indepGoals5. I could make a PR to freeze the goal order for that test, as well as indepGoals6, which looks similar.

Test log diff:

-  UnitTests.Distribution.Solver.Modular.Solver
-    Independent goals
 targets: X, Y
 constraints: 
-      indepGoals5:   stanzas X test (unknown source)
+  UnitTests.Distribution.Solver.Modular.Solver
+  stanzas X test (unknown source)
   stanzas Y test (unknown source)
   stanzas A test (unknown source)
+    Independent goals
   stanzas A test (unknown source)
   stanzas B test (unknown source)
   stanzas C test (unknown source)
-  stanzas C test (unknown source)
+      indepGoals5:   stanzas C test (unknown source)
 preferences: 
 strategy: PreferLatestForSelected
 reorder goals: ReorderGoals False
+count conflicts: CountConflicts True
 independent goals: IndependentGoals True
 avoid reinstalls: AvoidReinstalls False
 shadow packages: ShadowPkgs False
 strong flags: StrongFlags False
 max backjumps: infinite
 [__0] trying: 0.X-1.0.0 (user goal)
 [__1] trying: 0.A-2.0.0 (dependency of 0.X-1.0.0)
 [__2] trying: 0.B-1.0.0 (dependency of 0.A-2.0.0)
 [__3] trying: 0.C-2.0.0 (dependency of 0.X-1.0.0)
 [__4] trying: 1.Y-1.0.0 (user goal)
 [__5] trying: 1.A~>0.A-2.0.0 (dependency of 1.Y-1.0.0)
 [__6] trying: 1.B~>0.B-1.0.0 (dependency of 1.A-2.0.0)
 [__7] next goal: 1.C (dependency of 1.Y-1.0.0)
 [__7] rejecting: 1.C~>0.C-2.0.0 (conflict: 1.Y => 1.C==1.0.0)
 [__7] rejecting: 1.C-2.0.0 (multiple instances)
 [__7] rejecting: 1.C-1.0.0 (dependencies not linked: cannot make 1.C canonical member of {*0.C-2.0.0,1.C-2.0.0}; conflict set: 0.C, 1.B, 1.C)
 [__7] fail (backjumping, conflict set: 0.C, 1.B, 1.C, 1.Y)
 [__6] rejecting: 1.B-1.0.0 (multiple instances)
 [__6] fail (backjumping, conflict set: 0.B, 0.C, 1.A, 1.B, 1.C, 1.Y)
 [__5] rejecting: 1.A-2.0.0 (multiple instances)
 [__5] rejecting: 1.A-1.0.0 (conflict: 1.Y => 1.A==2.0.0)
 [__4] fail (backjumping, conflict set: 0.A, 0.B, 0.C, 1.A, 1.B, 1.C, 1.Y)
 [__3] rejecting: 0.C-1.0.0 (conflict: 0.X => 0.C==2.0.0)
 [__2] fail (backjumping, conflict set: 0.A, 0.B, 0.C, 0.X, 1.A, 1.B, 1.C, 1.Y)
 [__1] trying: 0.A-1.0.0
 [__2] trying: 0.C-2.0.0 (dependency of 0.X-1.0.0)
 [__3] trying: 1.Y-1.0.0 (user goal)
-[__4] next goal: 1.A (dependency of 1.Y-1.0.0)
-[__4] rejecting: 1.A~>0.A-1.0.0 (conflict: 1.Y => 1.A==2.0.0)
-[__4] trying: 1.A-2.0.0
-[__5] trying: 1.B-1.0.0 (dependency of 1.A-2.0.0)
-[__6] next goal: 1.C (dependency of 1.Y-1.0.0)
-[__6] rejecting: 1.C~>0.C-2.0.0 (conflict: 1.Y => 1.C==1.0.0)
-[__6] rejecting: 1.C-2.0.0 (multiple instances)
-[__6] trying: 1.C-1.0.0
+[__4] next goal: 1.C (dependency of 1.Y-1.0.0)
+[__4] rejecting: 1.C~>0.C-2.0.0 (conflict: 1.Y => 1.C==1.0.0)
+[__4] rejecting: 1.C-2.0.0 (multiple instances)
+[__4] trying: 1.C-1.0.0
+[__5] next goal: 1.A (dependency of 1.Y-1.0.0)
+[__5] rejecting: 1.A~>0.A-1.0.0 (conflict: 1.Y => 1.A==2.0.0)
+[__5] trying: 1.A-2.0.0
+[__6] next goal: 1.B (dependency of 1.A-2.0.0)
+[__6] trying: 1.B-1.0.0
 [__7] done
 OK

I'll make some comments on the code.

grayjay · 2016-07-03T03:21:41Z

cabal-install/Distribution/Client/Setup.hs

@@ -697,6 +701,7 @@ defaultFreezeFlags = FreezeFlags {
    freezeSolver           = Flag defaultSolver,
    freezeMaxBackjumps     = Flag defaultMaxBackjumps,
    freezeReorderGoals     = Flag (ReorderGoals False),
+    freezeCountConflicts   = Flag (CountConflicts False),


Shouldn't fetch and freeze use the same default solver parameters as install?

(It already was the default for install.)

grayjay · 2016-07-04T03:15:24Z

@kosmikus Thanks for adding the comments.

I made a PR to handle the unit test that was affected by this change: #3517

edsko · 2016-07-04T03:26:24Z

I haven't followed the details of this thread, but FWIW:

I was wondering if there might be a better way to handle the ConflictMap state. I spent some time trying to understand the performance implications of adding a state monad to tree traversal, because I made a similar change in #2917. It seems like a space leak can easily develop when there is a pair holding a large data structure (solver log) and the state calculated from traversing that data structure. (The program can retain the pair and the whole data structure when it references the final state.)

I experimented with instead putting the state at the end of the log: grayjay/cabal@e7b2c3c. I think that the space behavior is more predictable, but it's not as clean. The state monad may be the best solution. I just wanted to put another idea out there.

As it happens I have experimented a lot with this sort of thing recently; have a half written blog post and a lot of experimental data that I have no time to put into some kind of coherent state :/ Anyway, in my opinion, putting the state at the end of the log both the most predictable and in fact the cleanest solution. It is very very easy to make mistakes and have memory blow up when using a monadic approach with this sort of thing.

kosmikus · 2016-07-04T07:39:22Z

@grayjay Had another look at your way to treat the conflict map, and I agree with @edsko that it's actually better to do it this way. So I cherry-picked your change and added it to this PR.

grayjay · 2016-07-04T19:07:19Z

LGTM

kosmikus · 2016-07-04T19:58:57Z

Ok. Merged.

@23Skidoo This one's probably tricky to merge into 1.24 because of the solver module reorganisation. On the other hand, it is a significant speedup for many situations, so depending on how far off 1.26 is, I might be willing to do the work if you'd be willing to put it in 1.24 as well.

23Skidoo · 2016-07-05T22:06:36Z

@kosmikus I want to release 1.26 before the end of the year, likely in November/December.

kosmikus added cabal-install: solver cabal-install: other labels Jun 30, 2016

kosmikus added 5 commits July 2, 2016 15:09

Count conflicts and prefer goals that contribute to conflicts.

fba4b25

Add flag to selectively disable count-conflicts.

84727b7

Switch to using state monad.

a17883c

Better way to prevent space leak for --no-count-conflicts.

29de7db

Fix tests.

0c12435

kosmikus force-pushed the count-conflicts-2 branch from 1dfa8ba to 0c12435 Compare July 2, 2016 13:53

grayjay reviewed Jul 3, 2016
View reviewed changes

kosmikus added 3 commits July 3, 2016 07:19

Remove unused language pragma.

221c309

Make --count-conflicts the default for fetch/freeze.

eed12c6

(It already was the default for install.)

Extend comments on goal choice heuristics.

7393f38

grayjay and others added 2 commits July 4, 2016 09:15

Remove state monad, and store state at the end of the log.

7b21703

Cleaning up.

45c38b9

kosmikus merged commit e08d602 into haskell:master Jul 4, 2016

grayjay mentioned this pull request Feb 5, 2017

Solver is often stumped by restrictive upper bounds #1783

Closed

grayjay mentioned this pull request May 2, 2023

Consider removing the "count conflicts" solver feature, since it may be redundant with "fine grained conflicts" #8931

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Count conflicts to speed up solver #3513

Count conflicts to speed up solver #3513

Uh oh!

kosmikus commented Jun 30, 2016 •

edited

Loading

Uh oh!

kosmikus commented Jun 30, 2016

Uh oh!

phadej commented Jun 30, 2016

Uh oh!

kosmikus commented Jul 2, 2016

Uh oh!

grayjay commented Jul 3, 2016

Uh oh!

grayjay Jul 3, 2016

Uh oh!

grayjay commented Jul 4, 2016

Uh oh!

edsko commented Jul 4, 2016

Uh oh!

kosmikus commented Jul 4, 2016

Uh oh!

grayjay commented Jul 4, 2016

Uh oh!

kosmikus commented Jul 4, 2016

Uh oh!

23Skidoo commented Jul 5, 2016

Uh oh!

Uh oh!

Count conflicts to speed up solver #3513

Count conflicts to speed up solver #3513

Uh oh!

Conversation

kosmikus commented Jun 30, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kosmikus commented Jun 30, 2016

Uh oh!

phadej commented Jun 30, 2016

Uh oh!

kosmikus commented Jul 2, 2016

Uh oh!

grayjay commented Jul 3, 2016

Uh oh!

grayjay Jul 3, 2016

Choose a reason for hiding this comment

Uh oh!

grayjay commented Jul 4, 2016

Uh oh!

edsko commented Jul 4, 2016

Uh oh!

kosmikus commented Jul 4, 2016

Uh oh!

grayjay commented Jul 4, 2016

Uh oh!

kosmikus commented Jul 4, 2016

Uh oh!

23Skidoo commented Jul 5, 2016

Uh oh!

Uh oh!

kosmikus commented Jun 30, 2016 •

edited

Loading