Skip to content

Upgrade the version solver's algorithm #1758

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 10, 2018
Merged

Conversation

nex3
Copy link
Member

@nex3 nex3 commented Dec 8, 2017

This introduces a new algorithm I call "Pubgrub" which is based on
cutting-edge techniques for NP-hard search problems. It should
substantially reduce the number of cases where the version solver goes
exponential, and make it much easier to provide thorough error
messages when no solution is available.

This commit adds the core of the algorithm, but it's not yet feature-
complete with the old version solver. I intend to add support for
features like locked dependencies, downgrading, and so on in follow-up
CLs. It also doesn't have any kind of error handling yet. As such,
pub's test suite is not expected to pass.

See #912

@nex3 nex3 requested a review from munificent December 8, 2017 00:18
@nex3
Copy link
Member Author

nex3 commented Dec 8, 2017

I highly recommend reading the documentation before diving into the code.

@nex3 nex3 force-pushed the feature.solver.core branch from e891bfb to fb86625 Compare December 8, 2017 00:21
This introduces a new algorithm I call "Pubgrub" which is based on
cutting-edge techniques for NP-hard search problems. It should
substantially reduce the number of cases where the version solver goes
exponential, and make it much easier to provide thorough error
messages when no solution is available.

This commit adds the core of the algorithm, but it's not yet feature-
complete with the old version solver. I intend to add support for
features like locked dependencies, downgrading, and so on in follow-up
CLs. It also doesn't have any kind of error handling yet. As such,
pub's test suite is not expected to pass.

See #912
@nex3 nex3 force-pushed the feature.solver.core branch from fb86625 to 6404dd5 Compare December 13, 2017 03:53
doc/solver.md Outdated
[conflict resolution](#conflict-resolution), to produce `{not foo ^1.0.0,
foo >=2.0.0, bar any ∪ not bar ^1.0.0}`. Since `not not bar ^1.0.0 = bar ^1.0.0`
satisfies `bar any`, this simplifies to `{not foo ^1.0.0, foo >=2.0.0}`. We can
then take the intersection of the two `foo` terms to get `{foo >=2.0.0}`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated
conflict resolution has no root cause to find and just backtracks to decision
level 3, where it can make a new derivation:

| 19 | `not shared ^1.0.0` | derivation | unit propagation | step 18 | 3 |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not formatted as a table

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated

There are two sources of incompatibilities. They may represent facts about
packages—for example, "`foo ^1.0.0` depends on `bar ^2.0.0`" is represented as
the incompatibility `{not foo ^1.0.0, bar ^2.0.0}`, while "`foo <1.3.0` has an
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the first example the not seems to be on the other side. Shouldn't it be {foo ^1.0.0, not bar ^2.0.0} ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@munificent munificent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of minor suggestions. I won't pretend to understand any but a fraction of this, but let's land it and go forward. Fantastic docs. :)

doc/solver.md Outdated

At a high level, Pubgrub works like many other search algorithms. Its core loop
involves speculatively chooses package versions that match outstanding
dependencies in a loop. Eventually one of two things happens:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to:

Its core loop involves speculatively choosing package versions that match outstanding dependencies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated
When a conflict is found, Pubgrub backtracks to the package that caused the
conflict and chooses a different version. However, unlike many search
algorithms, it also records the root cause of that conflict. This is the
"conflict-driven clause learning" that lends CDCl its name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"CDCl" -> "CDCL".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated

We say that a term `t` is "satisfied" by a set of terms `S` if `t` must be true
whenever every term in `S` is true. Conversely, `t` is "contradicted" by `S` if
it must be false whenever every term in `S` is true. If neither of these is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These first two sentences might be easier to read if you put them in active voice like the examples below. So "t is satisfied by S" -> "S satisfies t" and ditto for contradicts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated

* `foo ^1.0.0 ∪ foo ^2.0.0` is `foo >=1.0.0 <3.0.0`.
* `foo >=1.0.0 ∩ not foo >=2.0.0` is `foo ^1.0.0`.
* `foo ^1.0.0 \ foo ^1.5.0` is `foo >=1.0.0 <1.5.0`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is \ set subtraction? It would be good to explain what each operator is for those not familiar with the syntax (and especially because different fields tend to use different notation for the same operation).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link to the standard.

doc/solver.md Outdated

## Incompatibility

An incompatibility is a set of terms that cannot all be true—that is, at least
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me quite a while to understand this. I thought "incompatibility" meant that the set itself indicated some sort of failure, and not that the incompatibility was a goal.

Maybe say:

An incompatibility is a set of terms that are not all allowed to be true. An incompatibility considers a set of package versions to be a valid solution if at least one of the incompatibility's terms is false. Incompatibilities are context-independent, meaning that their terms are mutually incompatible regardless of which versions are selected at any given point in time.

?

The name "incompatibility" is a little odd because it sounds like a solver failure. It's basically a NAND operation, right? What do you think of "refutation" or "exclusion"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intuition here is that they represent states of the package graph that may be true independently but are mutually incompatible with one another. For example, the incompatibility {foo ^1.0.0, bar ^2.0.0} means that foo ^1.0.0 isn't compatible with bar ^2.0.0. In a sense, it is a failure, because a partial solution that satisfies an incompatibility is a failed solution.

I understand why this may be kind of unintuitive, but my sense is that we're operating at a level of abstraction (not to mention layered negations) where no name will perfectly capture what's going on. I'll see if I can adapt your suggested wording to make it clearer, though.

What do you think of "refutation" or "exclusion"?

"Refutation" generally refers to a proof that a statement is false, and this is too close to proof theory for me to want to step on the toes of that terminology. "Exclusion" is better, but I don't know that it communicates as well that only the terms in conjunction are a problem, and that they may (and probably will) be fine in isolation.

The textbook calls them "nogoods", which we could also use, but I'm not a big fan. For one thing, it's not really a word, and I don't think it provides any real intuition about what's going on.

}).toList();
}

/// Returns a version bound indicating where in [versions] this package
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[versions] -> [_versions].

Below too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/// [index].
///
/// If a package is absent from the return value, that indicates indicate that
/// all versions above or below [index] (according to [upper]) have the same dependency.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long line, I think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

///
/// If multiple subsequent versions of this package have the same
/// dependencies, this will return incompatibilities that reflect that. It
/// won't return incompatibilities that have already been returned by a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name reads as stateless to me, so it was surprising that it produces different results on each call. Maybe nextIncompatibilitiesFor()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in sort of a gray area. It's not strictly stateless in the sense that multiple calls with the same input won't produce the same output, but it's not exactly stateful in the sense that different calls don't affect the semantics of what's being returned. The caller doesn't get any new information from calling this again with the same id (which is why I don't love "next" as a descriptor), and the fact that the old information isn't repeated is kind of an implementation detail.

/// assignments made after that level.
void backtrack(int decisionLevel) {
// When we start backtracking, count an additional attempted solution. If we
// backtrack multiple times in a row, though, we only want to count one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified.

// conflicts.
for (var incompatibility in _incompatibilities[package].reversed) {
var result = _propagateIncompatibility(incompatibility);
if (result == #conflict) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A symbol literal?

I think this would be clearer if it returned some enum-ish class that contained the string when that case is matched. Also, document when it may return a string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be clearer if it returned some enum-ish class that contained the string when that case is matched.

Making this type-safe is sooooo much boilerplate for one private function 😩. It takes three separate classes to define a type that has both enum members and members that have values. I'll do it if you really think it's important, but I'd much rather be terse if possible.

My kingdom for algebraic structs.

Also, document when it may return a string.

Done.

This ensures that we don't have any line breaks in the middle of
inline code, which causes GitHub's markdown renderer to choke.
Copy link
Member Author

@nex3 nex3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

doc/solver.md Outdated

At a high level, Pubgrub works like many other search algorithms. Its core loop
involves speculatively chooses package versions that match outstanding
dependencies in a loop. Eventually one of two things happens:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated
When a conflict is found, Pubgrub backtracks to the package that caused the
conflict and chooses a different version. However, unlike many search
algorithms, it also records the root cause of that conflict. This is the
"conflict-driven clause learning" that lends CDCl its name.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated

We say that a term `t` is "satisfied" by a set of terms `S` if `t` must be true
whenever every term in `S` is true. Conversely, `t` is "contradicted" by `S` if
it must be false whenever every term in `S` is true. If neither of these is
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc/solver.md Outdated

* `foo ^1.0.0 ∪ foo ^2.0.0` is `foo >=1.0.0 <3.0.0`.
* `foo >=1.0.0 ∩ not foo >=2.0.0` is `foo ^1.0.0`.
* `foo ^1.0.0 \ foo ^1.5.0` is `foo >=1.0.0 <1.5.0`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link to the standard.

doc/solver.md Outdated

## Incompatibility

An incompatibility is a set of terms that cannot all be true—that is, at least
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intuition here is that they represent states of the package graph that may be true independently but are mutually incompatible with one another. For example, the incompatibility {foo ^1.0.0, bar ^2.0.0} means that foo ^1.0.0 isn't compatible with bar ^2.0.0. In a sense, it is a failure, because a partial solution that satisfies an incompatibility is a failed solution.

I understand why this may be kind of unintuitive, but my sense is that we're operating at a level of abstraction (not to mention layered negations) where no name will perfectly capture what's going on. I'll see if I can adapt your suggested wording to make it clearer, though.

What do you think of "refutation" or "exclusion"?

"Refutation" generally refers to a proof that a statement is false, and this is too close to proof theory for me to want to step on the toes of that terminology. "Exclusion" is better, but I don't know that it communicates as well that only the terms in conjunction are a problem, and that they may (and probably will) be fine in isolation.

The textbook calls them "nogoods", which we could also use, but I'm not a big fan. For one thing, it's not really a word, and I don't think it provides any real intuition about what's going on.

}).toList();
}

/// Returns a version bound indicating where in [versions] this package
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

///
/// If [inclusive] is `true`, this returns the last version in [versions] with
/// an incompatible SDK constraint. If it's `false`, it returns the
/// first version in [versions] with an compatible SDK constraint.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/// [index].
///
/// If a package is absent from the return value, that indicates indicate that
/// all versions above or below [index] (according to [upper]) have the same dependency.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/// assignments made after that level.
void backtrack(int decisionLevel) {
// When we start backtracking, count an additional attempted solution. If we
// backtrack multiple times in a row, though, we only want to count one.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified.

// conflicts.
for (var incompatibility in _incompatibilities[package].reversed) {
var result = _propagateIncompatibility(incompatibility);
if (result == #conflict) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be clearer if it returned some enum-ish class that contained the string when that case is matched.

Making this type-safe is sooooo much boilerplate for one private function 😩. It takes three separate classes to define a type that has both enum members and members that have values. I'll do it if you really think it's important, but I'd much rather be terse if possible.

My kingdom for algebraic structs.

Also, document when it may return a string.

Done.

@nex3 nex3 force-pushed the feature.solver.core branch from 8392779 to 5c5d9e4 Compare January 10, 2018 22:12
Copy link
Member

@munificent munificent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@nex3 nex3 merged commit a7df959 into feature.solver Jan 10, 2018
@nex3 nex3 deleted the feature.solver.core branch January 10, 2018 22:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants