Skip to content

[BUG] v7 hidden lockfile/node_modules inconsistency, duplicated packages #1597

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
isaacs opened this issue Jul 31, 2020 · 2 comments
Closed

Comments

@isaacs
Copy link
Contributor

isaacs commented Jul 31, 2020

Repro script:

# run in npm/cli repo
set -x
rm -rf node_modules package-lock.json
git checkout node_modules
rm node_modules/.gitignore node_modules/.package-lock.json
node . i --no-audit --ignore-scripts
node . ls @babel/core # apparently ok, only bundled under tap
rm node_modules/.package-lock.json # remove the hidden lockfile
node . ls @babel/core # ohno! babel cores all over the place!
  1. Why is the hidden lockfile being respected, when there are clearly package folders not found in its list?
  2. Why is Arborist installing @babel/core v7.11 when the only thing depending on it has a copy from the bundle?
@isaacs
Copy link
Contributor Author

isaacs commented Jul 31, 2020

It looks like they're getting put in the hidden lockfile, but not marked as extraneous (even though they should be). But, since they're not actually reachable via the edges, they don't show up in the list. So ls.js is doing the right thing, and printing what arborist gives it.

It looks like it's placing those deps in the buildIdealTree process, but it shouldn't be, because they are deps of a bundled dep.

I'm thinking it's because tap depends on something it's not bundling that depends on @babel/core (via istanbul or jsx, I'm guessing?). So it gets placed, but then later overridden by the bundled copy (from ink), and it doesn't note that the original one needs to be re-evaluated for extraneousness.

isaacs added a commit that referenced this issue Jul 31, 2020
The `tap` package bundles `import-jsx`, which in turn bundles
`@babel/core` and a bunch of other stuff.

However, even though `@babel/core` is in the bundle, it's getting placed
in the tree at the root.  This is a mistake.

Because the node is placed, and not given the `extraneous` flag, in the
ideal tree building process, when reified, it goes ahead and puts it in
place.  When reading the actual tree, we see a valid hidden lockfile,
which contains the module, and it doens't have the extraneous flag.  So
everything seems fine.

Remove the hidden lockfile, and the problem presents itself.

Corrected here through some arduous and painstaking manual installs, so
at least we have the tree in the state it *should* be for now.  But this
is a bug in arborist that must be fixed asap.

Re: #1597
isaacs added a commit that referenced this issue Jul 31, 2020
The `tap` package bundles `import-jsx`, which in turn bundles
`@babel/core` and a bunch of other stuff.

However, even though `@babel/core` is in the bundle, it's getting placed
in the tree at the root.  This is a mistake.

Because the node is placed, and not given the `extraneous` flag, in the
ideal tree building process, when reified, it goes ahead and puts it in
place.  When reading the actual tree, we see a valid hidden lockfile,
which contains the module, and it doens't have the extraneous flag.  So
everything seems fine.

Remove the hidden lockfile, and the problem presents itself.

Corrected here through some arduous and painstaking manual installs, so
at least we have the tree in the state it *should* be for now.  But this
is a bug in arborist that must be fixed asap.

Re: #1597
@isaacs
Copy link
Contributor Author

isaacs commented Aug 2, 2020

So here's what's happening.

  • A package x depends on y and z. y is in bundleDependencies.
  • y depends on z, so it's also in the bundle, but not listed in bundleDependencies.
  • When x is installed, the buildIdealTree process skips trying to resolve z, because it's in bundleDependencies. However, it still tries to place z, because until the tarball is cracked open, it doesn't know that it's already in there.
  • Because it's not extraneous at the time of creating the idealTree (and that's what serves as the source of truth for diffs and saved to the package locks, including the hidden lockfile in node_modules/.package-lock.json), it's not marked extraneous.
  • If we have a hidden lockfile, we don't re-evaluate dependency flags when loading the actual tree. (That's kind of the point, after all, to save added work!)
  • As a result, npm ls doesn't see that there are any extraneous deps, but when it walks the graph of edgesOut, it also never encounters those nodes, so it doesn't print them.
  • Remove the node_modules/.package-lock.json file, and voila, extraneous deps appear.

This doesn't happen if you run npm i --package-lock-only before running npm i, because --package-lock-only requires a buildIdealTree({complete:true}) run. In that case, buildIdealTree cracks open the tarball in a temp dir, and builds the complete tree from the bundling node, and only resolves edges that are missing (ie, not bundled). And if there's a package lock, we use that as the idealTree, so everything stays proper.

Possible fix approaches:

  1. Always have buildIdealTree crack open tarballs to read bundles, if present. (Ie, get rid of the "only unpack once" shortcut trick we do in reify with bundling deps). Simplest, correct in all cases, least efficient.
  2. After reify() unpacks bundles (which it does ahead of reifying anything else), have it scan the tree for any nodes that have the same name as something in the bundle, and ensure we can get a path back to the root from them. If not found, then remove them from the tree. Pretty complicated, but should also be correct in all cases, and while it does add some extra compute work, it's not fs or networking work at least. We could make it somewhat efficient by doing something like the following:
    • mark all such nodes as extraneous, and put them in a list
    • remove any nodes with an edgesIn from a non-extraneous node
    • repeat until list is empty or no changes are made
    • remove all remaining nodes from the tree
  3. Never place deps above a bundling node when building the ideal tree. This is inefficient in disk space, and could end up still producing dupes (eg, if the bundled y in the example above was installed using --legacy-bundling or --global-style, so that z would show up once in the bundle under x/nm/y/nm/z and another one newly placed at x/nm/z.)

Note: there is a call to _idealTreePrune when we load the bundles in reify(), but since the nodes didn't get flagged as extraneous (since they weren't extraneous when placed!), it doesn't find them. There's even a TODO comment saying to find a better way to do this, so I guess now's the time.

isaacs added a commit that referenced this issue Aug 2, 2020
isaacs added a commit that referenced this issue Aug 3, 2020
isaacs added a commit that referenced this issue Aug 4, 2020
The `tap` package bundles `import-jsx`, which in turn bundles
`@babel/core` and a bunch of other stuff.

However, even though `@babel/core` is in the bundle, it's getting placed
in the tree at the root.  This is a mistake.

Because the node is placed, and not given the `extraneous` flag, in the
ideal tree building process, when reified, it goes ahead and puts it in
place.  When reading the actual tree, we see a valid hidden lockfile,
which contains the module, and it doens't have the extraneous flag.  So
everything seems fine.

Remove the hidden lockfile, and the problem presents itself.

Corrected here through some arduous and painstaking manual installs, so
at least we have the tree in the state it *should* be for now.  But this
is a bug in arborist that must be fixed asap.

Re: #1597
isaacs added a commit that referenced this issue Aug 4, 2020
isaacs added a commit that referenced this issue Aug 4, 2020
The `tap` package bundles `import-jsx`, which in turn bundles
`@babel/core` and a bunch of other stuff.

However, even though `@babel/core` is in the bundle, it's getting placed
in the tree at the root.  This is a mistake.

Because the node is placed, and not given the `extraneous` flag, in the
ideal tree building process, when reified, it goes ahead and puts it in
place.  When reading the actual tree, we see a valid hidden lockfile,
which contains the module, and it doens't have the extraneous flag.  So
everything seems fine.

Remove the hidden lockfile, and the problem presents itself.

Corrected here through some arduous and painstaking manual installs, so
at least we have the tree in the state it *should* be for now.  But this
is a bug in arborist that must be fixed asap.

Re: #1597
isaacs added a commit that referenced this issue Aug 4, 2020
isaacs added a commit that referenced this issue Aug 4, 2020
The `tap` package bundles `import-jsx`, which in turn bundles
`@babel/core` and a bunch of other stuff.

However, even though `@babel/core` is in the bundle, it's getting placed
in the tree at the root.  This is a mistake.

Because the node is placed, and not given the `extraneous` flag, in the
ideal tree building process, when reified, it goes ahead and puts it in
place.  When reading the actual tree, we see a valid hidden lockfile,
which contains the module, and it doens't have the extraneous flag.  So
everything seems fine.

Remove the hidden lockfile, and the problem presents itself.

Corrected here through some arduous and painstaking manual installs, so
at least we have the tree in the state it *should* be for now.  But this
is a bug in arborist that must be fixed asap.

Re: #1597
isaacs added a commit that referenced this issue Aug 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant