[Feature] RFC: Package Variants #2751

Open
@arcanis

Describe the user story

Various packages want to ship their releases under different forms. For example:

  • Native packages often offer win32/osx/linux prebuilt packages.
  • Source packages can be downloaded from the repository's tag.
  • ESM or CJS versions of a package.
  • With / without documentation.

They currently achieve this through third-party tools such as node-pre-gyp and prebuild-install, or through manual downloads. Those solutions aren't ideal since:

  • They are a lot of code that should in theory "belong" to the package manager
  • They don't always leverage the package manager project configuration
  • They may involve running installs while an install is already in progress, thus risking file corruption
  • They don't integrate with the package manager caches, meaning they have to roll their own
  • They are difficult to interact with (the dependent package cannot force the use of a specific variant)
  • They're typically only meant for the native use case

Describe the solution you'd like

Any design has to fit some goals:

  • It should allow different parts of the dependency tree to request different flavors of the same package. Our dependency trees are currently context-free, and any solution should keep them that way.
  • It should degrade gracefully. Older package managers should all do something acceptable when faced with packages providing variants.

Additionally, I believe the following goals should also be followed:

  • It shouldn't be Turing-complete. More complex support can be added if the need arises, but for the first iteration I'd start with a statically analyzable format and go from there.
  • It should integrate with the cache. In particular, it should be possible to cache multiple variants of a package based on a global setting (otherwise caches would be different on Linux vs Windows).

The syntax I currently have in mind is based on the matrix feature of GitHub Actions. Specifically, the package author defines a list of parameter combinations the package offers variants for, and a "package name pattern":

"variants": {
    "pattern": "@prisma/prisma-build-%platform-%napi",
    "include": [{
        "platform": "win32",
        "napi": "6"
    }, {
        "platform": "win32",
        "napi": "4"
    }, {
        "platform": "darwin",
        "napi": "6"
    }, {
        "platform": "darwin",
        "napi": "4"
    }]
}

The package manager would then detect this variants field and find the first entry from the include set that matches the runtime parameters (runtime parameters would by default be a static list, but users could add their own; more on that later). For example, it would resolve to platform:win32 and napi:4. It would then turn that into @prisma/prisma-build-win32-4 (based on the provided pattern), and finally fetch it and use it as if the user had listed "prisma": "npm:@prisma/prisma-build-win32-4@<version>" from the start.
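
The resolution step described above can be sketched as follows. This is only an illustration under stated assumptions: the type and function names (ParameterSet, VariantsField, resolveVariant, and so on) are hypothetical and not part of any existing package manager API, and the sketch assumes a candidate matches when every parameter it lists equals the runtime value.

```typescript
// Hypothetical types: the RFC describes the JSON shape only, not an API.
type ParameterSet = Record<string, string>;

interface VariantsField {
  pattern: string;
  include: ParameterSet[];
}

// A candidate matches when every parameter it lists equals the runtime value.
function matchesRuntime(candidate: ParameterSet, runtime: ParameterSet): boolean {
  return Object.keys(candidate).every(name => runtime[name] === candidate[name]);
}

// Substitute each %name placeholder; unknown placeholders are left untouched.
function applyPattern(pattern: string, params: ParameterSet): string {
  return pattern.replace(/%([A-Za-z0-9_]+)/g, (placeholder: string, name: string) =>
    params[name] ?? placeholder
  );
}

// Returns the variant package name, or null to mean "use the current package".
function resolveVariant(variants: VariantsField, runtime: ParameterSet): string | null {
  for (const candidate of variants.include) {
    if (matchesRuntime(candidate, runtime)) {
      return applyPattern(variants.pattern, candidate);
    }
  }
  return null;
}

const prismaVariants: VariantsField = {
  pattern: "@prisma/prisma-build-%platform-%napi",
  include: [
    { platform: "win32", napi: "6" },
    { platform: "win32", napi: "4" },
  ],
};

// Resolves to "@prisma/prisma-build-win32-4"
console.log(resolveVariant(prismaVariants, { platform: "win32", napi: "4" }));
```

Returning null rather than throwing keeps the "no match" case cheap to handle: the caller simply keeps the package it already resolved.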

Note: The package name specified in pattern is required to define a scope, so as to avoid the security issues that could arise if a malicious actor were to register a package name covered by a combination of parameters for a legitimate package. While this could in theory be detected by the registry, requiring a scope circumvents the issue by enforcing that the variant packages belong to a specific owner.

To make declaring combinations simpler and avoid exponential verbosity, a matrix property would also be supported. The package manager would use it to generate the supported parameter sets by itself:

"variants": {
    "matrix": {
        "platform": {
            "candidates": [
                "darwin",
                "win32"
            ]
        },
        "napi": {
            "candidates": [
                "5",
                "6"
            ]
        }
    }
}

The include and exclude properties would thus only be used to declare special cases (here, to indicate that there's a wasm build, and that the package doesn't work on Win32 w/ NAPI 4):

"variants": {
    "pattern": "@prisma/prisma-build-%platform-%napi",
    "exclude": [{
        "platform": "win32",
        "napi": "4"
    }],
    "include": [{
        "platform": "wasm",
        "napi": null
    }]
}
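
To illustrate how matrix, exclude, and include could combine, here is a minimal sketch. The names (Params, Matrix, expandMatrix, supportedSets) are hypothetical, and two behaviors are assumptions the RFC does not settle: an exclude entry removes every generated set that agrees with it on all the keys it lists, and include entries are appended after the generated sets. The sample matrix uses NAPI candidates "4" and "6" so that the exclusion has something to remove.

```typescript
// Hypothetical types mirroring the JSON above; null stands for "not applicable".
type Params = Record<string, string | null>;
type Matrix = Record<string, { candidates: string[] }>;

interface MatrixVariants {
  matrix?: Matrix;
  include?: Params[];
  exclude?: Params[];
}

// Cartesian product of every candidate list declared in the matrix.
function expandMatrix(matrix: Matrix): Params[] {
  let sets: Params[] = [{}];
  for (const name of Object.keys(matrix)) {
    const next: Params[] = [];
    for (const partial of sets) {
      for (const candidate of matrix[name].candidates) {
        next.push({ ...partial, [name]: candidate });
      }
    }
    sets = next;
  }
  return sets;
}

// Assumption: an exclude entry removes every set agreeing with it on all listed keys.
function isExcluded(set: Params, exclude: Params[]): boolean {
  return exclude.some(entry => Object.keys(entry).every(name => set[name] === entry[name]));
}

// Generated sets minus exclusions, followed by the explicit include entries.
function supportedSets(variants: MatrixVariants): Params[] {
  const generated = variants.matrix ? expandMatrix(variants.matrix) : [];
  const exclude = variants.exclude ?? [];
  const kept = generated.filter(set => !isExcluded(set, exclude));
  return [...kept, ...(variants.include ?? [])];
}

const sets = supportedSets({
  matrix: {
    platform: { candidates: ["darwin", "win32"] },
    napi: { candidates: ["4", "6"] },
  },
  exclude: [{ platform: "win32", napi: "4" }],
  include: [{ platform: "wasm", napi: null }],
});

console.log(sets.length); // 4: darwin/4, darwin/6, win32/6, wasm
```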

Should the package manager be unable to find a supported parameter set, it would fall back to the current package (in other words, it would ignore the variants field). This is critical, because it's the "graceful degradation" feature I mentioned:

  • Library authors will be able to add the variants field to their already-existing packages without having to create a separate package name (that would hurt adoption).
  • Older package managers would fall back to the current package, so adding the field would not break them.

One note though: assuming that fallback packages are "fat" packages that try to build the code from source, then in order to keep the install slim for modern package managers the library authors will likely need to follow a pattern such as this one:

{
    "name": "prisma",
    "variants": {},
    "dependencies": {
        "prisma-fallback": "..."
    }
}

This would ensure that prisma-fallback would only be fetched if actually needed.

Custom parameters

Packages would be able to provide their own custom parameters:

"variants": {
    "pattern": "@lodash/lodash-%js",
    "include": [{
        "js": "esm"
    }, {
        "js": "cjs"
    }]
}

Those custom parameters would be expected to be set by the dependent via the dependencyMeta field:

{
    "dependencies": {
        "lodash": "..."
    },
    "dependencyMeta": {
        "lodash": {
            "parameters": {
                "js": "esm"
            }
        }
    }
}

Using * instead of lodash as the dependencyMeta key would also be supported as a shortcut ("all my dependencies should be esm").
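
As a rough sketch of how a package manager might read these parameters, the lookup below merges the * entry with the dependency-specific one. The DependencyMeta type and requestedParameters function are hypothetical names. The precedence rule (a specific entry overrides the * entry key by key) is an assumption that the proposal does not spell out, and "react" is just an arbitrary dependency name used for illustration.

```typescript
// Hypothetical shape of the proposed dependencyMeta field.
type DependencyMeta = Partial<Record<string, { parameters?: Record<string, string> }>>;

// Assumption: entries under "*" apply to every dependency, and a
// dependency-specific entry overrides them key by key.
function requestedParameters(meta: DependencyMeta, dependency: string): Record<string, string> {
  const wildcard = meta["*"]?.parameters ?? {};
  const specific = meta[dependency]?.parameters ?? {};
  return { ...wildcard, ...specific };
}

const meta: DependencyMeta = {
  "*": { parameters: { js: "esm" } },
  lodash: { parameters: { js: "cjs" } },
};

// lodash gets its own setting, any other dependency gets the "*" default.
console.log(requestedParameters(meta, "lodash").js); // "cjs"
console.log(requestedParameters(meta, "react").js); // "esm"
```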

Cache integration

The user would be able to define in their .yarnrc.yml which parameters should be cached, following the same format as the package parameters. Specifying this wouldn't be required (in which case only the packages covered by the local install would be cached, as one would expect):

cacheParameters:
  matrix:
    platform: [win32, darwin]
    napi: ["6", "4"]

Parameter cascade

A package could forward the parameters it was itself resolved with to its own dependencies. Such cascades would be explicitly denoted by the %inherit value:

{
    "name": "lodash",
    "dependencies": {
        "subLibrary": "..."
    },
    "dependencyMeta": {
        "subLibrary": {
            "parameters": {
                "js": "%inherit"
            }
        }
    }
}
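
One possible reading of %inherit, sketched below under assumptions: the function name cascadeParameters is hypothetical, "received" stands for the parameters the package itself was installed with, and a parameter marked %inherit with nothing to inherit is simply left unset (the proposal does not say what should happen in that case).

```typescript
const INHERIT = "%inherit";

// declared: parameters a package lists for one dependency in dependencyMeta.
// received: parameters that package itself was installed with.
function cascadeParameters(
  declared: Record<string, string>,
  received: Record<string, string>,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const name of Object.keys(declared)) {
    if (declared[name] !== INHERIT) {
      resolved[name] = declared[name];
    } else if (Object.prototype.hasOwnProperty.call(received, name)) {
      resolved[name] = received[name];
    }
    // Assumption: with nothing to inherit, the parameter is left unset.
  }
  return resolved;
}

// lodash was installed with js=esm and forwards that value to subLibrary.
console.log(cascadeParameters({ js: "%inherit" }, { js: "esm" }));
```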

Describe the drawbacks of your solution

It seems clear the main drawback is verbosity. The number of lines required may look suboptimal. Keep in mind however that I intentionally kept the code expanded; in practice the code would be shortened (one line per parameter, etc).

It requires publishing multiple variants of the same package to the registry. This is however the very reason for this feature to exist (i.e. not putting every artifact in the same downloaded package), so I don't see it as a drawback per se.

It also requires those variants to be synchronized in terms of version with the original package (i.e. if prisma is 1.2.3, then the prebuilt variant packages in the @prisma scope will have to be at 1.2.3 as well). This will be a matter of tooling (for example by making yarn npm publish accept a --name @prisma/prisma-prebuilt-%platform-%napi flag which would override the name originally declared in the manifest).

Describe alternatives you've considered

Another approach would be to simply leave it to userland. The problem is that userland doesn't have the right tools to interact with the package manager at the level it needs, which makes current solutions awkward at best.

We could also decrease the verbosity by using JS files instead of manifest metadata. I don't believe this would work because:

  • it would require fetching the package sources in order to know if we're going to need them (bandwidth waste)
  • it would require executing a postinstall script (which we tend to discourage, and may disable by default down the road)
  • it would prevent static analysis of the dependencies (which I think is one of the main advantages of package.json)
  • it would ultimately only move the verbosity from the manifest to those scripts

Labels: enhancement (New feature or request), rfc (This issue is an RFC.)
