Technically, README files are not supposed to contain relative URLs, because the README file itself doesn't have its own base URL after being published.
In practice it's usually expected that relative URLs in READMEs will be rewritten the same way GitHub does it, but that: a) depends on an implementation detail of a proprietary service, b) is a complicated transformation:
https://users.rust-lang.org/t/psa-please-use-absolute-urls-in-crate-readmes/45136
crates-io already has some code to fix relative URLs in READMEs, but it has to assume the README lives at the root of the repository, and that the main branch is called `master`. It's very hard to do any better after the crate has been published.
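To make that limitation concrete, here is a minimal sketch of what such naive rewriting amounts to. It is not crates-io's actual code: `naive_resolve` and `is_relative` are hypothetical helpers, and the `/blob/master/` URL layout is GitHub's, assumed here for illustration only.

```rust
/// Returns true when `target` looks like a relative link: no scheme,
/// not protocol-relative, and not a pure in-page fragment.
fn is_relative(target: &str) -> bool {
    !(target.is_empty()
        || target.starts_with('#')
        || target.starts_with("//")
        || target.contains("://")
        || target.starts_with("mailto:"))
}

/// Hypothetical helper: resolves a README link under the two assumptions
/// named above (README at the repository root, branch called `master`).
fn naive_resolve(repo_url: &str, target: &str) -> String {
    if !is_relative(target) {
        return target.to_string();
    }
    let repo = repo_url.trim_end_matches('/').trim_end_matches(".git");
    let path = target.trim_start_matches("./").trim_start_matches('/');
    // Assumed GitHub-style layout: <repo>/blob/<branch>/<path>.
    format!("{}/blob/master/{}", repo, path)
}
```

For a README that actually lives in a subdirectory such as `crates/foo/`, a link to `images/logo.png` would come out as `<repo>/blob/master/images/logo.png`, which points at the wrong file.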
The URLs are relative to the position of the README inside the repository, but `Cargo.toml` doesn't contain that information. In monorepos with multiple crates the README may not be in the repository root. Getting that path after a crate is published requires cloning and searching the repository. Before publishing, Cargo could simply check the local git checkout.
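As a rough illustration of how cheap this is before publishing, the sketch below walks up from the crate directory until it finds a `.git` entry, then expresses the README path relative to that root. Both function names are hypothetical, and this is not how Cargo discovers repositories internally; it only shows that the information is available locally.

```rust
use std::path::{Path, PathBuf};

/// Walks up from `start` and returns the first directory that contains a
/// `.git` entry (a directory, or a file in the case of worktrees).
fn find_repo_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| dir.join(".git").exists())
        .map(Path::to_path_buf)
}

/// Returns the README location relative to the repository root, using
/// forward slashes so it can be spliced into a URL.
fn readme_path_in_repo(repo_root: &Path, readme: &Path) -> Option<String> {
    let rel = readme.strip_prefix(repo_root).ok()?;
    let parts: Vec<String> = rel
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("/"))
}
```

In a monorepo checkout where the crate lives in `crates/foo`, this would yield `crates/foo/README.md`, which is exactly the piece of information missing after publication.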
I see a few ways to improve this situation:
- When packaging/publishing, scan the README markdown for relative URLs and warn users that relative URLs may not be resolved properly. Users should make these URLs absolute. To reduce noise, the warning could be shown only in non-trivial configurations that crates-io doesn't support (e.g. when the README isn't in the root of the repository).
- Rewrite the README markdown and change relative URLs to absolute ones automatically. When publishing, Cargo has access to the local checkout, from which it can learn the commit hash, in-repo paths, and repo URLs needed to do the rewriting well. An unmodified README could be kept as `README.orig`, similar to how `Cargo.toml` is rewritten. URL schemes depend on the code hosting provider, so Cargo would need to know the GitHub/GitLab/etc. URL schemes. This is hard to avoid: if not Cargo, then crates-io, lib.rs, libraries.io and other places will have to do this. Cargo could still warn about unknown code hosts. I already have a crate that does this rewriting. I could make it usable in Cargo.
- Define and enforce that relative URLs in the README must only refer to files included inside the crate tarball (so they can't use something like `../../assets/logo.png`), and make crates-io rewrite URLs to use files from the crate tarball rather than github.com. That would be ideal in terms of independence from 3rd party proprietary services and longevity of crate tarballs. However, crates-io (and similar sites) would have to be careful to sanitize files from the crate to avoid XSS.
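The scanning step in the first option and the containment rule in the third can be sketched together. The code below is only an illustration under stated assumptions: it handles inline `[text](target)` links only (no reference-style links, HTML tags, or code-block awareness, so it is not a CommonMark parser), it assumes the README sits at the crate root, and `LinkKind`, `link_targets`, and `classify` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum LinkKind {
    /// Absolute URL or in-page anchor: nothing to rewrite.
    NoRewrite,
    /// Relative path that stays inside the crate, normalized.
    InsideCrate(String),
    /// Relative path that climbs above the crate root, or is root-relative.
    EscapesCrate,
}

/// Collects the targets of inline markdown links and images.
fn link_targets(markdown: &str) -> Vec<&str> {
    let mut targets = Vec::new();
    let mut rest = markdown;
    while let Some(start) = rest.find("](") {
        let after = &rest[start + 2..];
        match after.find(')') {
            Some(end) => {
                // Drop an optional link title: [text](target "title").
                if let Some(target) = after[..end].split_whitespace().next() {
                    targets.push(target);
                }
                rest = &after[end + 1..];
            }
            None => break,
        }
    }
    targets
}

/// Classifies one link target relative to the crate root.
fn classify(target: &str) -> LinkKind {
    if target.starts_with('#')
        || target.starts_with("//")
        || target.contains("://")
        || target.starts_with("mailto:")
    {
        return LinkKind::NoRewrite;
    }
    // Ignore any fragment or query string when checking the path.
    let path = target.split(|c| c == '#' || c == '?').next().unwrap_or("");
    if path.starts_with('/') {
        // Root-relative paths cannot be mapped to the tarball unambiguously.
        return LinkKind::EscapesCrate;
    }
    let mut stack: Vec<&str> = Vec::new();
    for segment in path.split('/') {
        match segment {
            "" | "." => {}
            ".." => {
                if stack.pop().is_none() {
                    return LinkKind::EscapesCrate;
                }
            }
            other => stack.push(other),
        }
    }
    LinkKind::InsideCrate(stack.join("/"))
}
```

A packaging step could warn on every `EscapesCrate` result, while `InsideCrate` paths are the ones a registry could serve from the tarball (after sanitizing, as noted above).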