Add a linkchecker #356

Closed
@projektir

Description

@projektir
Contributor

A linkchecker is a convenient tool for avoiding errors and keeping track of dead links. It would also be nice to have for #308.

The RBE linkchecker and the one used in rust-lang/rust both scan the files line-by-line while applying a regex to find links and check them. I don't know if it'd be OK for us to adopt the rust-lang/rust one? @steveklabnik
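For readers unfamiliar with the approach, the line-by-line regex scan described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code of either linkchecker; the pattern `LINK_RE` and the function `find_links` are hypothetical names, and the pattern only covers simple inline Markdown links.

```python
import re

# Hypothetical pattern: matches inline Markdown links of the form [text](target).
# It does not handle reference-style links or brackets nested inside the text.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]*)\)")


def find_links(text):
    """Return (line_number, target) pairs for every inline link in `text`."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LINK_RE.finditer(line):
            found.append((lineno, match.group(1)))
    return found


sample = "See [the guide](format/summary.html).\nNo links here.\nAn [empty]() link."
for lineno, target in find_links(sample):
    print(f"line {lineno}: {target!r}")
```

Each target found this way would then be checked, for example by testing whether a local file exists. A regex of this kind is easy to write but can misfire on links inside code blocks, which is one reason a real Markdown parser may be preferable.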

I think a good place for this would be mdbook test.

Activity

Michael-F-Bryan (Contributor) commented on Jun 24, 2017

This would be a good candidate for the plugin system (#163). Ideally, after the rendering stage you'd be able to make a plugin which gets passed the rendered output's location and then checks all the links in all the *.html files it can find.
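As a rough illustration of the idea (not of any actual plugin API, which did not exist at the time of this comment), a post-render check could walk the output directory and verify that every local `href` points at an existing file. The names `HrefCollector` and `broken_local_links` are hypothetical, and the sketch is in Python purely for brevity. It ignores web links, fragments, and any `<base>` tag.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def broken_local_links(output_dir):
    """Return (page, href) pairs whose local target file does not exist."""
    output_dir = Path(output_dir)
    broken = []
    for page in sorted(output_dir.rglob("*.html")):
        collector = HrefCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for href in collector.hrefs:
            parts = urlsplit(href)
            # Skip web links (scheme or host present) and pure "#fragment" links.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            # Resolve relative to the page's own directory (no <base> handling).
            if not (page.parent / parts.path).exists():
                broken.append((page.relative_to(output_dir).as_posix(), href))
    return broken


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "chapter.html").write_text("<p>ok</p>", encoding="utf-8")
    (root / "index.html").write_text(
        '<a href="chapter.html">ok</a> <a href="missing.html#top">gone</a> '
        '<a href="https://www.rust-lang.org/">web</a>',
        encoding="utf-8",
    )
    demo_result = broken_local_links(root)
print(demo_result)
```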

We're planning on refactoring the current system to make it a lot easier to write your own plugins and renderers.

azerupi (Contributor) commented on Jun 24, 2017

> I think a good place for this would be mdbook test

Definitely in mdbook test

> This would be a good candidate for the plugin system

I agree, this could potentially be written as a plugin in the future :)
I emphasised "in the future" because I don't want to stall progress on changes that are coming soon-ish. We don't have a deadline for the plugin system, so if someone wants to contribute a solution right now, I wouldn't want to interrupt their momentum.

However, we can keep this use case in the back of our minds when doing the refactorings, to make it indeed possible to implement this as a plugin later. :)

Labels added on Jun 24, 2017: A-HTML (Area: HTML Rendering), A-Tests (Area: `mdbook test` related tests), C-enhancement (Category: Enhancement or feature request)

steveklabnik (Member) commented on Jun 24, 2017

budziq (Contributor) commented on Jun 24, 2017

> Definitely in mdbook test

It might also be nice to have it as a warning at the mdbook build stage.

> I've thought about trying to put the rust-lang one on crates.io so others could use it too, to be honest.

@steveklabnik That would be awesome!

Added to the Plug-ins milestone on Jan 3, 2018

Michael-F-Bryan (Contributor) commented on Jan 13, 2018

For anyone interested, I've started playing around with an mdbook-linkcheck backend for checking links. You'll need to install mdbook directly from master, and it isn't 100% finished yet, but it may be useful for some people.

EDIT: It looks like the tool works, because I've already found my first batch of broken links, rust-lang/rust-by-example#990 🎉

Example Output

This is the output (when logging very verbosely) when the tool is run over mdbook's user guide:

$ RUST_LOG=mdbook_linkcheck cargo run -- -s ~/Documents/forks/mdBook/book-example
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/mdbook-linkcheck -s /home/michael/Documents/forks/mdBook/book-example`
 INFO 2018-01-13T13:38:49Z: mdbook_linkcheck: Checking for broken links
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck: Config {
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck:     follow_web_links: false
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck: }
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck: Finding all links
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "http://www.rust-lang.org" in README.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook" in README.md#7
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook/issues" in README.md#7
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://docs.rs/mdbook/*/mdbook/" in README.md#11
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://www.mozilla.org/MPL/2.0/" in README.md#15
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://crates.io/crates/mdbook" in cli/cli-tool.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://www.rust-lang.org/" in cli/cli-tool.md#10
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://www.rust-lang.org/downloads.html" in cli/cli-tool.md#10
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://crates.io/" in cli/cli-tool.md#20
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook" in cli/cli-tool.md#27
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "format/summary.html" in cli/init.md#25
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook/issues" in cli/watch.md#26
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook/issues" in cli/serve.md#40
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://doc.rust-lang.org/stable/book/" in cli/test.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "http://handlebarsjs.com/" in format/theme/theme.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook/issues" in format/theme/index-hbs.md#90
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://highlightjs.org" in format/theme/syntax-highlighting.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/rust-lang-nursery/mdBook/issues" in format/theme/syntax-highlighting.md#56
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://www.mathjax.org/" in format/mathjax.md#3
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "" in format/rust.md#38
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://docs.rs/mdbook" in lib/index.md#11
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://docs.rs/mdbook/*/mdbook/renderer/struct.RenderContext.html" in lib/index.md#33
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "http://www.linfo.org/rule_of_silence.html" in lib/index.md#165
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/mdinger" in misc/contributors.md#7
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/kbknapp" in misc/contributors.md#8
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/steveklabnik" in misc/contributors.md#9
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/asolove" in misc/contributors.md#10
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/waynenilsen" in misc/contributors.md#11
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/funkill" in misc/contributors.md#12
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/FuGangqiang" in misc/contributors.md#13
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/Michael-F-Bryan" in misc/contributors.md#14
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::links: Found "https://github.com/cspiegel" in misc/contributors.md#15
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck: Found 32 links
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "http://www.rust-lang.org" in README.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "http://www.rust-lang.org/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook" in README.md#7
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook/issues" in README.md#7
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook/issues"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://docs.rs/mdbook/*/mdbook/" in README.md#11
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://docs.rs/mdbook/*/mdbook/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://www.mozilla.org/MPL/2.0/" in README.md#15
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://www.mozilla.org/MPL/2.0/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://crates.io/crates/mdbook" in cli/cli-tool.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://crates.io/crates/mdbook"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://www.rust-lang.org/" in cli/cli-tool.md#10
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://www.rust-lang.org/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://www.rust-lang.org/downloads.html" in cli/cli-tool.md#10
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://www.rust-lang.org/downloads.html"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://crates.io/" in cli/cli-tool.md#20
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://crates.io/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook" in cli/cli-tool.md#27
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "format/summary.html" in cli/init.md#25
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Searching for /home/michael/Documents/forks/mdBook/book-example/src/format/summary.md
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook/issues" in cli/watch.md#26
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook/issues"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook/issues" in cli/serve.md#40
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook/issues"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://doc.rust-lang.org/stable/book/" in cli/test.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://doc.rust-lang.org/stable/book/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "http://handlebarsjs.com/" in format/theme/theme.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "http://handlebarsjs.com/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook/issues" in format/theme/index-hbs.md#90
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook/issues"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://highlightjs.org" in format/theme/syntax-highlighting.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://highlightjs.org/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/rust-lang-nursery/mdBook/issues" in format/theme/syntax-highlighting.md#56
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/rust-lang-nursery/mdBook/issues"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://www.mathjax.org/" in format/mathjax.md#3
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://www.mathjax.org/"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "" in format/rust.md#38
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck: Error for "" in format/rust.md#38, The link is empty
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://docs.rs/mdbook" in lib/index.md#11
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://docs.rs/mdbook"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://docs.rs/mdbook/*/mdbook/renderer/struct.RenderContext.html" in lib/index.md#33
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://docs.rs/mdbook/*/mdbook/renderer/struct.RenderContext.html"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "http://www.linfo.org/rule_of_silence.html" in lib/index.md#165
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "http://www.linfo.org/rule_of_silence.html"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/mdinger" in misc/contributors.md#7
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/mdinger"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/kbknapp" in misc/contributors.md#8
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/kbknapp"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/steveklabnik" in misc/contributors.md#9
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/steveklabnik"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/asolove" in misc/contributors.md#10
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/asolove"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/waynenilsen" in misc/contributors.md#11
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/waynenilsen"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/funkill" in misc/contributors.md#12
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/funkill"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/FuGangqiang" in misc/contributors.md#13
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/FuGangqiang"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/Michael-F-Bryan" in misc/contributors.md#14
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/Michael-F-Bryan"
TRACE 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Checking "https://github.com/cspiegel" in misc/contributors.md#15
DEBUG 2018-01-13T13:38:49Z: mdbook_linkcheck::validation: Ignoring "https://github.com/cspiegel"
There were 1 broken links

format/rust.md#38: The link is empty

projektir (Contributor, Author) commented on Jan 13, 2018

@Michael-F-Bryan so rust-lang/rust already has a linkchecker, which is the one we originally wanted to pull out and turn into a crate (I'm not sure what that means for plugins). It has some problems, though, that yours doesn't have (for instance, this fix is really needed), but it also does some things yours doesn't (check for absolute paths).

Idk if we want to have these out of sync given that rust-lang/rust's linkchecker would run on every x.py build for all the books that it manages.

Michael-F-Bryan

Michael-F-Bryan commented on Jan 13, 2018

@Michael-F-Bryan
Contributor

> Idk if we want to have these out of sync given that rust-lang/rust's linkchecker would run on every x.py build for all the books that it manages.

My original hopes were that this could supplement (or even succeed?) their link checker, although on further inspection they do a lot of cross-site linking (i.e. using links to files outside the book such as ../../std/prelude/index.html). My linkchecker works purely with the source book and doesn't take into account the fact that other things exist on the Rust S3 bucket, so I don't know whether this is still possible.
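To make the cross-site case concrete, here is a small Python sketch of how a checker might at least detect links that climb out of the book's source tree, such as `../../std/prelude/index.html`. The function name `escapes_book` is hypothetical and this is not how mdbook-linkcheck is implemented. It assumes chapter paths are given relative to `src/`.

```python
import posixpath


def escapes_book(chapter, link):
    """True if `link`, resolved from `chapter`'s directory, leaves the book root.

    `chapter` is a path relative to the book's src/ directory (an assumption
    made for this sketch).
    """
    joined = posixpath.normpath(posixpath.join(posixpath.dirname(chapter), link))
    return joined == ".." or joined.startswith("../")


print(escapes_book("cli/init.md", "../../std/prelude/index.html"))  # leaves the book
print(escapes_book("cli/init.md", "../format/summary.html"))        # stays inside
```

A checker that only sees the source book could report such links separately instead of marking them as broken, since it cannot tell whether the target exists elsewhere on the site.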

That said, the entire idea behind enabling alternate backends is that people can write their own tools to suit their exact use case. For example, it was almost trivial to knock up a backend which runs everything through rust-skeptic, which is something Rust By Example currently needs to do manually.

> but it also does some things yours doesn't (check for absolute paths).

This part was tricky. I originally treated relative and absolute paths separately (relative links are relative to the chapter's directory, absolute is relative to src/) but found that most of the links in Rust By Example used a completely different convention. We use the <base> tag to tweak how links get resolved by your browser, so what I detected as a "broken link" turned out to still work fine when viewing online.
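The first convention described above (relative links resolve from the chapter's directory, absolute links from `src/`) might be sketched like this. The function `resolve_link` is a hypothetical illustration in Python, not the tool's code. The `.html` to `.md` mapping is an assumption based on the "Searching for .../src/format/summary.md" line in the log output earlier in the thread.

```python
from pathlib import PurePosixPath


def resolve_link(src_root, chapter, link):
    """Map a link found in `chapter` to the source file it should point at.

    Absolute links ("/...") are taken relative to src_root; everything else
    is taken relative to the directory containing the chapter.
    """
    src_root = PurePosixPath(src_root)
    if link.startswith("/"):
        target = src_root / link.lstrip("/")
    else:
        target = src_root / PurePosixPath(chapter).parent / link
    # Rendered pages end in .html, while the source files are Markdown.
    if target.suffix == ".html":
        target = target.with_suffix(".md")
    return target


print(resolve_link("src", "cli/init.md", "format/summary.html"))
print(resolve_link("src", "cli/init.md", "/format/summary.html"))
```

Under this convention the link `format/summary.html` in `cli/init.md` resolves to `src/cli/format/summary.md`. The log above shows the search going to `src/format/summary.md` instead, which is consistent with links being resolved from the book root, as a `<base>` tag would cause in the browser.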

    Labels

    A-CLI (Area: CLI), A-HTML (Area: HTML Rendering), A-Links (Area: Issues with links), A-Tests (Area: `mdbook test` related tests), C-enhancement (Category: Enhancement or feature request), S-Wishlist (Status: Wishlist)

        Participants

        @steveklabnik @ehuss @budziq @azerupi @projektir

          Add a linkchecker · Issue #356 · rust-lang/mdBook