-
Notifications
You must be signed in to change notification settings - Fork 520
Add table of contents support #980
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Rebased on top of the latest master, ready for review. |
/assign @hoegaarden @hasheddan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold for the assigners to review as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @saschagrunert! One small comment but looks good otherwise :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All just nits, except the error hiding and the thing with the backtick in the title.
pkg/notes/toc.go
Outdated
} | ||
|
||
func add(result io.StringWriter, title string, indent int, headers map[string]int) { | ||
toReplace := strings.NewReplacer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
- Technically, I believe we'd need to reimplement https://github.com/jch/html-pipeline/blob/master/lib/html/pipeline/toc_filter.rb#L41-L56 to be correct here. But I guess we are close enough? 🤷♂
toReplace
is a bit of a misleading name. If we are happy not to fully / correctly implement the generation of the anchor names, we could use something like:link := strings.NewReplacer( "!", "", "#", "", // all the other replacements "~", "", " ", "-", // move the replacement of the space also to the Replacer ).Replace(strings.ToLower(title))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it as suggested. I'm not sure which use-case we do not cover with the current implementation. Is there anything we missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E.g. multiple successive space should be replaced to one dash, I am not sure if all the chars we remove with the replacer are equal to /[^\w\- ]/
, we don't do the equivalent of EscapeUtils.escape_html
and maybe other (I just skimmed both this code and the github html pipeline code) ... but as said, I think we are close enough for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, yeah sounds reasonable to me.
pkg/notes/toc.go
Outdated
if strings.Contains(prevLine, "`") { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what that is about, TBH. Wouldn't that skip the 2nd of two successive headings, if the 1st one had a backtick in it's title?
# This has `some backticks`
## And this heading will be skipped now?
There is no test case for that either so I am not sure about the use-case / intention for that special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, wouldn't it break if we'd had a header line with a backtick in it?
- we find a header and set
prevLine
to it's content, coincidentally it contains a backtick - next time around the if clause matches and we
continue
,prevLine
still is set to the thing with the backtick - next time around the if clause matches and we
continue
,prevLine
still is set to the thing with the backtick - ...
- eventually we are done scanning, but never really processed much except up until the first header with a backtick
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main purpose of this is to skip code blocks (do not generate any headers inside them, because they could contain #
. :) I think we have to somehow verify if we're inside a code block by the amount of backticks we've seen. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking the right amount of back ticks should fix the code block issue in a more general way, PTAL :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is working correctly. The current implementation would ignore a perfectly valid heading
# `markdown` solves all our problems, they said
Other issues might arise with documents like
something
``````
wow ... this actually seems to be valid md
# ````go
# hello := `hello`
# ````
``````
something else
where the number of seen backticks is even in every line. This example is contrived, however we are dealing with random user input here.
Honest questions:
- How much effort do we want to spend here for correctly rendering the TOC?
- How likely is it that a b0rken TOC blanks out or scrambles the rest of the document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first example should be covered by the current implementation, I verified that with an additional unit test:
Lines 148 to 152 in 5f80175
func TestGenerateTOCBackTickInHeading(t *testing.T) { | |
toc, err := GenerateTOC("# `markdown` solves all our problems, they said") | |
require.Nil(t, err) | |
require.Equal(t, toc, "- [`markdown` solves all our problems, they said](#markdown-solves-all-our-problems-they-said)\n") | |
} |
The second example is tricky, right. So what do you prefer? I could remove the overall code block check vs leaving it as it currently is.
pkg/notes/toc.go
Outdated
|
||
var headerPattern = regexp.MustCompile("^(?P<indent>#+) ?(?P<title>.+)$") | ||
|
||
func GenerateTOC(input string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
I feel weird implementing a (partial) Markdown parser and generator, but I checked really quick and I could not find a give-me-a-toc-for-this-markdown-doc do module ... So I guess we gotta do what we gotta do 😄
This adds the possibility to add a generated table of contents before the actual markdown, which will be later used for the correct assembling of the overall `CHANGELOG-x.y.md` files. Signed-off-by: Sascha Grunert <[email protected]>
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cpanato, hoegaarden, saschagrunert The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/unhold |
This adds the possibility to add a generated table of contents before
the actual markdown, which will be later used for the correct assembling
of the overall
CHANGELOG-x.y.md
files.Needs #978