Markdown parser doesn't ignore leading whitespace in list items #13789

Closed
stevengj opened this issue Oct 27, 2015 · 6 comments · Fixed by #13835
Labels
docsystem The documentation building system

Comments

@stevengj
Member

julia> Markdown.parse("""
       * A simple list
         split over
         three lines
       """)
    •  A simple list   split over   three lines

As I understand it, indentation matching the initial list item should be ignored rather than converted into multiple spaces. For example, here is how GitHub renders the same list:

  • A simple list
    split over
    three lines

In fact, GitHub seems to ignore any amount of leading whitespace in the list items. Jupyter, on the other hand, renders a code block if you indent by an extra four spaces beyond the W+1 columns taken up by a list marker of width W plus one space:

* A simple list
  split over
  three lines

* A simple list
      with code

is rendered as:
[image: Jupyter rendering of the two lists above]

As I understand the CommonMark spec, Jupyter's behavior seems the more correct one:

stevengj added the docsystem label Oct 27, 2015
@stevengj
Member Author

(I noticed this in #13780.)

@hayd
Member

hayd commented Oct 29, 2015

Part of the issue is that HTML collapses runs of whitespace. IMO we should do the same when rendering Markdown in the REPL.

Note: You can compare markdown implementations with babelmark2.

Indentation immediately after a list never becomes code in CommonMark, so potentially that's a Jupyter bug.

@stevengj
Member Author

@hayd, I agree that when rendering non-code cells, we should ignore multiple whitespace. That would fix this issue.

@hayd
Member

hayd commented Oct 29, 2015

I think this can be patched by regex replacing multiple whitespace here:

function terminline(io::IO, md::AbstractString)
    print(io, replace(md, r"[\s\t\n]+", " "))
end

This feels a little hacky, so let's ping @one-more-minute :)
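
For readers who want to try the whitespace-collapsing idea outside the REPL renderer, here is a minimal sketch. It assumes a current Julia version, where `replace` takes a `pattern => replacement` pair instead of the three-argument form used in the thread. The name `collapse_whitespace` is hypothetical and is not part of Base or the Markdown module. Since `\s` already matches tabs and newlines, the sketch uses the shorter pattern `\s+`.

```julia
# Hypothetical helper for illustration only; not the actual Markdown code.
# Collapses every run of whitespace (spaces, tabs, newlines) into one space.
collapse_whitespace(s::AbstractString) = replace(s, r"\s+" => " ")

# Paragraph text of a list item whose continuation lines are indented.
item = "A simple list\n  split over\n  three lines"

println(collapse_whitespace(item))  # prints: A simple list split over three lines
```

This only normalizes the text of a single paragraph. It should not be applied to code blocks, where whitespace is significant.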

@hayd
Member

hayd commented Oct 31, 2015

Pushed a PR to fix as above.

One weird thing (unrelated to my PR, but similar to the example above) is that block quotes behave differently:

julia> Markdown.parse("- a\n b")
    •  a b

julia> Markdown.parse("> a\n b")
  |  a

  b

i.e. the next line doesn't count as part of the quote (it should).

@hayd
Member

hayd commented Nov 2, 2015

I was looking at the quote part the other day. It seems that the quote parser needs to know which characters should start a fresh block... I can't see a way around hardcoding which line starts can escape a quote.
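
One way to picture the hardcoding described above is the rough sketch below. It is not the Markdown module's parser. The names `BLOCK_START`, `ends_quote` and `quote_text` are hypothetical, and the set of line starts (headers, bullet markers, numbered markers) is an assumed, incomplete list chosen for illustration. Lines that neither begin with `>` nor begin a new block are treated as continuations of the quote.

```julia
# Hypothetical, hardcoded line starts that end a block quote (assumed, incomplete).
const BLOCK_START = r"^ {0,3}(#{1,6}(\s|$)|[-*+]\s|\d+[.)]\s)"

# A blank line or a recognised block start ends the quote.
ends_quote(line::AbstractString) = isempty(strip(line)) || occursin(BLOCK_START, line)

# Collect the text of a quote that begins at the first line of `lines`.
function quote_text(lines::Vector{String})
    parts = String[]
    for line in lines
        m = match(r"^ {0,3}> ?(.*)$", line)
        if m !== nothing
            push!(parts, strip(m.captures[1]))   # explicit "> ..." line
        elseif !isempty(parts) && !ends_quote(line)
            push!(parts, strip(line))            # continuation without a ">"
        else
            break                                # something else starts here
        end
    end
    return join(parts, " ")
end

println(quote_text(["> a", " b"]))   # prints: a b
println(quote_text(["> a", "- b"]))  # prints: a
```

With this rule the second line of `"> a\n b"` stays inside the quote, while a list marker on the next line ends it.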

Related: I think we ought not to allow • instead of - to construct a list; it's not Markdown.
