Document language inspection is flaky #32

Closed
php-coder opened this issue Apr 15, 2017 · 16 comments
Comments

@php-coder

php-coder commented Apr 15, 2017

I see that version 0.27 has started to report the following warnings:

WARNING:html5validator.validator:"file:/home/travis/build/php-coder/mystamps/src/main/webapp/WEB-INF/views/error/403.html":1.16-5.73: info warning: This document appears to be written in Danish but the "html" start tag has "lang="en"". Consider using "lang="da"" (or variant) instead.
"file:/home/travis/build/php-coder/mystamps/src/main/webapp/WEB-INF/views/error/500.html":1.16-5.73: info warning: This document appears to be written in Danish but the "html" start tag has "lang="en"". Consider using "lang="da"" (or variant) instead.

Here are these files:

@php-coder
Author

I forgot to mention how I'm executing html5validator:

html5validator \
	--root src/main/webapp/WEB-INF/views \
	--ignore-re 'Attribute “(th|sec|togglz|xmlns):[a-z]+” not allowed' \
		'Attribute “(th|sec|togglz):[a-z]+” is not serializable' \
		'Attribute with the local name “xmlns:[a-z]+” is not serializable' \
		'An "img" element must have an "alt" attribute' \
		'The first child "option" element of a "select" element with a "required" attribute' \
	--show-warnings

@php-coder
Author

I couldn't reproduce it locally on macOS, but it is 100% reproducible on Travis CI.

@php-coder
Author

Could be related to validator/validator#493

php-coder added a commit to php-coder/mystamps that referenced this issue Apr 16, 2017
…document language.

After a09a3f5 commit it has started to fail with another error on
another file:

WARNING:html5validator.validator:"file:/home/travis/build/php-coder/mystamps/src/main/webapp/WEB-INF/views/category/info.html":1.16-5.71:
info warning: This document appears to be written in Lithuanian but the "html" start tag has "lang="en"".
Consider using "lang="lt"" (or variant) instead.

Work around for svenkreiss/html5validator#32
@php-coder
Author

It reproduces on Linux.

@php-coder
Author

@sideshowbarker Could you look at it? I suspect that it's not an html5validator problem but rather a validator-related one.

@svenkreiss
Owner

@php-coder Can you link to a Travis build where this error occurred?

@php-coder
Author

bodom91 pushed a commit to bodom91/mystamps that referenced this issue Aug 2, 2017
…document language.

After a09a3f5 commit it has started to fail with another error on
another file:

WARNING:html5validator.validator:"file:/home/travis/build/php-coder/mystamps/src/main/webapp/WEB-INF/views/category/info.html":1.16-5.71:
info warning: This document appears to be written in Lithuanian but the "html" start tag has "lang="en"".
Consider using "lang="lt"" (or variant) instead.

Work around for svenkreiss/html5validator#32
@php-coder
Author

@sideshowbarker @svenkreiss Hi, this issue has started to appear again from time to time, but now it says that the document is in French.

Could you suggest a way of debugging it? Would enabling verbose mode help? Where are the sources of this check, so I can read and play with the code? Thanks!

php-coder changed the title from "0.27 regression: false positive document language inspection" to "Document language inspection is flaky" on Dec 29, 2017
@sideshowbarker

My only suggestion as far as debugging goes is to see whether you can reproduce it with vnu.jar directly, or with https://checker.html5.org/ or https://validator.w3.org/nu/
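
One way to act on this suggestion for a warning that only shows up intermittently is to run the checker on a single file several times and count how often the language warning appears. The following is a minimal sketch, assuming vnu.jar sits in the current directory and java is on the PATH. The function name `count_lang_warnings` and the `CHECKER` variable are hypothetical names chosen for illustration, not part of either tool.

```shell
# Assumption: vnu.jar is in the current directory; override CHECKER otherwise.
CHECKER="${CHECKER:-java -jar vnu.jar}"

# Hypothetical helper: run the checker on one file N times (default 20)
# and print how many runs produced a language-detection warning.
count_lang_warnings() {
    file="$1"
    runs="${2:-20}"
    hits=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # vnu.jar reports to stderr by default, so merge it into stdout
        # before searching for the wording used by the warning.
        if $CHECKER "$file" 2>&1 | grep -q 'appears to be written in'; then
            hits=$((hits + 1))
        fi
        i=$((i + 1))
    done
    echo "$hits/$runs"
}
```

Calling `count_lang_warnings src/main/webapp/WEB-INF/views/error/403.html 50` prints a `hits/runs` ratio. A non-zero count without html5validator in the loop would point at the checker itself rather than the wrapper.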

@svenkreiss
Owner

I agree, you will have to look into the Java part for debugging this.

Also just had a look at your 403.html file. It actually is a template file and not pure HTML. I wouldn't be surprised if the special characters trip up the language detection.

@sideshowbarker

> Also just had a look at your 403.html file. It actually is a template file and not pure HTML. I wouldn't be surprised if the special characters trip up the language detection.

Yeah, if that’s the case, then all bets are off as far as the HTML checker backend behavior goes — and not just specifically for language detection. The checker isn’t a tool for checking pre-parsed PHP or template content. It’s intended for checking the HTML contents as they would be sent over the wire.
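
To check the HTML as it would be sent over the wire rather than the template source, the rendered page can be fetched from a running instance and piped into the checker, which reads standard input when given `-` as the file argument. This is a rough sketch only: the function name `check_rendered`, the `FETCH` and `CHECKER` override variables, and the URL in the usage note are all assumptions, and vnu.jar is assumed to be in the current directory.

```shell
# Hypothetical helper: fetch a rendered page and check it via stdin.
# FETCH and CHECKER are illustrative override hooks, not real tool settings.
check_rendered() {
    url="$1"
    # curl -fsS: fail on HTTP errors, stay quiet, but still show real errors.
    # The trailing "-" tells vnu.jar to read the document from standard input.
    ${FETCH:-curl -fsS} "$url" | ${CHECKER:-java -jar vnu.jar} -
}
```

For example, `check_rendered http://localhost:8080/error/403` would validate whatever a locally running application actually serves at that (assumed) address.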

@php-coder
Author

> It actually is a template file and not pure HTML.

It should be valid HTML that has a bunch of non-standard th:* attributes.

@svenkreiss Could you add an option to expose --no-langdetect to html5validator then?

@svenkreiss
Owner

svenkreiss commented Jan 4, 2018 via email

@svenkreiss
Owner

Version 0.2.10, which has this command line option, is now on PyPI. Does that help?
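
A sketch of how the invocation from earlier in this thread might look with language detection switched off, assuming html5validator exposes the option under the same `--no-langdetect` name that was requested above. The wrapper function `validate_views` and the `HTML5VALIDATOR` override variable are hypothetical, and the `--ignore-re` patterns from the original command are left out for brevity.

```shell
# Hypothetical wrapper around the html5validator call used in this thread.
# HTML5VALIDATOR is an illustrative hook for swapping the executable.
validate_views() {
    "${HTML5VALIDATOR:-html5validator}" \
        --root src/main/webapp/WEB-INF/views \
        --no-langdetect \
        --show-warnings \
        "$@"
}
```

Extra arguments given to `validate_views` are appended to the command line unchanged.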

@php-coder
Author

Thanks! Looks like it worked!
