IOError when scanning file with odd chars #98
Comments
Is it still a problem? I tinkered around a lot in this part of the scanners… Also, which Ruby version do you use?
Sorry, I can't really replicate the problem. Can you send me the problematic input file?
No answer, delaying this…
Hi Kornelius, Sorry for my delayed response. I no longer have the problem file. I tried to re-create it, but could not. I got a different issue, but it is much less serious. Here is my test script. Note the inverted quotes in the comment.
Output:
It does not appear to have URL-encoded the quotes properly, but at least it did not crash. The problem originally occurred when I was probably using JRuby 1.6.?. Now I am using JRuby 1.7.4. I would not be surprised if some string encoding issues were part of the original problem, and perhaps part of this problem also. Hope that helps... Rob
The main question would be: what encoding are you using for that file? It works for me, but I'm using UTF-8 and Ruby 2.0… CodeRay uses UTF-8. You should probably convert the input before you send it to the scanner.
Hi, UTF-8, JRuby 1.7.4. Maybe a JRuby issue?
Possible, wouldn't be the first time. Can you send me the file? (murphy rubychan de) If we can produce a CodeRay-independent minimal failing test case that works on MRI, then we can file a bug report.
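For anyone landing here with the same problem, converting the input first might look roughly like the sketch below. It assumes the file was saved as Windows-1252 (what Notepad++ calls "ANSI" on a Western-locale Windows machine); `to_utf8` is a hypothetical helper name, not part of CodeRay.

```ruby
# Hypothetical helper (not part of CodeRay): reinterpret the raw bytes as
# `source_encoding` and transcode them to UTF-8.
# ASSUMPTION: the default of Windows-1252 matches how the file was saved.
def to_utf8(raw, source_encoding = 'Windows-1252')
  raw.dup.force_encoding(source_encoding).encode('UTF-8')
end

# Bytes 0x93 and 0x94 are the curly double quotes in Windows-1252.
ansi_bytes = "# \x93quoted\x94 comment\n".b
utf8_source = to_utf8(ansi_bytes)
```

In practice the bytes would come from something like `File.binread(path)`, and the converted string could then be handed to `CodeRay.scan(utf8_source, :ruby)` instead of calling `CodeRay.scan_file` on the original file.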
I inadvertently pasted some text from Word, which contained inverted commas, into a Ruby comment. When I executed `CodeRay.scan_file` on that file, it complained with: ... which was thrown at `lib/coderay/scanner.rb:120` (method `guess_encoding`). Further up the stack, in `normalize`, I could see where it was branching to `encode_with_encoding` (as opposed to `to_unix`), so I commented that out to force it to use `to_unix`. Then I retried and received this error: ... which helped me diagnose the root problem.
It would be good if there was some error handling around the `IO.popen` call to help diagnose, or if the call to `guess_encoding` was stricter (assuming it was called in error). Not sure how to do this, but thought I'd log it here anyway in case someone else has the same error...
Windows XP - Notepad++ - ANSI file
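A rough sketch of what the two suggestions could look like, for discussion only. Both method names are hypothetical and this is not how CodeRay implements `guess_encoding`; the Windows-1252 fallback is an assumption that fits the "ANSI" file described here.

```ruby
# Hypothetical helpers, not CodeRay's implementation.

# Stricter guess: trust UTF-8 only if the bytes actually validate as UTF-8;
# otherwise return an assumed legacy fallback encoding.
def strict_guess_encoding(bytes, fallback = 'Windows-1252')
  bytes.dup.force_encoding('UTF-8').valid_encoding? ? 'UTF-8' : fallback
end

# Run an external command and feed it `input`, converting low-level
# failures (e.g. command not found) into an error that names the command.
def popen_with_diagnostics(command, input)
  IO.popen(command, 'r+') do |io|
    io.write(input)
    io.close_write
    io.read
  end
rescue SystemCallError => e
  raise IOError, "could not run #{command.inspect}: #{e.message}"
end
```

With something along these lines, a missing or failing external tool would at least report which command could not be run, and a file with stray Windows-1252 quotes would be classified by validation rather than by a guess.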