-
Notifications
You must be signed in to change notification settings - Fork 6.1k
Re-evaluate unicode tricks in comments #10254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account

Comments
Here are the lists of different control blocks: https://en.wikipedia.org/wiki/Unicode_control_characters Perhaps we should disallow control blocks and add some explicit exceptions for the ones we want to allow. As for RTL it possibly is something we want to support. |
Interesting site: https://www.compart.com/en/unicode/bidiclass
There are some "Isolation" marks that seem to be stackable. I'm not sure yet we should take care about those. |
I assume people have seen the relation to this, but thought I'd re-post it here to emphasize this issue's importance: https://github.com/ethereum/solidity-underhanded-contest/tree/master/submissions_2020/submission11_RobertMCForster |
This was fixed with #10326, wasn't it? |
Rust (CVE-2021-42574) and Go (issue #20209) is also affected. |
Would be nice to get confirmation! |
#10326 added checks against LRO/RLO/PDF but Unicode is a large beast and I'm sure there are other edge cases that can be used maliciously. The issue description/title sounds like it was meant to cover it in general, not just this specific case. |
Hmmm, it sounds like the checks added in #10326 are focused on unbalanced codepoints, which I interpret as similar to imbalanced brackets (ie. FWIW, the attack shown in the Rust release doesn't look to be out of balance. Either way, given the prominence of the announcement today, it'd be very helpful to get a statement from the Solidity team clarifying the status of compiler WRT to this issue. |
Hi, i link here a poc of CVE-2021-42574 for solidity and solc https://github.com/tin-z/solidity_CVE-2021-42574-POC |
Yes. That was the case used in the underhanded contest. Banning unbalanced checks looked like it was enough because it prevented mixing the content of strings/comments with the outside text. I see that the Rust example uses a slightly different trick - with the isolate characters LRI/RLI. Looks they make possible to just take out a bit of text out of the comment/string and render it at the beginning/end of the line. I think we should disallow these. Created a separate issue covering it: #13936.
Yeah, not really sure how far we should take it. Maybe we should just disallow all the override/embed chars like Rust did. What's your opinion? This should not trick the highlighter so the problem should be easily discovered but I guess you could say the same about the trick from the underhanded contest. |
This issue has been marked as stale due to inactivity for the last 90 days. |
Hi everyone! This issue has been automatically closed due to inactivity. |

We should re-evaluate which unicode characters we disallow in comments and strings. Especially RTL markers come to mind.
The text was updated successfully, but these errors were encountered: