Split escapeTextForBrowser into escapeTextContentForBrowser and quoteAttributeValueForBrowser #1599

syranide · 2014-05-25T10:43:01Z

IMHO the preferable solution to #1461, my comment from that PR:

Chrome only escapes <, > and & when setting textContent, and only & and " when setting an attribute. That makes my suggestion above quite a lot less alien (to me at least).

It aligns with OWASP's stance that only " can be used to break out of a "-quoted attribute value.
It also aligns with OWASP's recommendation for text/html: escaping ' and " is unnecessary because we're dealing with plain text, and escaping / is unnecessary because it's only a precaution against there being an unescaped < before the injected content.
Attribute names cannot be escaped, invalid chars invalidate the entire attribute name. Since we already very strictly filter against invalid attribute names we don't need to do anything further for attribute names.

Attribute names: discard invalid
Attribute values: & + "
Text content: & + < + >

With these rules we generate the same HTML that browsers do, no extra clutter.
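
A minimal sketch of the three rules above, for illustration only. The helper names (isValidAttributeName, quoteAttributeValue, escapeText, renderElement) are hypothetical and are not React's internals, and the attribute-name pattern is a deliberately conservative assumption rather than the filter React actually uses. The tag name is not validated here.

```javascript
// Hypothetical helpers illustrating the three rules; not React's internals.

// Attribute names: cannot be escaped, so invalid names are discarded.
// Conservative assumption: a letter followed by letters, digits, '_', ':', '.' or '-'.
var VALID_ATTRIBUTE_NAME = /^[A-Za-z][A-Za-z0-9_:.\-]*$/;

function isValidAttributeName(name) {
  return VALID_ATTRIBUTE_NAME.test(name);
}

// Attribute values: escape & and ", then wrap in double quotes.
function quoteAttributeValue(value) {
  return '"' + String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;') + '"';
}

// Text content: escape &, < and >.
function escapeText(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function renderElement(tag, attributes, text) {
  var markup = '<' + tag;
  Object.keys(attributes).forEach(function (name) {
    // Invalid names are dropped entirely instead of being escaped.
    if (isValidAttributeName(name)) {
      markup += ' ' + name + '=' + quoteAttributeValue(attributes[name]);
    }
  });
  return markup + '>' + escapeText(text) + '</' + tag + '>';
}

console.log(renderElement('a', {href: '/?a=1&b="2"', 'on click': 'x'}, '1 < 2 & "3"'));
// → <a href="/?a=1&amp;b=&quot;2&quot;">1 &lt; 2 &amp; "3"</a>
```

Note how the " inside the text content is left alone while the " inside the attribute value is escaped, and how the invalid name 'on click' never reaches the markup.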

The only danger I see is if dangerouslySetInnerHTML is used with invalid HTML: if there's an unclosed quoted attribute then anyone can now add as many attributes as they want (an unclosed tag is not an issue though), whereas if we escape " they can only add more data to that attribute. But really, if what you're sending to dangerouslySetInnerHTML is not rigorously vetted (or at the very least valid HTML) you're knee-deep in trouble regardless. The safest solution would be to not include dangerouslySetInnerHTML in the initial markup at all, but to always set it with innerHTML.

Note that escapeTextForBrowser was renamed to escapeTextContentForBrowser, so any external uses of it will now be greeted with an error (instead of a potentially dangerous situation).

I also added a test that explicitly verifies the output of ReactDOMComponent against a manually and correctly escaped string.

PS. Even if you don't like these "minimal rules", the separation between escapeTextContentForBrowser and quoteAttributeValueForBrowser makes a lot of sense to me. This PR also does away with all the incorrect escaping of attribute names.

mathiasbynens · 2014-06-17T06:37:17Z

What about unquoted attribute values, though? Or is there no way to generate those using React?

sophiebits · 2014-06-17T06:40:13Z

There's no way to write arbitrary HTML (well, you can with the dangerouslySetInnerHTML property but then you're on your own); React builds all the HTML itself. Source code looks something like

render: function() {
  return <a href="hello">click me</a>;
}

which is statically transformed into

render: function() {
  return React.DOM.a({href: "hello"}, "click me");
}

which React then produces markup for using this code, so there's never an opportunity for an unquoted attribute value to be made.

mathiasbynens · 2014-06-17T06:46:31Z

Okay, just wanted to make sure. dangerouslySetInnerHTML sounds dangerous indeed :)

syranide · 2014-06-17T09:07:19Z

We can conditionally generate unquoted attribute values when it's guaranteed to be safe (i.e., for numbers and ASCII char sequences).

The only reason I can think of would be to save bytes; it's imaginable that it would (im)measurably improve performance, as we could skip the overhead of the escaping function and generate measurably smaller HTML (which would improve innerHTML performance).

The one thing that speaks to me here is that for really well-written HTML you're not as likely to find attributes with whitespace or unsafe chars (mostly single classes and simple values). When data-reactid is gone, it means that removing unnecessary quotes might realistically yield in the vicinity of 5-10% less HTML in the best cases. However, as virtually everyone serves compressed HTML, the real-world difference would probably be no more than 1-5% (in the best case).

It feels kind of dirty, but HTML5 makes no mention of deprecating unquoted attribute values (http://www.w3.org/TR/html-markup/syntax.html#syntax-attr-unquoted) and, as usual, provides an exact implementation to follow.
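
A rough sketch of what such conditional quoting could look like. The names SAFE_UNQUOTED and formatAttributeValue are hypothetical, and the "safe" pattern is an assumption that is intentionally stricter than the HTML syntax requires: it allows only non-empty runs of ASCII letters and digits, so nothing that needs escaping (such as &) or that could end the value (whitespace, quotes, =, <, >, `) can appear unquoted.

```javascript
// Hypothetical sketch: omit quotes only for values that are trivially safe.
// Assumption: non-empty ASCII alphanumerics only (stricter than the spec).
var SAFE_UNQUOTED = /^[A-Za-z0-9]+$/;

// Fallback: escape & and ", then wrap in double quotes.
function quoteAttributeValue(value) {
  return '"' + String(value).replace(/&/g, '&amp;').replace(/"/g, '&quot;') + '"';
}

function formatAttributeValue(value) {
  var text = String(value);
  // One extra pattern match per attribute; this is the cost worth profiling.
  return SAFE_UNQUOTED.test(text) ? text : quoteAttributeValue(text);
}

console.log('class=' + formatAttributeValue('header')); // class=header
console.log('width=' + formatAttributeValue(100));      // width=100
console.log('class=' + formatAttributeValue('a b'));    // class="a b"
console.log('title=' + formatAttributeValue(''));       // title=""
```

The empty string falls through to the quoted path because an unquoted attribute value must not be empty.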

Personally I'm torn between it feeling "dirty" and it being "the most efficient implementation". What's your take @yungsters ?

yungsters · 2014-06-17T18:00:17Z

I think stripping quotes can be something we look into after we nail text escaping first. I'm not too against "dirty" stuff if it's handled by a framework and done correctly. (I would also want to profile the impact of an additional check or pattern match for every attribute.)

A lot of the discussion here has been ensuring that we generate correct HTML (where correct is the specifications listed above plus existing browser behavior). I think as a framework that makes it easy to set attribute values or text content using user input, we have an obligation to also ensure that the generated HTML is safe and secure — not vulnerable to XSS.

syranide · 2014-06-17T20:24:25Z

@yungsters Interesting and I agree with everything you said. It should be a separate PR regardless and I wouldn't mind doing the necessary work for compiling a rough best/worst-case performance/size benchmark.

Also, I fully agree and understand your point about XSS and was honestly expecting more opposition to this PR (for that reason).

syranide · 2014-07-11T08:03:44Z

@zpao @yungsters If you agree with the refactoring/corrections done by this PR, a stopgap solution is that I reintroduce all the current rules and you can merge it as-is. If/when your security team clears the reduced set of rules, then we just reduce the rules in the escapers. Thoughts?

syranide · 2014-09-18T17:19:51Z

@zpao @yungsters As I "discussed" in my previous comment, I have reverted this PR to use the current escaping rules instead. It now only removes the invalid escaping of attribute names and introduces a new quoteAttributeValueForBrowser (also still renames escapeTextForBrowser).

So it seems to me that there should be nothing controversial about this PR. It enables us to easily enable narrower escaping (or omitting quotes for simple values) in the future and improves the code in general.

Attaching the narrower, now reverted, escaping functions (for posterity):

var ESCAPE_LOOKUP = {
  '&': '&amp;',
  '>': '&gt;',
  '<': '&lt;'
};

var ESCAPE_REGEX = /[&><]/g;

function escaper(match) {
  return ESCAPE_LOOKUP[match];
}

/**
 * Escapes text content to prevent scripting attacks.
 *
 * @param {*} text Text value to escape.
 * @return {string} An escaped string.
 */
function escapeTextContentForBrowser(text) {
  return ('' + text).replace(ESCAPE_REGEX, escaper);
}

// Separate module from the snippet above, so the reused names do not collide.
var ESCAPE_LOOKUP = {
  '&': '&amp;',
  '"': '&quot;'
};

var ESCAPE_REGEX = /[&"]/g;

function escaper(match) {
  return ESCAPE_LOOKUP[match];
}

/**
 * Escapes attribute value to prevent scripting attacks.
 *
 * @param {*} text Text value to escape.
 * @return {string} An escaped string.
 */
function quoteAttributeValueForBrowser(text) {
  return '"' + ('' + text).replace(ESCAPE_REGEX, escaper) + '"';
}

zpao · 2014-09-29T20:52:00Z

@yungsters for review (though he's out so I might need to reping him in 2 weeks)

syranide · 2015-02-02T20:47:28Z

ping @yungsters, this PR is currently limited to only "splitting escapeTextForBrowser into escapeTextContentForBrowser and quoteAttributeValueForBrowser" for making the code arguably neater and also fixing some related "mistakes" (like escaping attribute names). Any objections?

yungsters · 2015-02-03T04:18:56Z

src/browser/ui/ReactDOMComponent.js

-    invariant(
+    'Can only set one of `children` or `props.dangerouslySetInnerHTML`.'
+  );
+  invariant(


Can you revert the indentation changes here and below?

yungsters · 2015-02-03T04:21:04Z

Sorry for missing this. The changes look straightforward and reasonable to me. There are some extraneous changes included in the pull request. Can you remove those?

Otherwise, this looks good to me.

syranide · 2015-02-03T10:25:48Z

@yungsters Sorry about the indentation, was caused by the rebase and I missed it, fixed. Which extraneous changes are you referring to, removal of processAttributeNameAndPrefix? I agree that it might make sense to put it in its own PR (but I kind of also think it makes sense to include it as it's affected, w/e :)), but are there any others?

EDIT: Ah, perhaps you're referring to the removal of (incorrectly) escaped attribute names too? It's kind of weird to do escapeTextContentForBrowser(name) + '=' + quoteAttributeValueForBrowser(value) as it's obviously incorrect from the naming of the functions... do you prefer to do it in a separate PR anyway? (It makes sense from a commit/history perspective)

syranide · 2015-02-03T15:40:57Z

@yungsters I removed all "extraneous changes" that I could find. Give me a thumbs up and I'll merge this in and put up a separate PR for those changes/fixes.

yungsters · 2015-02-03T22:13:12Z

src/utils/quoteAttributeValueForBrowser.js

+
+"use strict";
+
+var escapeTextContentForBrowser = require('./escapeTextContentForBrowser');


@zpao, should this just be require('escapeTextContentForBrowser')?

Aaaah shit, ofc it should be.

syranide · 2015-02-03T22:17:30Z

Note to self, need to rebase and update another added escapeTextForBrowser.

syranide · 2015-02-04T12:46:26Z

My bad, I've rebased, fixed the require and license.

yungsters · 2015-02-04T16:00:13Z

Looks good to me.

sophiebits mentioned this pull request Jun 17, 2014

Don't escape slash; it's unnecessary #1461

Merged

syranide mentioned this pull request Jul 18, 2014

Newlines handled incorrectly by innerText in IE8 #1864

Merged

zpao added the GH Review: review-needed label Sep 29, 2014

yungsters reviewed Feb 3, 2015

yungsters added GH Review: accepted and removed GH Review: review-needed labels Feb 3, 2015

yungsters reviewed Feb 3, 2015

yungsters added GH Review: needs-revision and removed GH Review: accepted labels Feb 3, 2015

Split escapeTextForBrowser into escapeTextContentForBrowser and quote…

8ca058a

…AttributeValueForBrowser

syranide added GH Review: review-needed and removed GH Review: needs-revision labels Feb 4, 2015

yungsters removed the GH Review: review-needed label Feb 4, 2015

yungsters added the GH Review: accepted label Feb 4, 2015

syranide added a commit that referenced this pull request Feb 5, 2015

Merge pull request #1599 from syranide/escbrow

04e6d02

Split escapeTextForBrowser into escapeTextContentForBrowser and quoteAttributeValueForBrowser

syranide merged commit 04e6d02 into facebook:master Feb 5, 2015

syranide deleted the escbrow branch February 5, 2015 08:41

syranide mentioned this pull request Feb 5, 2015

Drop processAttributeNameAndPrefix and invalid attribute name escaping #3047

Merged

heshamelmasry77 mentioned this pull request Jul 31, 2022

[Snyk] Security upgrade node-fetch from 1.6.3 to 3.2.10 heshamelmasry77/react#71

Closed