Bug Tracker

#11617 closed feature (fixed)

Opened April 21, 2012 08:53PM UTC

Closed June 21, 2012 07:39PM UTC

Last modified June 26, 2012 06:14PM UTC

Define a $.parseHTML method for creating HTML fragments

Reported by: dmethvin | Owned by: dmethvin
Priority: high | Milestone: 1.8
Component: manipulation | Version: 1.7.1
Keywords: | Cc:
Blocked by: | Blocking:

Currently, we try to sniff out HTML in whatever is passed to $(), leading to problems like #9521 where the developer sends untrusted input to jQuery. Any real fix to #9521 that plugs all the holes is likely to create situations where we reject HTML strings that we previously accepted.

By creating a $.html() method we can let the developer be explicit that they want to create a fragment from HTML and accept any consequences, rather than let $() guess it. Over the next few versions we could tighten down $() to say that any HTML string passed to it must begin and end with angle brackets--no spaces or text on the ends--which might allow us to avoid the regex check.
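The tightened rule proposed above (a string counts as HTML only if it begins and ends with angle brackets, with no spaces or text on the ends) can be sketched without a regex. This is a minimal illustration, not jQuery's actual code; the function name `isStrictHtmlString` and the minimum length of three characters are assumptions made for the example.

```javascript
// Hypothetical helper (not part of jQuery): decides whether a string would
// count as HTML under the proposed "begins and ends with angle brackets" rule.
function isStrictHtmlString(input) {
  return typeof input === "string" &&
    input.length >= 3 &&                       // assumption: shortest accepted form is "<x>"
    input.charAt(0) === "<" &&                 // no leading whitespace or text
    input.charAt(input.length - 1) === ">";    // no trailing whitespace or text
}
```

Under this sketch, `"<div>hi</div>"` is accepted, while `" <div>hi</div>"` and `"#x<img src=x onerror=alert(1)>"` are rejected because of the leading space and the leading selector text. Anything rejected would have to go through the explicit HTML method instead.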

If we had this new method, what should it return? Seems like a jQuery object with the nodes would be the most obvious, but it could return a plain Array of nodes or a documentFragment with the nodes.

I'm also open to bikeshedding on the method name, since it's not an analog to $.fn.html so perhaps $.fragment, $.nodes or similar.

Attachments (0)
Change History (12)

Changed April 21, 2012 09:36PM UTC by mikesherov comment:1

Just chiming in here that this function would obviously not be a getter/setter, because any getter/setter combo that was strict would break round-tripping.

Changed April 21, 2012 09:53PM UTC by jaubourg comment:2

Why not $.parseHTML()? Would be on par with $.parseJSON() and $.parseXML() and be clear about obviously not being a getter.

+1 on the intent. I guess a fragment would be nice but a collection could do.

Changed April 23, 2012 04:34PM UTC by rwaldron comment:3

component: unfiled → manipulation
owner: → dmethvin
priority: undecided → low
status: new → assigned
type: bug → feature

Changed June 08, 2012 01:44AM UTC by dmethvin comment:4

milestone: None → 1.8
summary: Define a $.html method for creating HTML fragments → Define a $.parseHTML method for creating HTML fragments

Changed June 08, 2012 01:46AM UTC by scottgonzalez comment:5

Just for reference, this is currently being spec'd as Document.parse().

Changed June 19, 2012 04:06PM UTC by timmywil comment:6

This needs a little more discussion. I went ahead and implemented fixes for some things, but xss is still an issue. Copied from the meeting notes:

Want some way to control whether scripts run?

$.parseHTML(html, { allowScripts: true }); ?

distinguish allow inline vs. external?

Changed June 20, 2012 03:57PM UTC by timmywil comment:7

priority: low → high

Also see #11290

Changed June 21, 2012 07:39PM UTC by timmywil comment:8

resolution: → fixed
status: assigned → closed

Add parseHTML for explicitly parsing strings into html. Fixes #11617.

Changeset: e2497c682f26b7916d76cb2896c6fe621b376d82

Changed June 23, 2012 06:44PM UTC by muddydixon comment:9

I checked the snippets below against the latest git master (4df3aaeab3f5c1f54d7564fe9973f6bf35664265).

This XSS was not resolved. Why was this bug marked as fixed?

Changed June 25, 2012 11:05PM UTC by comment:10

Hello! Just a quick comment: it would be great if an options object let us define which tags are allowed.

Use case: a rich editor in an iframe/div with contentEditable enabled. The user pastes some unknown HTML (from Word, maybe) and we only want to allow the strong tag and remove or rename the other tags. This would be useful for many WYSIWYG HTML5 editors.

Changed June 25, 2012 11:29PM UTC by rwaldron comment:11

Sounds like a great use case for a plugin

Changed June 26, 2012 06:14PM UTC by dmethvin comment:12

To clarify the purpose of this method, it is beginning the process of separating the two String cases that $() handles: selectors and HTML serialization. There was some discussion in #11974 that implied it may be misunderstood as some sort of "XSS-proof HTML processing." **It is not.**

We want to mitigate the chances that a jQuery dev calls $(selector) thinking it is a CSS selector, but Mr. Bad Guy has managed to get script into selector and therefore executes it. We're doing that by providing $.parseHTML and eventually locking down the HTML recognition of $() to a small subset.

If/when environments provide better ways to sandbox, the ability of $.parseHTML() to make the dev's intentions clear will come in handy. However, no matter the method, jQuery allows devs to parse and execute scripts. If they process complex HTML with script and allow that script to come from untrusted or corruptible sources, it is still possible to make successful XSS attacks--just as it would be if the dev wrote their code in bare DOM methods.
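The separation of the two String cases described above can be sketched as two small wrappers, one for each intent. This is a minimal sketch: the wrapper names `parseExplicitly` and `selectExplicitly` are hypothetical, and `$` is passed in as a parameter only to keep the example self-contained. It relies on the `$.parseHTML(data, context, keepScripts)` signature that shipped in jQuery 1.8 and on `.find()` treating a string argument as a selector.

```javascript
// Hypothetical wrappers (names are illustrative, not jQuery API).

// HTML intent: the string is always parsed as markup, never as a selector.
// keepScripts is false, so <script> elements in the string are not kept.
// Some jQuery versions return null for empty input; normalize to [].
function parseExplicitly($, html, context) {
  var nodes = $.parseHTML(html, context, false);
  return nodes || [];
}

// Selector intent: .find() is given the string as a selector, so it is not
// routed through the HTML sniffing that $(string) performs.
function selectExplicitly($, root, selector) {
  return $(root).find(selector);
}
```

Neither wrapper sanitizes anything. Even with `keepScripts` set to `false`, markup such as an `img` element with an `onerror` attribute can still run script once the nodes are used, so untrusted input remains the caller's responsibility, as the comment above states.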