Bug Tracker

Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#11617 closed feature (fixed)

Define a $.parseHTML method for creating HTML fragments

Reported by: dmethvin
Owned by: dmethvin
Priority: high
Milestone: 1.8
Component: manipulation
Version: 1.7.1
Keywords:
Cc:
Blocked by:
Blocking:


Currently, we try to sniff out HTML in whatever is passed to $(), leading to problems like #9521 where the developer sends untrusted input to jQuery. Any real fix to #9521 that plugs all the holes is likely to create situations where we reject HTML strings that we previously accepted.

By creating a $.html() method we can let the developer be explicit that they want to create a fragment from HTML and accept any consequences, rather than let $() guess it. Over the next few versions we could tighten down $() to say that any HTML string passed to it must begin and end with angle brackets--no spaces or text on the ends--which might allow us to avoid the regex check.
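To make the difference concrete, here is a minimal sketch contrasting a loose "sniff" with the stricter begin-and-end-with-angle-brackets rule described above. Both function names and the loose pattern are hypothetical illustrations, not jQuery's actual code, and the sample strings are made up.

```javascript
// Illustrative only: these helpers are assumptions, not jQuery internals.

// Loose sniff: any "<...>" run anywhere in the string counts as HTML.
function looseLooksLikeHtml(input) {
  return typeof input === "string" && /<[\w\W]+>/.test(input);
}

// Strict rule from the proposal: the string must start with "<" and end
// with ">", with no spaces or text on either end.
function strictLooksLikeHtml(input) {
  return typeof input === "string" &&
    input.length >= 3 &&
    input.charAt(0) === "<" &&
    input.charAt(input.length - 1) === ">";
}

// A string the developer believes is a selector, but which an attacker
// has managed to extend with markup.
var untrusted = "#tab <img src=x onerror=alert(1)>";

var results = {
  looseUntrusted: looseLooksLikeHtml(untrusted),           // true
  strictUntrusted: strictLooksLikeHtml(untrusted),         // false
  strictPlain: strictLooksLikeHtml("<div>ok</div>"),       // true
  strictPadded: strictLooksLikeHtml(" <div>ok</div> ")     // false
};
```

Under the strict rule, the padded string is one of the previously accepted inputs that would now be rejected, which is the compatibility cost mentioned above.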

If we had this new method, what should it return? Seems like a jQuery object with the nodes would be the most obvious, but it could return a plain Array of nodes or a documentFragment with the nodes.

I'm also open to bikeshedding on the method name, since it's not an analog to $.fn.html so perhaps $.fragment, $.nodes or similar.

Change History (12)

comment:1 Changed 11 years ago by mikesherov

Just chiming in here that this function would obviously not be a getter/setter, because any getter/setter combo that was strict would break round-tripping.

comment:2 Changed 11 years ago by jaubourg

Why not $.parseHTML()? Would be on par with $.parseJSON() and $.parseXML() and be clear about obviously not being a getter.

+1 on the intent. I guess a fragment would be nice but a collection could do.

comment:3 Changed 11 years ago by Rick Waldron

Component: unfiled → manipulation
Owner: set to dmethvin
Priority: undecided → low
Status: new → assigned
Type: bug → feature

comment:4 Changed 11 years ago by dmethvin

Milestone: None → 1.8
Summary: Define a $.html method for creating HTML fragments → Define a $.parseHTML method for creating HTML fragments

comment:5 Changed 11 years ago by scottgonzalez

Just for reference, this is currently being spec'd as Document.parse().

comment:6 Changed 11 years ago by Timmy Willison

This needs a little more discussion. I went ahead and implemented fixes for some things, but XSS is still an issue. Copied from the meeting notes:

Want some way to control whether scripts run?

$.parseHTML(html, { allowScripts: true }); ? distinguish allow inline vs. external?
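One way to picture the inline vs. external distinction raised in the notes is the sketch below. Everything in it is hypothetical: the `filterScripts` helper, the plain-object script descriptors, and the `"inline"` / `"external"` values for `allowScripts` are assumptions made for illustration and are not part of jQuery.

```javascript
// Hypothetical sketch: decides which parsed scripts would be kept under a
// given policy. Each descriptor is a plain object with either a `src`
// (external script) or a `text` (inline script) property.
function filterScripts(scripts, options) {
  var allow = options && options.allowScripts;

  return scripts.filter(function (script) {
    var isExternal = typeof script.src === "string" && script.src !== "";

    if (allow === true) {
      return true;            // keep both inline and external scripts
    }
    if (allow === "external") {
      return isExternal;      // keep only scripts loaded via src
    }
    if (allow === "inline") {
      return !isExternal;     // keep only scripts with inline text
    }
    return false;             // default: drop every script
  });
}

var parsed = [{ src: "widget.js" }, { text: "init();" }];
var keptByDefault = filterScripts(parsed);
var keptInline = filterScripts(parsed, { allowScripts: "inline" });
```

Dropping scripts by default means the caller has to opt in explicitly, which matches the ticket's goal of making the developer's intent visible.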

comment:7 Changed 11 years ago by Timmy Willison

Priority: low → high

Also see #11290

comment:8 Changed 11 years ago by Timmy Willison

Resolution: fixed
Status: assigned → closed

Add parseHTML for explicitly parsing strings into html. Fixes #11617.

Changeset: e2497c682f26b7916d76cb2896c6fe621b376d82

comment:9 Changed 11 years ago by muddydixon

I checked the snippets below against the latest git master (4df3aaeab3f5c1f54d7564fe9973f6bf35664265). The XSS was not resolved. Why was this ticket closed as fixed?


comment:10 Changed 11 years ago by christianmusa@…

Hello! Just a quick comment: it would be great if an options object let us define which tags are allowed.

Use case: a rich editor in an iframe/div with contentEditable enabled. The user pastes some unknown HTML (from Word, maybe), and we want to allow only the strong tag and remove or rename the other tags. This would be useful for many HTML5 WYSIWYG editors.

comment:11 Changed 11 years ago by Rick Waldron

christianmusa@… --

Sounds like a great use case for a plugin

comment:12 Changed 11 years ago by dmethvin

To clarify the purpose of this method, it is beginning the process of separating the two String cases that $() handles: selectors and HTML serialization. There was some discussion in #11974 that implied it may be misunderstood as some sort of "XSS-proof HTML processing." It is not.

We want to mitigate the chances that a jQuery dev calls $(selector) thinking it is a CSS selector, but Mr. Bad Guy has managed to get script into selector and therefore executes it. We're doing that by providing $.parseHTML and eventually locking down the HTML recognition of $() to a small subset.

If/when environments provide better ways to sandbox, the ability of $.parseHTML() to make the dev's intentions clear will come in handy. However, no matter the method, jQuery allows devs to parse and execute scripts. If they process complex HTML with script and allow that script to come from untrusted or corruptible sources, it is still possible to make successful XSS attacks--just as it would be if the dev wrote their code in bare DOM methods.
