
Ticket #11617 (closed feature: fixed)

Opened 2 years ago

Last modified 2 years ago

Define a $.parseHTML method for creating HTML fragments

Reported by: dmethvin
Owned by: dmethvin
Priority: high
Milestone: 1.8
Component: manipulation
Version: 1.7.1
Keywords:
Cc:
Blocking:
Blocked by:

Description

Currently, we try to sniff out HTML in whatever is passed to $(), leading to problems like #9521 where the developer sends untrusted input to jQuery. Any real fix to #9521 that plugs all the holes is likely to create situations where we reject HTML strings that we previously accepted.

By creating a $.html() method we can let the developer be explicit that they want to create a fragment from HTML and accept any consequences, rather than let $() guess it. Over the next few versions we could tighten down $() to say that any HTML string passed to it must begin and end with angle brackets--no spaces or text on the ends--which might allow us to avoid the regex check.
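
The tightened rule proposed above can be sketched as a plain predicate. This is a minimal illustration of the idea, not jQuery's actual detection code; the function name isStrictHtmlString is hypothetical.

```javascript
// Hypothetical helper, not part of jQuery: returns true only when the
// string starts with "<" and ends with ">", with no leading or trailing
// spaces or text, as the proposal above describes.
function isStrictHtmlString(value) {
  return typeof value === "string" &&
    value.length >= 3 &&
    value.charAt(0) === "<" &&
    value.charAt(value.length - 1) === ">";
}

console.log(isStrictHtmlString("<div>hello</div>"));   // true
console.log(isStrictHtmlString(" <div>hello</div>"));  // false (leading space)
console.log(isStrictHtmlString("#id <img src=x>"));    // false (leading text)
console.log(isStrictHtmlString("div.item > a"));       // false (a selector)
```

A check like this needs no regular expression, which is the saving the description hints at. Anything it rejects would be treated as a selector rather than as markup.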

If we had this new method, what should it return? Seems like a jQuery object with the nodes would be the most obvious, but it could return a plain Array of nodes or a documentFragment with the nodes.

I'm also open to bikeshedding on the method name, since it's not an analog to $.fn.html so perhaps $.fragment, $.nodes or similar.

Change History

comment:1 Changed 2 years ago by mikesherov

Just chiming in here that this function would obviously not be a getter/setter, because any getter/setter combo that was strict would break round-tripping.

comment:2 Changed 2 years ago by jaubourg

Why not $.parseHTML()? Would be on par with $.parseJSON() and $.parseXML() and be clear about obviously not being a getter.

+1 on the intent. I guess a fragment would be nice but a collection could do.

comment:3 Changed 2 years ago by rwaldron

  • Owner set to dmethvin
  • Priority changed from undecided to low
  • Status changed from new to assigned
  • Component changed from unfiled to manipulation
  • Type changed from bug to feature

comment:4 Changed 2 years ago by dmethvin

  • Summary changed from Define a $.html method for creating HTML fragments to Define a $.parseHTML method for creating HTML fragments
  • Milestone changed from None to 1.8

comment:5 Changed 2 years ago by scott.gonzalez

Just for reference, this is currently being spec'd as Document.parse().

comment:6 Changed 2 years ago by timmywil

This needs a little more discussion. I went ahead and implemented fixes for some things, but XSS is still an issue. Copied from the meeting notes:

Want some way to control whether scripts run?

$.parseHTML(html, { allowScripts: true }); ? distinguish allow inline vs. external?
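
For comparison with the { allowScripts: true } idea in these notes: the jQuery API documentation describes jQuery.parseHTML( data [, context ] [, keepScripts ] ), where a boolean third argument plays that role. The sketch below maps an options object onto that boolean. The wrapper parseWithOptions and its allowScripts option are hypothetical, and $ is passed in as a parameter so the snippet does not assume a global jQuery.

```javascript
// Hypothetical wrapper, not part of jQuery. `$` is the jQuery function,
// passed in explicitly so this sketch does not rely on a global.
function parseWithOptions($, html, options) {
  // Scripts are dropped unless the caller opts in explicitly.
  var keepScripts = Boolean(options && options.allowScripts);
  // A null context leaves the choice of document to jQuery.
  return $.parseHTML(html, null, keepScripts);
}
```

In a page with jQuery 1.8 or later loaded, parseWithOptions(jQuery, markup) would be expected to leave script elements out of the returned nodes, while parseWithOptions(jQuery, markup, { allowScripts: true }) would keep them. The inline vs. external distinction raised in the notes is not covered by a single boolean.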

comment:7 Changed 2 years ago by timmywil

  • Priority changed from low to high

Also see #11290

comment:8 Changed 2 years ago by timmywil

  • Status changed from assigned to closed
  • Resolution set to fixed

Add parseHTML for explicitly parsing strings into html. Fixes #11617.

Changeset: e2497c682f26b7916d76cb2896c6fe621b376d82

comment:9 Changed 2 years ago by muddydixon

I checked the snippets below against the latest git master (4df3aaeab3f5c1f54d7564fe9973f6bf35664265). This XSS is not resolved. Why was this bug marked as fixed?

http://bugs.jquery.com/ticket/9521#comment:28

comment:10 Changed 2 years ago by christianmusa@…

Hello! Just a quick comment: it would be great if we could define which tags are allowed through an options object.

Use case: a rich editor in an iframe/div with contentEditable enabled. The user pastes some unknown HTML (from Word, maybe) and we only want to allow the strong tag and remove or rename the other tags. This would be useful for many WYSIWYG HTML5 editors.

comment:11 Changed 2 years ago by rwaldron

christianmusa@… --

Sounds like a great use case for a plugin

comment:12 Changed 2 years ago by dmethvin

To clarify the purpose of this method, it is beginning the process of separating the two String cases that $() handles: selectors and HTML serialization. There was some discussion in #11974 that implied it may be misunderstood as some sort of "XSS-proof HTML processing." It is not.

We want to mitigate the chances that a jQuery dev calls $(selector) thinking it is a CSS selector, but Mr. Bad Guy has managed to get script into selector and therefore executes it. We're doing that by providing $.parseHTML and eventually locking down the HTML recognition of $() to a small subset.

If/when environments provide better ways to sandbox, the ability of $.parseHTML() to make the dev's intentions clear will come in handy. However, no matter the method, jQuery allows devs to parse and execute scripts. If they process complex HTML with script and allow that script to come from untrusted or corruptible sources, it is still possible to make successful XSS attacks--just as it would be if the dev wrote their code in bare DOM methods.
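
The separation described in this comment could look like the following in calling code. This is only a sketch: the helper names findBySelector and createFromHtml are hypothetical, and $ is passed in as a parameter rather than assumed to be global. It relies on .find() treating its string argument as a selector and on $.parseHTML() returning an array of nodes that $() can wrap.

```javascript
// Hypothetical helpers, not part of jQuery.

// The string is only ever used as a selector: .find() does not create
// elements from its argument.
function findBySelector($, root, selector) {
  return $(root).find(selector);
}

// The string is explicitly treated as markup. As the comment above notes,
// this states intent; it does not sanitize untrusted input.
function createFromHtml($, html) {
  return $($.parseHTML(html));
}
```

With helpers like these, a string that was meant to be a selector never reaches the HTML path, which is the mix-up the comment is trying to prevent.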
