Skip to main content

Bug Tracker

Side navigation

#12642 closed bug (wontfix)

Opened October 03, 2012 02:01PM UTC

Closed October 15, 2012 03:18PM UTC

Last modified October 15, 2012 03:48PM UTC

jq improperly evaluates scripts inserted in dom fragments

Reported by: karger Owned by:
Priority: undecided Milestone: None
Component: manipulation Version: 1.8.2
Keywords: Cc:
Blocked by: Blocking:
Description

When jquery is used to insert scripts in the current document, they are evaluated as expected. However, jquery also evaluates scripts that are inserted into dom fragments, e.g. $('<head></head>').append('my script'). This violates the notion of dom fragments as standing apart from the document. In contrast, setting document.createElement().innerHTML does *not* lead to script evaluation. I believe the primitive operations are more correct in this case.

I have posted demonstration code at http://people.csail.mit.edu/karger/Jquery/script-eval.html

I have found superficially related bug http://bugs.jquery.com/ticket/11324 ; however, that bug seems to be about failing to clean scripts when they are inserted in the dom; not when they are in a dom fragment. Also, unlike the claim in that bug, this is not a "contrived" scenario. I need to parse some passed-in markup *without* executing any scripts I see, and appending that markup to a dom fragment seemed the obvious way.

Attachments (0)
Change History (15)

Changed October 03, 2012 02:11PM UTC by dmethvin comment:1

Hey karger, this sounds like a great use case for the new $.parseHTML() method that was introduced in 1.8. We're a bit behind in the docs for it, but it looks like $.parseHTML(htmlString, context, runScripts) where context is the document to be used, and runScripts is a Boolean indicating whether to evaluate script tags within the html. It returns an array of DOM elements which you can pass to $(). Does that work for you?

Changed October 03, 2012 02:13PM UTC by gibson042 comment:2

component: unfiledmanipulation

This is a subset of #11795, but might merit standalone discussion.

Changed October 03, 2012 02:17PM UTC by anonymous comment:3

Thanks for the quick response! $.parseHTML is exactly what I need (assuming it works in all browsers including IE) and was trying to simulate on my own.

Is there documentation anywhere? What is the "context" argument in this case? If I just want a dom fragment am I supposed to invoke $.parseHTML(string, false) and leave out the context parameter?

Changed October 03, 2012 02:21PM UTC by karger comment:4

Clarification---that "anonymous" reply is from me using wrong browser.

Changed October 03, 2012 03:21PM UTC by karger comment:5

On a quick test, it appears that $.parseHTML is unable to parse <html>, <head>, or <body> tags. This is unfortunate as one of the main reasons I am trying to parse is to properly separate the head from the body of the markup I receive.

Is there a technical barrier to parsing these tags? <head> and <body> are valid nodes in a dom fragment. Even parsing the <html> tag seems technically feasible, as it should return a document (distinct from the "current" document).

Are these the only problematic tags, so that if I manually separate the head and body portions I can count on the rest being parsed correctly?

Changed October 03, 2012 03:52PM UTC by dmethvin comment:6

The typical way to parse HTML strings is to assign them to the .innerHTML property of a div element. Since there are several elements that cannot be valid children of a div, we create wrappers as best we can. But in the case of something like a body or head you would need to start with a html element and even then I seem to recall oldIE won't do it. And if the string is a full HTML file starting with a doctype and html element, I am not sure we can parse it using .innerHTML at all, anywhere.

The only legitimate way to parse an entire document text that I know of is to parse it as XML, dealing with all the strict requirements that entails (e.g., no sloppy HTML5-ish parsing foregiveness). Then you could pull out the parts you need, convert them back to strings, and send them to $.parseHTML perhaps. You can't just import XML nodes into an HTML document, at least not in oldIE.

This issue might be addressable in some reasonable way in jQuery 2.0 using createContextualFragment perhaps, see #10426.

Changed October 03, 2012 04:08PM UTC by dmethvin comment:8

Right, there's the illegitimate way using Regex, now you have two problems. :)

I just verified my recollection that it's not possible to use .innerHTML on an html element in oldIE, this throws an error "Could not set the innerHTML property. Invalid target element for this operation."

http://jsfiddle.net/BnUGF/

Changed October 04, 2012 05:34AM UTC by karger comment:9

Actually, the newer browsers are beginning to offer DomParser.parseHTML() method that can take a whole document. It's already available in ff and on the way in webkit; in the meantime you can parse a whole document by setting innerHTML on a new documentElement as is demonstrated here in a js implementation of domParser.parseHTML in webkit.

Unfortunately IE is behind usual. I had hoped that you could generate a new head element and set innerHTML on that in order to parse a head's contents, but even that is forbidden by IE. Have you considered using e.g. ejohn's pure js html parser for that case, or does that bloat jquery too much? You wouldn't need to use the ejohn's parser all the way down, but just use it to find the children of the head; each these could be parsed the usual way via innerHTML, and the results concatenated.

Changed October 04, 2012 02:18PM UTC by dmethvin comment:10

There is no way we'd include a JS-based HTML parser in jQuery for an edge case like this. If you just need something quick and dirty, the plugin from Ben Alman might do.

Would you be interested in putting together a plugin that incorporated the other methods to see if they can create a viable cross-browser solution? I'm not sure it's ready for jQuery core but it could be useful in its own right.

Changed October 07, 2012 02:48PM UTC by karger comment:11

So I put together a library that combines the three methods (all implemented by other people). The natural approach seemed to be extended the DOMParser object, to try first in native, then set an innerHTML, and finally to use ejohn's parser (in which I fixed a couple bugs).

It wouldn't be hard to make this into a jQuery plugin, but should I? Or is it more natural to work in the DOMParser namespace?

If I understand your comment #10, I could indeed incorporate the "other" methods (native domParser, plus setting innerHTML), but the result would not be cross-browser without a solution for IE. Were you suggesting to not worry about IE?

It would seem to make sense for your $.parseHTML to invoke DOMParser on those platforms where it can...

Changed October 15, 2012 03:18PM UTC by mikesherov comment:12

resolution: → wontfix
status: newclosed

Changed October 15, 2012 03:19PM UTC by timmywil comment:13

We won't be doing this in core.

Changed October 15, 2012 03:36PM UTC by karger comment:14

I recognize not wanting to put a full-blown html parser in core. However, this discussion wandered off the original issue, which is that it appears to be a bug that jquery is evaluating scripts inserted in dom fragments. Is there a reason not to fix that bug, by the limited change of checking whether scripts are going into the (main) dom before evaluating them?

Changed October 15, 2012 03:48PM UTC by gibson042 comment:15

This is veering very close to #11795. I'm inclined to consider the behavior valuable, but even more important is being clear about what we do.