Bug Tracker

Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#12642 closed bug (wontfix)

jq improperly evaluates scripts inserted in dom fragments

Reported by: karger Owned by:
Priority: undecided Milestone: None
Component: manipulation Version: 1.8.2
Keywords: Cc:
Blocked by: Blocking:

Description

When jquery is used to insert scripts in the current document, they are evaluated as expected. However, jquery also evaluates scripts that are inserted into dom fragments, e.g. $('<head></head>').append('my script'). This violates the notion of dom fragments as standing apart from the document. In contrast, setting document.createElement().innerHTML does *not* lead to script evaluation. I believe the primitive operations are more correct in this case.

I have posted demonstration code at http://people.csail.mit.edu/karger/Jquery/script-eval.html

I have found superficially related bug http://bugs.jquery.com/ticket/11324 ; however, that bug seems to be about failing to clean scripts when they are inserted in the dom; not when they are in a dom fragment. Also, unlike the claim in that bug, this is not a "contrived" scenario. I need to parse some passed-in markup *without* executing any scripts I see, and appending that markup to a dom fragment seemed the obvious way.

Change History (15)

comment:1 Changed 5 years ago by dmethvin

Hey karger, this sounds like a great use case for the new $.parseHTML() method that was introduced in 1.8. We're a bit behind in the docs for it, but it looks like $.parseHTML(htmlString, context, runScripts) where context is the document to be used, and runScripts is a Boolean indicating whether to evaluate script tags within the html. It returns an array of DOM elements which you can pass to $(). Does that work for you?

comment:2 Changed 5 years ago by gibson042

Component: unfiledmanipulation

This is a subset of #11795, but might merit standalone discussion.

comment:3 Changed 5 years ago by anonymous

Thanks for the quick response! $.parseHTML is exactly what I need (assuming it works in all browsers including IE) and was trying to simulate on my own.

Is there documentation anywhere? What is the "context" argument in this case? If I just want a dom fragment am I supposed to invoke $.parseHTML(string, false) and leave out the context parameter?

comment:4 in reply to:  3 Changed 5 years ago by karger

Clarification---that "anonymous" reply is from me using wrong browser.

comment:5 Changed 5 years ago by karger

On a quick test, it appears that $.parseHTML is unable to parse <html>, <head>, or <body> tags. This is unfortunate as one of the main reasons I am trying to parse is to properly separate the head from the body of the markup I receive.

Is there a technical barrier to parsing these tags? <head> and <body> are valid nodes in a dom fragment. Even parsing the <html> tag seems technically feasible, as it should return a document (distinct from the "current" document).

Are these the only problematic tags, so that if I manually separate the head and body portions I can count on the rest being parsed correctly?

comment:6 Changed 5 years ago by dmethvin

The typical way to parse HTML strings is to assign them to the .innerHTML property of a div element. Since there are several elements that cannot be valid children of a div, we create wrappers as best we can. But in the case of something like a body or head you would need to start with a html element and even then I seem to recall oldIE won't do it. And if the string is a full HTML file starting with a doctype and html element, I am not sure we can parse it using .innerHTML at all, anywhere.

The only legitimate way to parse an entire document text that I know of is to parse it as XML, dealing with all the strict requirements that entails (e.g., no sloppy HTML5-ish parsing foregiveness). Then you could pull out the parts you need, convert them back to strings, and send them to $.parseHTML perhaps. You can't just import XML nodes into an HTML document, at least not in oldIE.

This issue might be addressable in some reasonable way in jQuery 2.0 using createContextualFragment perhaps, see #10426.

comment:8 Changed 5 years ago by dmethvin

Right, there's the illegitimate way using Regex, now you have two problems. :)

I just verified my recollection that it's not possible to use .innerHTML on an html element in oldIE, this throws an error "Could not set the innerHTML property. Invalid target element for this operation."

http://jsfiddle.net/BnUGF/

comment:9 Changed 5 years ago by karger

Actually, the newer browsers are beginning to offer DomParser.parseHTML() method that can take a whole document. It's already available in ff and on the way in webkit; in the meantime you can parse a whole document by setting innerHTML on a new documentElement as is demonstrated here in a js implementation of domParser.parseHTML in webkit.

Unfortunately IE is behind usual. I had hoped that you could generate a new head element and set innerHTML on that in order to parse a head's contents, but even that is forbidden by IE. Have you considered using e.g. ejohn's pure js html parser for that case, or does that bloat jquery too much? You wouldn't need to use the ejohn's parser all the way down, but just use it to find the children of the head; each these could be parsed the usual way via innerHTML, and the results concatenated.

comment:10 Changed 5 years ago by dmethvin

There is no way we'd include a JS-based HTML parser in jQuery for an edge case like this. If you just need something quick and dirty, the plugin from Ben Alman might do.

Would you be interested in putting together a plugin that incorporated the other methods to see if they can create a viable cross-browser solution? I'm not sure it's ready for jQuery core but it could be useful in its own right.

comment:11 Changed 5 years ago by karger

So I put together a library that combines the three methods (all implemented by other people). The natural approach seemed to be extended the DOMParser object, to try first in native, then set an innerHTML, and finally to use ejohn's parser (in which I fixed a couple bugs).

It wouldn't be hard to make this into a jQuery plugin, but should I? Or is it more natural to work in the DOMParser namespace?

If I understand your comment #10, I could indeed incorporate the "other" methods (native domParser, plus setting innerHTML), but the result would not be cross-browser without a solution for IE. Were you suggesting to not worry about IE?

It would seem to make sense for your $.parseHTML to invoke DOMParser on those platforms where it can...

comment:12 Changed 5 years ago by mikesherov

Resolution: wontfix
Status: newclosed

comment:13 Changed 5 years ago by timmywil

We won't be doing this in core.

comment:14 Changed 5 years ago by karger

I recognize not wanting to put a full-blown html parser in core. However, this discussion wandered off the original issue, which is that it appears to be a bug that jquery is evaluating scripts inserted in dom fragments. Is there a reason not to fix that bug, by the limited change of checking whether scripts are going into the (main) dom before evaluating them?

comment:15 Changed 5 years ago by gibson042

This is veering very close to #11795. I'm inclined to consider the behavior valuable, but even more important is being clear about what we do.

Note: See TracTickets for help on using tickets.