Bug Tracker

Opened 6 years ago

Closed 5 years ago

#14228 closed feature (migrated)

Extension point for intercepting HTML before DOM insertion

Reported by: nicholas@… Owned by:
Priority: low Milestone: 1.12/2.2
Component: manipulation Version: 1.10.2
Keywords: Cc:
Blocked by: Blocking:

Description

At my company, there's a desire to filter HTML strings as they go through methods like html() and append() to prevent XSS. Our code base is quite large and there are thousands of references to these types of methods that accept an HTML string for DOM insertion, so it's not feasible to replace each instance with our own method call.

As far as I can tell, there's no current way to intercept HTML strings before insertion. What I'm suggesting is adding some kind of extension point to jQuery Core that would allow someone to register a function that would receive the HTML before insertion and allow someone to change that HTML that would actually be inserted. This has the following potential uses:

  • custom XSS filters
  • auto linking filters (automatically turn an email address into a link, for example)
  • stripping out personal data

Just to be clear, I'm not suggesting including anything other than an extension point that would allow such filters to be written. Without this, I need to basically overwrite html(), append(), prepend(), and all the others to manfully check any string arguments.

Change History (10)

comment:1 Changed 6 years ago by dmethvin

Component: unfiledmanipulation
Status: newopen

A lot of us on the team consider the fact that we allow scripts to run by injecting them to be a bug that we can't remove for back-compat reasons, so we're sympathetic. I'll leave the ticket open for discussion to see if there's any way we can fix this.

comment:2 Changed 6 years ago by Rick Waldron

What is preventing something like this:

// Beginning of IIFE or AMD or whatever...

var html = jQuery.fn.html;
var append = jQuery.fn.append;

jQuery.fn.html = function() {
   // do the filtering
   return html.call(this, filtered/safe stuff);
};

jQuery.fn.append = function() {
   // do the filtering
   return append.call(this, filtered/safe stuff);
};
  

// End of IIFE or AMD or whatever...

comment:3 Changed 6 years ago by nicholas@…

Nothing is preventing it, however, it is inefficient and risky. There are probably many paths to inserting HTML through jQuery, and without tracking them all down, the risk remains.

This is essentially what I'm doing right now, but I'm afraid I might have missed some corner case (and of course, methods change between versions, I don't want to have to keep updating this functionality). It would be much more efficient and safe to have one place to insert this functionality.

comment:4 Changed 6 years ago by nicholas@…

I just wrote up a blog post on everything we had to do to secure jQuery: http://tech.blog.box.com/2013/08/securing-jquery-against-unintended-xss/

I would love there to be a formal HTML filter capability, and it seems like the injection points for that functionality are fairly straightforward (see the blog post).

comment:5 Changed 5 years ago by m_gol

I agree this would be nice to have.

comment:6 Changed 5 years ago by gibson042

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through, but could definitely see exposing the arguments of our currently-existing .replace( rxhtmlTag, "<$1></$2>" ) as properties on jQuery. "Simple" extensions like no-innerHTML contents could still use a string replacement, but more complex ones like what this ticket is requesting could get wholesale filtering with something like the following (apologies for atrocious names):

var origPreMatch = jQuery.htmlPreMatch,
    origPreReplace = jQuery.htmlPreReplace;

jQuery.htmlPreMatch = /[\w\W]+/;
jQuery.htmlPreReplace = function( html ) {
    return extraProcess( html ).replace( origPreMatch, origPreReplace );
};

comment:7 Changed 5 years ago by dmethvin

Milestone: None1.12/2.2
Priority: undecidedlow

comment:8 in reply to:  6 ; Changed 5 years ago by m_gol

Replying to gibson042:

Related: #14370; #14464

I don't know that we'd want to add another function for all input to pass through

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

comment:9 in reply to:  8 Changed 5 years ago by gibson042

Replying to m_gol:

If we passed it through $.escapeHTML only if it's a function then we're saved from one additional function invocation on each manip operation and the only overhead is one additional if. Is that a lot?

Well, it's still guaranteed to do at least two passes over the input (once for custom escaping and once to fix up self-closing tags) when someone sets up preprocessing, even if they're only trying to fix our mangling of script/textarea content (#14370) or attribute values (#14464). Come to think of it, that brings up the related question of whether preprocessing should precede, follow, or replace the currently-existing .replace( rxhtmlTag, "<$1></$2>" ).

I feel regex might not be enough in some cases and we really should make it possible to secure our manip methods.

str = str.replace( /[\w\W]+/, fn ) is equivalent to str = fn( str ) for any string-returning single-parameter fn, and therefore guaranteed to support the same use cases. This is purely a question of API surface area.

comment:10 Changed 5 years ago by m_gol

Resolution: migrated
Status: openclosed
Note: See TracTickets for help on using tickets.