Bug Tracker

Opened 13 years ago

Closed 13 years ago

Last modified 10 years ago

#4037 closed bug (fixed)

800x faster HTML load on large HTML chunks

Reported by: dimi
Owned by:
Priority: major
Milestone: 1.3.2
Component: core
Version: 1.3.1
Keywords: performance
Cc: [email protected]
Blocked by:
Blocking:


I was using .load() to load a large chunk of HTML (660K) into my document when I noticed we were spending almost 3000ms in the .clean() function. Most of that time was spent in .trim(), but beyond that we are doing a _lot_ of work there that's not needed.

I have prepared 2 patches that make a huge difference in performance. Both remove the bottleneck in the part of the code that says "Convert html string into DOM nodes".

Patch number one just gets rid of the unneeded .trim(), and gets the time spent in that part of the code down from about 2700ms to about 440ms; that's more than a 5x speed improvement.

Patch number two gets rid of the large memory allocations and the remaining unnecessary work, and it's just a little bit more complicated. This one brings the same area of code down to only 4ms! That's an almost 800x speed improvement, and with larger chunks it will be even bigger.

The size impact is tiny as well. Patch one adds just 6 bytes to the minified version, or only 4 bytes to the gzipped version. Patch two adds 17 bytes on top of patch one to the minified version, or only 12 bytes to the gzipped version.
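For readers without the patches at hand, the core idea behind avoiding .trim() can be sketched like this (an illustrative example, not jQuery's actual code; the function names are made up):

```javascript
// Illustrative sketch: deciding whether a large HTML string starts
// with a tag, without paying for a full trim.

// Slow: trims the whole (possibly 660K) string - a regex pass plus a
// full string copy - just to look at the first character.
function startsWithTagSlow(html) {
  return html.replace(/^\s+|\s+$/g, "").charAt(0) === "<";
}

// Fast: an anchored regex only scans the leading whitespace, so the
// cost does not grow with the length of the string.
function startsWithTagFast(html) {
  return /^\s*</.test(html);
}
```

The second version does strictly less work, which is the same spirit as patch one: don't allocate a trimmed copy of a huge string when you only need to inspect its prefix.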

Attachments (2)

clean-1.diff (364 bytes) - added by dimi 13 years ago.
Patch one
clean-2.diff (919 bytes) - added by dimi 13 years ago.
Patch two


Change History (6)

Changed 13 years ago by dimi

Attachment: clean-1.diff added

Patch one

Changed 13 years ago by dimi

Attachment: clean-2.diff added

Patch two

comment:1 Changed 13 years ago by dimi

My blog entry about it: http://zipalong.com/blog/?p=300

comment:2 Changed 13 years ago by john

Resolution: fixed
Status: new → closed

Fixed in SVN rev [6190].

comment:3 Changed 13 years ago by pbcomm

It would be better to change trim to:

var start = -1, end = str.length;
while (str.charCodeAt(--end) < 33);
while (++start < end && str.charCodeAt(start) < 33);
return str.slice(start, end + 1);

I don't remember exactly where I got this; I think there is a ticket in here about it. Yes, it's a little more code, but it is so much faster. I love regex, but in JS it is just too slow.

comment:4 Changed 13 years ago by dimi

I agree that we can have a faster .trim(), but I disagree that it would be better to use trim in this context.

Even the fastest trim() I've seen will be much slower than the code above. Please read my blog entry on this subject; I really don't think you can speed it up all that much, given that the new code takes just a few ms (which, BTW, is faster than the measurements published on the site where you got that code snippet :)).
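For completeness, here is a self-contained sketch of the two trim strategies under discussion (the names are illustrative; neither is jQuery's actual implementation):

```javascript
// Regex-based trim, similar in spirit to what jQuery.trim did at
// the time of this ticket.
function regexTrim(str) {
  return str.replace(/^\s+|\s+$/g, "");
}

// charCode-based trim from comment:3; it treats every character
// with a code below 33 (space, tab, newline, ...) as whitespace.
function charCodeTrim(str) {
  var start = -1, end = str.length;
  while (str.charCodeAt(--end) < 33);
  while (++start < end && str.charCodeAt(start) < 33);
  return str.slice(start, end + 1);
}
```

Both produce the same result on ordinary input; the point of the ticket, though, is that the biggest win comes from not trimming the whole string at all, which makes the relative speed of the two trims much less important here.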
