Skip to main content

Bug Tracker

Side navigation

#6863 closed enhancement (fixed)

Opened August 03, 2010 02:15PM UTC

Closed September 28, 2011 08:47PM UTC

Last modified March 08, 2012 09:54PM UTC

faster getText

Reported by: steida Owned by: timmywil
Priority: blocker Milestone: 1.7
Component: selector Version: 1.4.2
Keywords: speed,getText,1.7-discuss Cc:
Blocked by: Blocking:
Description

var getText = function (elem) {

var text = [];

for (var node = elem.firstChild; node; node = node.nextSibling) {

Get the text from text nodes and CDATA nodes

if (node.nodeType == 3 || node.nodeType == 4)

text.push(node.nodeValue);

Traverse everything else, except comment nodes

else if (elem.nodeType !== 8)

text.push(getText(node));

}

return text.join('');

};

Attachments (0)
Change History (25)

Changed August 04, 2010 12:53AM UTC by dmethvin comment:1

component: unfiledselector

This wouldn't make much of a difference unless there were a lot of strings being concatenated, which is probably not the most common case. However, it seems like pretty low hanging fruit and I know it makes a big difference in IE6/7.

Changed August 05, 2010 04:20PM UTC by steida comment:2

Also DOM traversing by nextSibling is almost twice faster then index based.

Changed August 07, 2010 04:31PM UTC by WouterTinus comment:3

This new implementation suffers from the same inconsistency as the current version, reported in ticket:6827. A corrected version would be

var getText = function (elem) {
	var text = []; 
	for (var node = elem.firstChild; node; node = node.nextSibling) {

		// Get the text from text nodes and CDATA nodes
		if (node.nodeType == 3 || node.nodeType == 4)
			text.push(node.nodeValue);

		// Special case for script nodes
		else if (node.tagName === 'SCRIPT')
			text.push(node.text);

		// Traverse everything else, except comment nodes 
		else if (elem.nodeType !== 8)
			text.push(getText(node));

	} 
	return text.join();
};

Changed October 27, 2010 05:49PM UTC by SlexAxton comment:4

keywords: → speed getText
milestone: 1.4.31.5
priority: → low
status: newopen

Changed December 30, 2010 04:26AM UTC by paul.irish comment:5

Should definitely be using elem.textContent if supported.

Changed April 16, 2011 11:25PM UTC by john comment:6

Replying to [comment:5 paul.irish]:

Should definitely be using elem.textContent if supported.

We had problems getting a consistent value back from the built-in string methods. The only way to get consistent results across all browsers was to do the traversal.

Naturally, a patch to the contrary would be welcome.

Changed April 16, 2011 11:25PM UTC by john comment:7

milestone: → 1.next

This is a relatively simple change (using arrays instead of concatenation). Should be low-hanging fruit for someone - possibly in 1.7.

Changed April 17, 2011 08:52PM UTC by wookiehangover comment:8

owner: → wookiehangover
status: openassigned

I'd like to take a crack at this!

Changed April 18, 2011 01:20AM UTC by wookiehangover comment:9

owner: wookiehangover
status: assignedopen

didn't realize that this was related to sizzle... bailing out

Changed April 27, 2011 08:34PM UTC by barberboy comment:10

I created a pull request that fixes this issue at http://github.com/jquery/sizzle/pull/59.

As noted in the pull request, this change may have a negative impact on performance in Chrome, Firefox 4 and IE8+. It will have a minor performance benefit in IE6 and IE7, Safari (as of version 5.0.5) and Opera (as of 11.10).

http://jsperf.com/ultimate-string-concatenation-tests.

Changed April 29, 2011 04:57PM UTC by barberboy comment:11

https://github.com/jquery/sizzle/pull/59

I updated the pull request based on feedback from dmethvin and created a jsPerf that measures the performance improvement: http://jsperf.com/sizzle-gettext-bugfix-6863.

Changed May 22, 2011 07:27PM UTC by john comment:12

keywords: speed getTextspeed,getText,1.7-discuss

Nominating ticket for 1.7 discussion.

Changed May 22, 2011 10:06PM UTC by rwaldron comment:13

_comment0: +1, Changes actually go to Sizzle I believe1306102007682618
description: var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };

+1, Changes actually go to Sizzle I believe - if it can be reasonably achieved

Changed May 23, 2011 12:08AM UTC by jaubourg comment:14

+1, If performance gain is proven... and only if it proven.

Changed May 23, 2011 02:04AM UTC by ajpiano comment:15

description: var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };

+1, Yes, but also +1 on jaubourg's sentiment above !

Changed May 23, 2011 03:31AM UTC by timmywil comment:16

description: var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };

+0, skeptical on the performance gain.

Changed May 23, 2011 09:51PM UTC by dmethvin comment:17

description: var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };

+1, Worth a look with #3144 as well.

Changed May 25, 2011 06:39PM UTC by dmethvin comment:18

See also this forum post with a link to perf tests there and also linked above:

http://forum.jquery.com/topic/sizzle-gettext-open-pull-request-for-ticket-6863#14737000002377958

Would still like to see #3144 tackled at the same time if possible to reduce some of the quirks of IE6/7/8 text behavior using the code suggested by ioquatix and linked in the fiddle at the bottom. It might make the function even faster since it eliminates recursion for all but XML documents.

Changed June 03, 2011 01:36PM UTC by john comment:19

description: var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };

+1, Well, certainly seems to be running faster. If it's passing, let's get it in.

Changed June 03, 2011 02:55PM UTC by scottgonzalez comment:20

+1, perf test shows improvements across all browsers

Changed June 06, 2011 03:52PM UTC by jzaefferer comment:21

+1

Changed July 12, 2011 02:54PM UTC by dmethvin comment:22

description: var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ };var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ };
milestone: 1.next1.7
priority: lowblocker

Changed July 25, 2011 04:19PM UTC by dmethvin comment:23

owner: → dmethvin
status: openassigned

Changed September 22, 2011 02:07PM UTC by dmethvin comment:24

owner: dmethvintimmywil

Thanks for taking this timmywil!

TL;DR: There is a pull already for this in Sizzle's tracker:

http://bugs.jquery.com/ticket/6863#comment:10

Ticket #3144 has an alternate implementation using .textContent/.innerText when possible that may be even faster, and it addresses issues with IE6/7/8 losing spaces.

Changed September 28, 2011 08:47PM UTC by timmywil comment:25

resolution: → fixed
status: assignedclosed

Update Sizzle. Fixes #3144, #6863.

Changeset: 22fcc7744daded0dc0783d85df3bd88c6dbc4544