Side navigation
#6863 closed enhancement (fixed)
Opened August 03, 2010 02:15PM UTC
Closed September 28, 2011 08:47PM UTC
Last modified March 08, 2012 09:54PM UTC
faster getText
Reported by: | steida | Owned by: | timmywil |
---|---|---|---|
Priority: | blocker | Milestone: | 1.7 |
Component: | selector | Version: | 1.4.2 |
Keywords: | speed,getText,1.7-discuss | Cc: | |
Blocked by: | Blocking: |
Description
var getText = function (elem) {
var text = [];
for (var node = elem.firstChild; node; node = node.nextSibling) {
Get the text from text nodes and CDATA nodes
if (node.nodeType == 3 || node.nodeType == 4)
text.push(node.nodeValue);
Traverse everything else, except comment nodes
else if (elem.nodeType !== 8)
text.push(getText(node));
}
return text.join('');
};
Attachments (0)
Change History (25)
Changed August 04, 2010 12:53AM UTC by comment:1
component: | unfiled → selector |
---|
Changed August 05, 2010 04:20PM UTC by comment:2
Also DOM traversing by nextSibling is almost twice faster then index based.
Changed August 07, 2010 04:31PM UTC by comment:3
This new implementation suffers from the same inconsistency as the current version, reported in ticket:6827. A corrected version would be
var getText = function (elem) { var text = []; for (var node = elem.firstChild; node; node = node.nextSibling) { // Get the text from text nodes and CDATA nodes if (node.nodeType == 3 || node.nodeType == 4) text.push(node.nodeValue); // Special case for script nodes else if (node.tagName === 'SCRIPT') text.push(node.text); // Traverse everything else, except comment nodes else if (elem.nodeType !== 8) text.push(getText(node)); } return text.join(); };
Changed October 27, 2010 05:49PM UTC by comment:4
keywords: | → speed getText |
---|---|
milestone: | 1.4.3 → 1.5 |
priority: | → low |
status: | new → open |
Changed December 30, 2010 04:26AM UTC by comment:5
Should definitely be using elem.textContent
if supported.
Changed April 16, 2011 11:25PM UTC by comment:6
Replying to [comment:5 paul.irish]:
Should definitely be using elem.textContent
if supported.
We had problems getting a consistent value back from the built-in string methods. The only way to get consistent results across all browsers was to do the traversal.
Naturally, a patch to the contrary would be welcome.
Changed April 16, 2011 11:25PM UTC by comment:7
milestone: | → 1.next |
---|
This is a relatively simple change (using arrays instead of concatenation). Should be low-hanging fruit for someone - possibly in 1.7.
Changed April 17, 2011 08:52PM UTC by comment:8
owner: | → wookiehangover |
---|---|
status: | open → assigned |
I'd like to take a crack at this!
Changed April 18, 2011 01:20AM UTC by comment:9
owner: | wookiehangover |
---|---|
status: | assigned → open |
didn't realize that this was related to sizzle... bailing out
Changed April 27, 2011 08:34PM UTC by comment:10
I created a pull request that fixes this issue at http://github.com/jquery/sizzle/pull/59.
As noted in the pull request, this change may have a negative impact on performance in Chrome, Firefox 4 and IE8+. It will have a minor performance benefit in IE6 and IE7, Safari (as of version 5.0.5) and Opera (as of 11.10).
Changed April 29, 2011 04:57PM UTC by comment:11
https://github.com/jquery/sizzle/pull/59
I updated the pull request based on feedback from dmethvin and created a jsPerf that measures the performance improvement: http://jsperf.com/sizzle-gettext-bugfix-6863.
Changed May 22, 2011 07:27PM UTC by comment:12
keywords: | speed getText → speed,getText,1.7-discuss |
---|
Nominating ticket for 1.7 discussion.
Changed May 22, 2011 10:06PM UTC by comment:13
_comment0: | +1, Changes actually go to Sizzle I believe → 1306102007682618 |
---|---|
description: | var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; → var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; |
+1, Changes actually go to Sizzle I believe - if it can be reasonably achieved
Changed May 23, 2011 12:08AM UTC by comment:14
+1, If performance gain is proven... and only if it proven.
Changed May 23, 2011 02:04AM UTC by comment:15
description: | var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; → var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; |
---|
+1, Yes, but also +1 on jaubourg's sentiment above !
Changed May 23, 2011 03:31AM UTC by comment:16
description: | var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; → var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; |
---|
+0, skeptical on the performance gain.
Changed May 23, 2011 09:51PM UTC by comment:17
description: | var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; → var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; |
---|
+1, Worth a look with #3144 as well.
Changed May 25, 2011 06:39PM UTC by comment:18
See also this forum post with a link to perf tests there and also linked above:
http://forum.jquery.com/topic/sizzle-gettext-open-pull-request-for-ticket-6863#14737000002377958
Would still like to see #3144 tackled at the same time if possible to reduce some of the quirks of IE6/7/8 text behavior using the code suggested by ioquatix and linked in the fiddle at the bottom. It might make the function even faster since it eliminates recursion for all but XML documents.
Changed June 03, 2011 01:36PM UTC by comment:19
description: | var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; → var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; |
---|
+1, Well, certainly seems to be running faster. If it's passing, let's get it in.
Changed June 03, 2011 02:55PM UTC by comment:20
+1, perf test shows improvements across all browsers
Changed June 06, 2011 03:52PM UTC by comment:21
+1
Changed July 12, 2011 02:54PM UTC by comment:22
description: | var getText = function (elem) {\ var text = [];\ for (var node = elem.firstChild; node; node = node.nextSibling) {\ // Get the text from text nodes and CDATA nodes\ if (node.nodeType == 3 || node.nodeType == 4)\ text.push(node.nodeValue);\ // Traverse everything else, except comment nodes\ else if (elem.nodeType !== 8)\ text.push(getText(node));\ }\ return text.join('');\ }; → var getText = function (elem) { \ var text = []; \ for (var node = elem.firstChild; node; node = node.nextSibling) { \ // Get the text from text nodes and CDATA nodes \ if (node.nodeType == 3 || node.nodeType == 4) \ text.push(node.nodeValue); \ // Traverse everything else, except comment nodes \ else if (elem.nodeType !== 8) \ text.push(getText(node)); \ } \ return text.join(''); \ }; |
---|---|
milestone: | 1.next → 1.7 |
priority: | low → blocker |
Changed July 25, 2011 04:19PM UTC by comment:23
owner: | → dmethvin |
---|---|
status: | open → assigned |
Changed September 22, 2011 02:07PM UTC by comment:24
owner: | dmethvin → timmywil |
---|
Thanks for taking this timmywil!
TL;DR: There is a pull already for this in Sizzle's tracker:
http://bugs.jquery.com/ticket/6863#comment:10
Ticket #3144 has an alternate implementation using .textContent/.innerText when possible that may be even faster, and it addresses issues with IE6/7/8 losing spaces.
This wouldn't make much of a difference unless there were a lot of strings being concatenated, which is probably not the most common case. However, it seems like pretty low hanging fruit and I know it makes a big difference in IE6/7.