Potential fnDraw optimization for large tables?

timtucker Posts: 48 Questions: 0 Answers: 0
edited January 2012 in General
Just out of curiosity, I started playing around with one of the sections in fnDraw where the rows are removed from the table
[code]
/* When doing infinite scrolling, only remove child rows when sorting, filtering or start
* up. When not infinite scroll, always do it.
*/
if ( !oSettings.oScroll.bInfinite || !oSettings._bInitComplete || oSettings.bSorted || oSettings.bFiltered )
{
    nTrs = oSettings.nTBody.childNodes;
    for ( i=nTrs.length-1 ; i>=0 ; i-- )
    {
        nTrs[i].parentNode.removeChild( nTrs[i] );
    }
}
[/code]

I was curious what the performance impact would be to use a simpler jQuery call to detach the nodes, i.e.:
[code]
/* When doing infinite scrolling, only remove child rows when sorting, filtering or start
* up. When not infinite scroll, always do it.
*/
if ( !oSettings.oScroll.bInfinite || !oSettings._bInitComplete || oSettings.bSorted || oSettings.bFiltered )
{
    $(oSettings.nTBody).children().detach();
}
[/code]

Using detach() vs. html('') means that all the events / data / etc. are still preserved on the nodes -- from the limited testing I've done so far in my own app, it looks to have functionally the same effect.
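For anyone comparing the two, here's a minimal sketch of the behavioural difference (the markup and handler are hypothetical examples, not DataTables code, and it assumes a <table> with a <tbody> already exists on the page):
[code]
// detach() keeps jQuery data and event handlers on the removed nodes;
// remove() (and html('')) clean them up.
var row = $('<tr><td>test</td></tr>').appendTo('tbody');
row.data('example', 123).bind('click', function () { console.log('clicked'); });

row.detach();                        // removed from the DOM, data + handlers kept
$('tbody').append(row);
console.log( row.data('example') );  // 123 -- still there after detach()

row.remove();                        // removed, jQuery data + handlers cleaned up
$('tbody').append(row);
console.log( row.data('example') );  // undefined -- cleaned by remove()
[/code]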

The real interesting point came when I ran some tests with jsPerf: http://jsperf.com/remove-detach/7

Re-running the test in Firefox, IE, and Chrome with different values for the number of rows in my sample table, I noticed an interesting trend:

10 rows -> 50 rows -> 100 rows -> 500 rows -> 1000 rows -> 5000 rows -> 10000 rows

Remove vs. Detach (ops/sec)

IE 8
[quote]
10 rows
400,000 vs. 27,000

50 rows
380,000 vs. 26,000

100 rows
350,000 vs. 25,000

500 rows
18 vs. 14,000

1000 rows
5 vs. 2,000

5000 rows
0.25 vs 5

10,000 rows
0 vs 3
[/quote]

Firefox
[quote]
10-5,000 rows
gradual decrease from 10,000,000 -> 6,000,000 vs. gradual decrease from 300,000 -> 215,000

10,000 rows
18 vs. 50,000
[/quote]

Chrome
[quote]
Very similar results to Firefox: a gradual decrease for jQuery detach, but performance drops dramatically at ~5,000 rows for remove
[/quote]

Replies

  • timtucker Posts: 48 Questions: 0 Answers: 0
    Note that this is when running on an i5 with 8GB of RAM -- I'd imagine that the cut-off in performance might be reached at a much lower number of rows on a slower system.
  • allan Posts: 63,205 Questions: 1 Answers: 10,415 Site admin
    Very interesting!

    Could you possibly explain how the test works (might be my limited understanding of jsperf) - taking the 1000 row case as an example, does this mean that the 'removeIt' function could run 5 times in a second - removing 1000 rows? Does addRows() also get counted in that, since obviously that will need to rerun each time to add more data?

    I'm not surprised that removeChild is much faster than jQuery for small tables, but I am for very large tables! Looking at their method of doing remove:

    [code]
    // keepData is for internal use only--do not document
    remove: function( selector, keepData ) {
        for ( var i = 0, elem; (elem = this[i]) != null; i++ ) {
            if ( !selector || jQuery.filter( selector, [ elem ] ).length ) {
                if ( !keepData && elem.nodeType === 1 ) {
                    jQuery.cleanData( elem.getElementsByTagName("*") );
                    jQuery.cleanData( [ elem ] );
                }

                if ( elem.parentNode ) {
                    elem.parentNode.removeChild( elem );
                }
            }
        }

        return this;
    },
    [/code]

    It is very, very similar. The only difference is in the way they address the elements - rather than checking element[i] twice they've effectively cached it - I wonder if that can make a significant difference with large tables... http://jsperf.com/remove-detach/9 - apparently not. jQuery detach is still flat out the winner.

    Very curious, since it has a lot more overhead - there must be something going on in the background that makes it much, much faster - node caching or something. Worth some more investigation I think.

    Allan
  • allan Posts: 63,205 Questions: 1 Answers: 10,415 Site admin
    Hmm - I've just updated the test case to introduce a method that caches the elements the way jQuery does (its "sibling" function), and that shows a massive speed boost over all other methods: http://jsperf.com/remove-detach/9 .
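    Roughly, the caching idea is to hold a direct reference to each node and walk via nextSibling, rather than re-indexing the live childNodes list on every iteration. A sketch of the general technique (illustrative only - not the exact code on the jsPerf page, and the function name is made up):
    [code]
    // Remove every row by walking sibling pointers, caching the next
    // reference before each removal.
    function cachedSiblingRemove( nTBody ) {
        var node = nTBody.firstChild, next;
        while ( node ) {
            next = node.nextSibling;      // grab the next reference before removing
            nTBody.removeChild( node );
            node = next;
        }
    }
    [/code]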

    I've got to admit that I'm having a hard time believing my eyes though - these numbers are so massively different!

    For 5000 rows the "jQueryCacheRemove" function I added can do 5 million+ operations per second, while the "removeIt" function can only do 100!

    It is also significantly faster than all the others for just 10 rows. It would be amazing if this stands up, but I'd like to understand how the test is actually being run a bit more first :-)

    Allan
  • timtucker Posts: 48 Questions: 0 Answers: 0
    I've only been using jsPerf for a little while, but from everything I've read in the docs, nothing in the prep phase counts towards the timing (only the code executed in each of the defined tests).

    1,000 ops/second for a 5,000 row table would just mean that you could remove all the rows in the table 1,000 times per second.
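    To put a concrete shape on that (a simplified sketch of what one timed operation covers - the jsPerf harness itself and the code that repopulates the table between runs are left out here):
    [code]
    // One timed "operation" is one call that empties the whole table body, e.g.:
    function removeIt( nTBody ) {
        var nTrs = nTBody.childNodes;
        for ( var i = nTrs.length - 1 ; i >= 0 ; i-- ) {
            nTBody.removeChild( nTrs[i] );
        }
    }
    // So 1,000 ops/sec on a 5,000 row table means all 5,000 rows get
    // removed 1,000 times per second.
    [/code]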

    Taking things a little further, I came across another technique here that's even simpler, but seems to outperform even the jQuery approach (they mention using node.firstChild, but switching to lastChild and caching the element boosts the performance even more):
    http://stackoverflow.com/questions/683366/remove-all-the-children-dom-elements-in-div

    [code]
    var nChild;
    while ( (nChild = oSettings.nTBody.lastChild) ) {
        oSettings.nTBody.removeChild( nChild );
    }
    [/code]

    See the results here:
    http://jsperf.com/remove-detach/11

    It looks like at least some of the difference is due to the overhead for array access and the call to get the length of childNodes: http://jsperf.com/array-vs-last-child/2

    The Chrome results are particularly amazing -- for a 10,000 element table, nTBody.lastChild is about 6000x faster than nTBody.childNodes[9999].

    Potential guesses as to why the performance could be so different (I could be way off) - a quick demonstration of the live-collection behaviour is sketched below:
    - The childNodes collection is being rebuilt on the fly each time it gets accessed
    - The "array" isn't really an array behind the scenes (i.e. "childNodes[N]" just takes N steps through a linked list)

    It looks like the same approach could be taken in _fnDrawHead as well:
    [code]
    for ( k=0, kLen=aoLocal[i].nTr.childNodes.length ; k<kLen ; k++ )
    [/code]
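    Assuming that loop is just emptying the header row before it gets rebuilt, the lastChild version would look something like this (a sketch, not the actual DataTables source):
    [code]
    /* Sketch only: assumes the loop's purpose is to empty aoLocal[i].nTr */
    var nHeadChild;
    while ( (nHeadChild = aoLocal[i].nTr.lastChild) ) {
        aoLocal[i].nTr.removeChild( nHeadChild );
    }
    [/code]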
  • timtucker Posts: 48 Questions: 0 Answers: 0
    Here's another bit that I noticed in _fnDraw -- there can only be 1 open row associated with a given table row, but the loop looking for open rows continues even after the row has been found.

    Presumably this could slow things down a bit when doing a redraw on a large table with a lot of open rows.
    [code]
    /* If there is an open row - and it is attached to this parent - attach it on redraw */
    if ( iOpenRows !== 0 )
    {
        for ( var k=0 ; k<iOpenRows ; k++ )
    [/code]
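    Since the snippet above was cut off, the body below is only an approximation of the 1.9 internals (oSettings.aoOpenRows, nRow and anRows are assumed names) - the only change being suggested is the added break:
    [code]
    /* If there is an open row - and it is attached to this parent - attach it on redraw */
    if ( iOpenRows !== 0 )
    {
        for ( var k=0 ; k<iOpenRows ; k++ )
        {
            if ( nRow == oSettings.aoOpenRows[k].nParent )
            {
                anRows.push( oSettings.aoOpenRows[k].nTr );
                break; /* at most one open row per parent, so stop searching */
            }
        }
    }
    [/code]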
  • allan Posts: 63,205 Questions: 1 Answers: 10,415 Site admin
    > Remove

    The results of your two new methods are absolutely phenomenal. Every time I've profiled DataTables performance in the past, this is one part that has always needed work, but I never thought that walking through the linked list like that would be so much faster. Thinking about it now, it makes perfect sense!

    I've packaged DataTables 1.9.0 beta 2 for release now, but I'll put this in for the next beta (it will be available as the nightly and on Github shortly), as I think this is an absolutely fantastic performance gain. Thanks so much for looking into this!

    > Open rows

    Yes, you are quite right: there is a performance gain to be had there, but I think a fairly small one compared to the other change, since the number of open rows is typically small. Still, all performance gains are welcome in a function like fnDraw... :-)

    Regards,
    Allan
This discussion has been closed.