Speeding up large tables without server-side processing
Thefinaleofseem
Posts: 11 · Questions: 0 · Answers: 0
I've been involved in a project that uses DataTables 1.9.4. We've meshed it together with AngularJS and have been happy with the results. However, the datasets being loaded into it have continued to grow to the point where it's getting a bit ugly. We're looking at close to 10k records in 10-15 columns in some cases, maybe even more.
As you can imagine, that has bogged down the loading of tables pretty badly, sometimes requiring 2-3 clicks through the "slow script" warning before a table loads. I've made a few tweaks, such as enabling deferred rendering, disabling bSortClasses, and going through the whole thing with the Firefox JS profiler to fix some slow functions in our formatting code. That has certainly helped, but larger tables still bring up the slow script warning. The profiler shows the vast majority of the time being spent in jQuery, but it's difficult to tell exactly why or how.
Server-side processing might be viable down the road, but it would take far too much work and gut much of what has been done until now, so it's not going to happen for some time. I played with setTimeout() and adding data to the table a bit at a time after loading an initial chunk, i.e., if the table will have 500+ rows, load the first 500, render the table with them, and then use setTimeout to add the remainder in 100-row chunks to avoid blocking the browser. This sort of worked, but adding rows after the table is built is quite slow compared to simply building it outright, and the browser was chugging a bit between additions. I'm wondering if I can work around this with a Web Worker, but doing so without seriously overhauling the implementation doesn't seem possible.
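For illustration, the chunked loading I tried looked roughly like this (names are placeholders and `oTable` is the already-initialized table; this is a sketch, not our actual code):

```js
// Add the remaining rows in small batches so the browser can repaint
// between additions; only redraw the table once the last batch is in.
function addRowsInChunks(oTable, aRows, iChunkSize) {
    if (aRows.length === 0) {
        return;
    }
    var i = 0;
    function addNext() {
        var aChunk = aRows.slice(i, i + iChunkSize);
        i += iChunkSize;
        oTable.fnAddData(aChunk, i >= aRows.length); // bRedraw only at the end
        if (i < aRows.length) {
            setTimeout(addNext, 0); // yield back to the event loop
        }
    }
    addNext();
}

// First 500 rows synchronously, the remainder in 100-row chunks.
oTable.fnAddData(aAllRows.slice(0, 500));
addRowsInChunks(oTable, aAllRows.slice(500), 100);
```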
Are there any further tips to speed this thing up?
For a bit of additional detail, here are the settings we're using (I can't post the real code as there's simply too much around it):
Columns/ColumnDefs/Column data: Loaded in from preprocessed JSON
Sorting: Tables have default sorting, tried presorting with the setTimeout, still a bit chuggy
bJQueryUI: true
bAutoWidth: false
bSortClasses: false
bDeferRender: true
sPaginationType: "full_numbers"
iDisplayLength: 20
oLanguage: ...
oTableTools: {...}
fnDrawCallback: function() { /* a few checks and tweaks depending on the table shown */ }
fnRowCallback: function() { /* styles applied depending on data to conditionally include/exclude some rows after the table is built, e.g., a checkbox to hide certain rows */ }
There are some additional methods afterward to insert custom elements and styles as well once those are set.
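Pulled together, the initialization looks roughly like this (the selector and the JSON-derived values are stand-ins, and the callback bodies are elided):

```js
var oTable = $('#dataTable').dataTable({
    "aoColumns": aoColumnsFromJson,  // column definitions from preprocessed JSON
    "bJQueryUI": true,
    "bAutoWidth": false,
    "bSortClasses": false,
    "bDeferRender": true,
    "sPaginationType": "full_numbers",
    "iDisplayLength": 20,
    "oLanguage": { /* ... */ },
    "oTableTools": { /* ... */ },
    "fnDrawCallback": function () {
        // a few checks and tweaks depending on the table shown
    },
    "fnRowCallback": function (nRow, aData) {
        // conditional styles, e.g. checkbox-driven hiding of certain rows
    }
});
```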
Replies
So I'm not too sure what is slowing your table down.
Allan
In this table, fnDrawCallback only executes if a particular table is selected, so it's not especially relevant. fnRowCallback is much the same. In many cases the only thing these do is an equality check against a string.
There are a number of post-construction actions (attaching listeners and classes, perhaps adding a custom UI element to the header, adding per-column filters, and so on), but these don't seem to be where the worst of the time is going; commenting them out does little to help. It should also be noted that the table is usually not built when the page loads. The table stays empty until the user clicks on an element, which destroys the existing table and creates a new one with the data; this happens every time the user clicks an element to show a new table.
I'm still tossing the web worker idea around in my head, but I'm not sure how it could work out unless I build the entire table in a separate thread, keep a loading indicator up until it's complete, then pass it back and insert it. That would at least keep the browser from blocking. Adding data in chunks to an existing table is pretty slow (definitely slower than building and formatting rows prior to insertion), and it seems to get worse as the table gets larger. I don't know whether a web worker can get around that without pulling the entire table into a new thread, and then the question remains of how to deal with that...
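The only shape I can picture for it is a data-only worker, since a worker can't touch the DOM: it would do the formatting off the main thread and hand back plain row arrays. A rough sketch (the script name and message shape are hypothetical):

```js
// Main thread: offload row formatting, keep DOM insertion here.
var worker = new Worker('format-rows.js'); // hypothetical worker script

worker.onmessage = function (e) {
    // Rows come back as plain arrays ready for fnAddData; the DOM
    // insertion still blocks, but the formatting no longer does.
    oTable.fnAddData(e.data.rows);
};

worker.postMessage({ records: rawRecords });
```

```js
// format-rows.js: pure data transformation, no DOM access allowed.
onmessage = function (e) {
    var rows = [];
    for (var i = 0; i < e.data.records.length; i++) {
        var rec = e.data.records[i];
        // illustrative formatting: build the cell values for one row
        rows.push([rec.name, rec.value, rec.date]);
    }
    postMessage({ rows: rows });
};
```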
1.10-pre is here: https://github.com/DataTables/DataTables/tree/1_10_wip/media/js
Allan
I tried the dev version. There's a minor speed improvement, but from the quick check I did, I don't think it reaches very far into the double-digit percentage range. Definitely helpful, but still not the full picture. I suppose that comes with the territory when dealing with older and slower JavaScript engines. I'll have to do some more tinkering with a web worker to see if I can figure out how to break this thing up.
I think I'd need a link to the page to be able to offer any more help.
Thanks,
Allan
_fnBuildSearchRow: this strips out any HTML when constructing the search array, and a table can potentially have HTML in every row; 5k+ calls to jQuery's .html() function add up. Commenting this bit out improved rendering speed by about 20%. A table may have HTML in every row even though the original dataset does not, i.e., aoColumnDefs/aoColumns is used to add the HTML. Would it be possible to pass a separate dataset for the search array? Doing so and excluding any HTML-adding functions could speed things up considerably. Maybe this option already exists and I've missed it.
In addition, I'm seeing quite a bit of jQuery.extend() in the profiler as part of the _fnAddData() method. It's taking a bit over 10% of the overall time (along with another .extend() in the each() loop at line 6366 that's eating another 10%, though I can't pinpoint exactly where). Looking through the source, it appears the data is copied so it can be manipulated and placed into the table without disturbing the original dataset. In my particular case, though, I don't believe any data is changed: the formatting functions we use don't modify the dataset directly, but return new data based on it. I suppose it depends on whether there's a clear enough use case to justify it, but would it be possible to draw directly from the original dataset without copying the data, perhaps enabled by a flag or some checks beforehand?
I could be completely off the rails with these, as I'm not hugely familiar with the DataTables source and I'm no JavaScript expert, but I figured they were worth mentioning. This sort of thing may also be relatively unique to the specific project I'm working on.
Another possibility that springs to mind is to break up the table initialization/construction in the DataTables plugin itself with web workers. I would think a number of those processes could run in parallel. Again, that's just me speculating.
> Would it be possible to pass a separate dataset for the search array?
Yes - you can use mRender / mData to send back filter-specific data: http://datatables.net/blog/Orthogonal_data
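For example, something along these lines (the markup and column index are illustrative):

```js
"aoColumnDefs": [ {
    "aTargets": [ 0 ],
    "mRender": function (data, type, full) {
        // HTML for display only; 'filter', 'sort' and 'type' requests
        // get the raw value, so the search array never contains tags.
        if (type === 'display') {
            return '<b>' + data + '</b>';
        }
        return data;
    }
} ]
```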
- .extend() in _fnAddData: the extend has been completely removed in 1.10 and not replaced; the original data is used rather than copied as it was in 1.9 and earlier.
- Web workers: yes, they're an option, but since most people's browsers don't support web workers, it's unlikely to be in DataTables any time soon.
Allan
I've put a little test case together to see what the best way is: http://jsperf.com/html-decode . It currently only covers two methods, jQuery and direct DOM. Obviously the direct DOM is much faster. I've just committed that change into DataTables 1.10.
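For reference, the two methods are essentially the following (reconstructed here rather than copied verbatim from the test case):

```js
// jQuery: detached node, .text() strips the markup.
function stripHtmlJquery(sData) {
    return $('<div/>').html(sData).text();
}

// Direct DOM: the same idea without the jQuery overhead.
var nStripDiv = document.createElement('div');
function stripHtmlDom(sData) {
    nStripDiv.innerHTML = sData;
    return nStripDiv.textContent || nStripDiv.innerText;
}
```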
Allan
I've added two more options to the jsperf test case:
3. Using a bit of regex to remove tags and then decoding via a text box. This is massively faster than DOM / jQuery in Chrome / Safari, and a bit faster in Firefox. It is faster in IE also, but not by much.
4. Using pure regex - a function to decode numbers and another to decode entities. This is very fast in all browsers.
So putting option 4 in is very tempting, but there are 252 named HTML entities (according to Wikipedia - http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references ), and having them all defined in the DataTables source would be bad news, since they'd take up a fair amount of space.
I guess one option would be to have a list of some known entities (the common ones) and then fall back to another method if an unknown entity is found (probably just using jQuery in that case since the code required is tiny).
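Roughly something like this (the entity map is truncated to a handful here, and it handles decimal numeric entities only; illustrative, not committed code):

```js
var oCommonEntities = { amp: '&', lt: '<', gt: '>', quot: '"', nbsp: '\u00a0' };

function decodeEntities(sData) {
    return sData
        // numeric entities, e.g. &#163;
        .replace(/&#(\d+);/g, function (sMatch, sCode) {
            return String.fromCharCode(parseInt(sCode, 10));
        })
        // named entities: known ones from the map, unknown ones via jQuery
        .replace(/&(\w+);/g, function (sMatch, sName) {
            return oCommonEntities.hasOwnProperty(sName) ?
                oCommonEntities[sName] :
                $('<div/>').html(sMatch).text();
        });
}
```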
Need to think about this a bit more before committing it in though.
Any thoughts on the subject are very welcome!
Regards,
Allan
Given the considerable speed benefit of regex, I would think keeping a set of the more common entities is definitely preferable. One or two dozen will do little to affect the size of the minified plugin. That's just one person's opinion, though.
Thanks for all the help! Between this and the search classes and deferred rendering, the display of the tables has been dramatically improved. It's still a few seconds on large ones, but I think the slow script warning might just go away completely at this point.