Fast Client-Side Filtering
I am attempting to employ DataTables in a browser-based web application that automatically updates the display in real time with data that arrives on a socket (custom ActiveX control in IE today; hope to use HTML5 WebSockets in future). Multiple updates to individual cells (changes only - not entire rows) arrive at a nominal rate of around 10 every second. For all intents and purposes, the total number of rows in the table is fixed and is in the range of 800 - 1000.
I want to use DataTables for advanced table functionality, especially client-side sorting and filtering. It's acceptable to have sorting happen only "on demand" (user request), but filtering needs to be dynamic. In other words, when filtering to show only those rows whose "Status" column has a certain value, individual rows should automatically be hidden/shown based on their current "Status" value.
At first, I thought DataTables was going to be a great fit and make this really easy, but I quickly discovered that even if I suppress bSort (and bSortClasses!), there's a lot of overhead in the way DT implements filtering. So I embarked on writing a plug-in to optimize the processing for this specific purpose, only to discover that updating row visibility ... involves destroying/recreating tr and td elements?!? Waaaaaay too much overhead for real-time updates! I had pictured the filtering implementation as simply hiding/showing rows (ala CSS display:none) that stayed put in the DOM.
I had started going down the path of reorganizing the filtering calls to do the whole 3-tier shebang on a row, and only re-filtering the rows that have changed, when I discovered that the whole concept of what filtering actually does is vastly different from what I thought. Do you think it would still be feasible to co-opt the filtering support into doing what I'm looking for? Perhaps there could be a configuration switch that controls whether client-side filtering is accomplished by manipulating the DOM or by using jQuery's .hide()/.show() (possibly even with animation). Would that be a huge deal, or could it be accomplished in a fairly straightforward manner with a plug-in?
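A minimal sketch of the show/hide idea described above, for illustration only. This is not DataTables code: `createRowFilter`, `update`, and the row shape (`{ status, node }`) are hypothetical names, and `node` is assumed to be anything with a `style.display` property, such as a `tr` element.

```javascript
// Hypothetical helper, not part of DataTables: every row stays in the DOM and
// only its CSS display is toggled when the filtered column changes.
function createRowFilter(rows, wantedStatus) {
  // rows: Map of rowId -> { status: string, node: { style: { display } } }
  function apply(row) {
    // An empty string restores the element's default display value.
    row.node.style.display = row.status === wantedStatus ? '' : 'none';
  }

  rows.forEach(apply); // initial pass over all rows

  return {
    // Call this for each incoming update that touches the "Status" column.
    update: function (rowId, newStatus) {
      const row = rows.get(rowId);
      if (!row) return false; // unknown row: nothing to do
      row.status = newStatus;
      apply(row);             // only this one row is re-filtered
      return true;
    }
  };
}
```

In a browser, `node` would be the row's `tr` element, so each update costs one style change on one row instead of a pass over the whole table.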
Replies
> [filtering] involves destroying/recreating tr and td elements?!?
That's not the case when using client-side processing (it was in DataTables 1.0-1.3, but never since). If you are using server-side processing, then yes, this has to be the case, since the table doesn't have any knowledge about anything other than what is on the current page - but with client-side processing, no, nodes are retained.
> simply hiding/showing rows (ala CSS display:none) that stayed put in the DOM
It would be interesting to see what the speed difference is between inserting and removing child nodes, compared to setting the CSS property, since either way a reflow and repaint is going to be required. If nodes were created then I can see there being a significant difference, but I would imagine that it isn't that great a difference. Removing nodes is actually quite important from a performance point of view for larger tables - say you have 100,000 rows - that's a lot of TR elements in the DOM if you only have a paging size of 10! Also rows must be removed and re-added for sorting, so it makes sense to use the same mechanism for filtering, rather than taking a double hit.
So moving towards a solution, can you give me more of an idea of what you are looking to do? Are you using client-side processing, for example? What does your current update code look like? How fast are you getting data for cells (you say around 10 a second; is that per cell, row or individual data points)?
Allan
I set table-layout:fixed, turn off bPaginate, bAutoWidth, bSort, and bSortClasses, and I call fnDraw() using setTimeout() to yield processing to the browser first. And it still bogs down and locks up the browser (32bit IE9 on 64bit Win7). IE9's profiler tells me that almost 90% of the time spent in _fnDraw() is being spent in calls to appendChild() and removeChild(). Since only the values in one column are changing, dropping and re-creating the rest of each row is pure overhead in my scenario, not to mention walking rows that haven't changed at all. The set of rows in the data is fixed, and the number of rows is sufficiently small that it is practical to load them all on the client. Given all this, I am quite sure that hiding/showing individual rows would involve substantially fewer DOM calls and perform much better.
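For reference, the setup described above might look roughly like the sketch below. The option names are the legacy DataTables 1.x ones named in the post. `createDrawScheduler` is a hypothetical helper, and the coalescing of several updates into one queued draw is an added assumption; the post only says that fnDraw() is deferred with setTimeout().

```javascript
// Settings named in the post (legacy DataTables 1.x option names).
const tableOptions = {
  bPaginate: false,
  bAutoWidth: false,
  bSort: false,
  bSortClasses: false
};

// Hypothetical helper: defers the draw so the browser gets control first,
// and collapses a burst of requests into a single queued draw.
function createDrawScheduler(draw, defer) {
  // By default, defer with setTimeout(fn, 0) as described in the post.
  const schedule = defer || function (fn) { setTimeout(fn, 0); };
  let pending = false;

  return function requestDraw() {
    if (pending) return; // a draw is already queued
    pending = true;
    schedule(function () {
      pending = false;
      draw();
    });
  };
}

// Browser usage (assumes jQuery and DataTables 1.x are loaded, and a table
// with the made-up id "example"):
// var oTable = $('#example').dataTable(tableOptions);
// var requestDraw = createDrawScheduler(function () { oTable.fnDraw(); });
```

Deferring and coalescing only reduces how often the full draw runs; each draw still removes and re-adds every row, which is the cost the profiler is pointing at.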
It isn't recreating DOM elements. The elements are removed, but not destroyed. They are held onto in JavaScript variables so the same nodes can be reinserted into the DOM.
Your updates might affect filtering and/or sorting (albeit that you've got sorting disabled, so not in this case), so DataTables needs to do a full draw, and that draw will always remove the TR elements in the TBODY and then re-add them in the order needed.
Where this could be optimised is that the order and filtering might not have changed, in which case a draw would not be needed, or a single node might just need to be replaced with a different one, etc. That is certainly an area where DataTables could use some considerable optimisation, and I will look at that for future versions.
So yes, without doubt the draw method could be optimised significantly for this use case, whereas at the moment it is very general to cope with everything that gets thrown at DataTables.
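One way to picture the "draw not needed" case is the rough sketch below. It is not how DataTables works internally: `applyCellUpdate` and the row and filter shapes are hypothetical, and sort order is ignored because sorting is disabled in this scenario.

```javascript
// Hypothetical helper, not a DataTables API: applies a cell update and reports
// whether the row's filter result flipped, i.e. whether a redraw is needed.
function applyCellUpdate(row, column, value, matchesFilter) {
  const before = matchesFilter(row);
  row[column] = value;
  const after = matchesFilter(row);
  return before !== after;
}

// Example filter: show only rows whose status is "Open".
const isOpen = function (row) { return row.status === 'Open'; };
const row = { status: 'Open', price: 10 };

applyCellUpdate(row, 'price', 11, isOpen);        // false: filter result unchanged
applyCellUpdate(row, 'status', 'Closed', isOpen); // true: row leaves the filtered set
```

With a check like this, most cell updates in the scenario above (values changing in columns that are not being filtered on) would never trigger a draw at all.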
Allan
Added to the list :-)
Thanks,
Allan