Performance issues caused by sorting and paging (server-side)

Performance issues caused by sorting and paging (server-side)

mihomesmihomes Posts: 165Questions: 23Answers: 0

Using server side with an innodb database. I have found that when there are large amounts of rows in the result set (lets say 200k and more) that there is a serious performance issue caused by sorting and paging (order by and limit). The problem arises due to the fact the optimizer must read x rows to perform the operation.

Say I am sorting by a timestamp column (or any column for that matter)... an order by asc/desc is added to the query, but it must read all the rows to return them in the correct order.

The same applies for paging... say I am displaying 25 rows... once I get into large page numbers it must read all the rows to get to the starting point. For instance, showing 100,001 to 100,025 would require the first 100,025 rows be read by the query.

Indexes are set properly, but in my case results are returned through a number of table joins (using pdo and foreign key relationships to join tables). Performing the same query outside of datatables so I can remove the order by clause shows results going to upwards of ~6 secs down to ~.05 sec. Of course, the limit clause also

I'm not sure if anyone else has run into this situation or not as by default datatables does not support joins with its server-side script or maybe even without joins and a large database someone has run into something similar? If so, did you have any solution?

Answers

  • allanallan Posts: 63,712Questions: 1Answers: 10,502 Site admin

    Hi,

    You are absolutely correct - in order to be able to do a sort the full data set is required. However, the database engine should abstract that away and you wouldn't see it in PHP (or whatever language you are using), and it is really up to the database to optimise it.

    I don't really see what option there would be to speed things up since the full data set is needed, other than to disable sorting. You might want to look into optimising the database such as increasing its cache size, a result cache possibly and also using the ANALYZE keyword with the generated query to see what the engine is up to.

    All I can say beyond that is that I've seen DataTables server-side processing being used with millions or rows with joined data before without an issue, but I don't know what kind of servers those cases were running on.

    Allan

  • mihomesmihomes Posts: 165Questions: 23Answers: 0

    Yep, I did all those Allan. It is simply a matter of the paging and order by combined when there is a large result set - it really does not matter how many you are actually returning (it could be 10 per page even) the optimizer on MySQL's end still needs to scan x amount of rows to find that result set.

    I have found a few workarounds on the web, however, they do not necessarily apply when joins are involved.

    One other thing to note is if you plan to allow sorting on multiple columns you essentially need a covering index to do so. When using a multi-column index (column1, column2) this would not be used in a case where I wanted to sort column2, column1. As a user might sort in numerous ways the only solution is to have a covering index on the table which covers every column that is sortable... even then the performance is affected.

    In my solution I have a start and end date filter to limit results between two dates... I have found that using this and cutting down the result set speeds things up, but best case scenario would be for the performance to be the same for all (the default of all dates).

    I've been reading all week, but have not found anything yet which has changed the performance on this.

    I should note... have tables with millions and millions of rows would be super fast... if there was no sorting being used. If anyone has come across the same situation and found a work around or something I may have missed I would be grateful.

  • allanallan Posts: 63,712Questions: 1Answers: 10,502 Site admin

    To be honest, you'd be better posting on a MySQL forum rather than here, since that is a question for experts in the field, rather than a DataTables specific question (although DataTables is the application).

    Allan

This discussion has been closed.