Excel export unusable for tables with large datasets?

Excel export unusable for tables with large datasets?

daveslabdaveslab Posts: 40Questions: 14Answers: 0

I use the excel export option in a project for tables that can have a lot of data (near 10,000 rows). Datatables performs well, but after I upgraded to Datatables Editor 1.6, when I try to use the Excel export option, the process takes forever. For a table with 1,300ish rows, it takes almost two minutes to export now whereas before it would only take a second or two (see attached screenshot).

I ran a couple of profiles tests on it, and I narrowed down the problem to a regular expression in the function _addToZip. Here is the line:

str = str.replace( /<(.*?) xmlns=""(.*?)>/g, '<$1 $2>' );

Two questions:

  1. I changed the line to be as follows and the execution time went down to one second as it used to be. Could this work as a substitute line? (I am not at all familiar with the XLSX spec).:
str = str.replace(' xmlns="" ', '');
  1. Is this line really necessary (i.e. are the Excel files broken if it's not there)? I look in the previous version of DTE and it wasn't there and our excel files worked fine at the time.

Thanks for all your help and your time!

This question has an accepted answers - jump to answer

Answers

  • daveslabdaveslab Posts: 40Questions: 14Answers: 0

    Just a couple of quick finds:

    • The substitute line works in Chrome but not in Firefox for some reason. The Excel files come out corrupted in Firefox.
    • If I delete the line, in Firefox, the files come out empty but in Chrome it works.
  • allanallan Posts: 63,489Questions: 1Answers: 10,468 Site admin
    Answer ✓

    1) Yes the line could be changed. The only danger is if the data that should be included in the output contains that string it would silently removed! Amazing that that change in the regex made such a massive difference though and worth considering for that reason alone.

    2) Yes, it is needed in IE and Safari. Excel will report that the files are broken if it is removed. Its annoying that the IE and Safari XML serialisers decide to add that.

    The previous version of Buttons (prior to Buttons 1.2) didn't use XML nodes to create the files - they just did string manipulation. The newer node approach makes it more flexible for styling etc, but it does add this XML complication.

    Allan

  • daveslabdaveslab Posts: 40Questions: 14Answers: 0

    Thanks for your response, Allan. I replaced the line with this:

        
    str = str.replace(/ xmlns=""/g, '');
    

    And the old speed is back. In the future, it might be worth it to test for the value xmlns="" first within the cells (which is almost never the case I imagine) and then use the faster regex. If I have time, I'll try to run some tests and let you know what I find.

  • allanallan Posts: 63,489Questions: 1Answers: 10,468 Site admin

    which is almost never the case I imagine

    I agree. I was just bothered that it might not always be the case and corrupt data.

    Good to hear that that change is a decent speed improvement as well. That is probably a worthwhile change.

    Allan

  • allanallan Posts: 63,489Questions: 1Answers: 10,468 Site admin

    Little update on this one - there was a pull request with a related change. It uses a tighter regular expression, and I'm inclined to go with that at the moment rather than the looser approach we discussed here.

    If there are still performance issues though, then it will be worth looking into, but I think the main performance now will come from DataTables' _unique() method.

    Allan

  • daveslabdaveslab Posts: 40Questions: 14Answers: 0

    Thanks for the update!

  • allanallan Posts: 63,489Questions: 1Answers: 10,468 Site admin

    One more update - I've more or less resolved the performance issue with _unique() I think. The export of data should be really quite fast now. Going to package up the releases shortly.

    Allan

This discussion has been closed.