Why the excel size is very large when we export using buttons excel option?

Why the excel size is very large when we export using buttons excel option?

saravansaravan Posts: 10Questions: 1Answers: 0
edited December 2016 in Free community support
  1. I've a data set of ~60,000 records with 20 columns.
  2. I'm using buttons export using Html5Excel. Also, i'm using JSZip as well.
  3. When i export the excel, the excel size is approximately 54 MB. However, when i open this excel and save as the file size is around 7 MB which is around 12 times lesser than the original downloaded file.
  4. Is there any way we can reduce the downloaded file size?
  5. Its taking around 1 minute to download in Chrome. However, this is completely failing in IE 11 and Edge browsers. Is there any alternates to download this much big file in IE 11 and Edge, as currently for our customers we can't suggest to use Chrome browser.

Answers

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    Do you have a lot of repeated strings in your file? The Excel export from Buttons doesn't make use of Excel's ability to share strings between cells (which can significantly reduce the file size if there is a lot of repeated data). When you save it in Excel it will rewrite it using its own output parser which will use the shared strings.

    Is there any way we can reduce the downloaded file size?

    The sharing string aspect of the XSLX file format would need to be implemented.

    Its taking around 1 minute to download in Chrome. However, this is completely failing in IE 11 and Edge browsers.

    I would suggest profiling it in Chrome and seeing what is taking so long and if that code can be optimised. However, for such a large data set, it might be that you need to use a server-side library to get decent performance rather than building it on the client-side.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Thanks so much Allan for the quick response. You are absolutely right. I've lot of repeated data in my excel.

    Any pointers on implementing sharing string aspect of the XLSX file format on client-side? Also, any idea whether this is part of JSZip or JS-XLSX implementations?

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    Neither I'm afraid. Buttons has code that creates the XSLX itself rather than using SheetJS. SheetJS's XSLX writing wasn't really functional when I original wrote Buttons. Its probably time for me to go back and look at providing an option for it.

    The way I'd approach this with the current Buttons code is to create a simple XSLX in Excel and then extract it so you can see how it uses the shared string. Its really just a case of adding an extra XML file to the structure and then referencing the strings as required in the cell elements.

    I did think about putting it into 1.2 for compatibility with OpenOffice, but it turned out OpenOffice didn't need it (it needed something else - the cell references), so it never got down I'm afraid.

    One thing I would say is that if you are concerned about performance, an implementation that uses the string values as the key in an object might be best. Trade memory for performance - a look up object rather than a loop.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Allan, Do you've any time frame in your mind for using SheetJS implementation in future versions of Buttons? Just curious to know :)

    I could see current SheetJS [0.8.0] implementation have features including shared string table etc., Also, it seems this is functional now.

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    No time frame at the moment I'm afraid. On the wish list!

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Thank you Allan!

  • saravansaravan Posts: 10Questions: 1Answers: 0
    edited December 2016

    Hi Allan, This is being tracked here (https://github.com/Stuk/jszip/issues/387) JSZip discussions as well.

    After changing the compression type to DEFLATE, dramatically the size is reduced down to ~10 times. However, I am still having issues with IE 11 and Edge browsers. When I did the profiling the vast majority of the time (~98%) is spent in addRow and _createNode methods of the excelHtml5 action method.

    Do you've any suggestions to improve performance in tweaking these method calls to make this faster?

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    Not immediately I'm afraid, or I would have put them in already :smile:. Do you have a link to the page so I can try to profile it myself and take a look into some experiments to see what might be done to increase performance? Cloning nodes rather than creating new ones might be one option for example.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Unfortunately, I can't share any link at the moment. However, I've attached the profiler snap shot which was taken on Edge browser.

    Can you analyze anything by taking a look at this?

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    In Chrome, if you take a CPU profile, it will let you save that profile and upload it (you'll probably need to zip it before this forum will let you upload it). That would let me explore it a little.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Allan, attached the chrome CPU profile. Can you please take a look at this?

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin

    Thanks! I'll take a look in the morning.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Allan,

    1. I created a link ((http://live.datatables.net/najokiwa/1/edit)) for demonstrating the issue.
    2. For testing purpose, I've used intentionally buttons 1.2.2 [With this version, I've lost all the formatting (number formatting, currency formatting etc.,) in excel]. When I use 1.2.3 the performance is very poor when we do export. However, formatting looks better with 1.2.3
    3. You may please try this in IE11/Edge browsers. For testing purpose, I've cut down the data to 15k records. However, this is still failing to open the excel when we click on excel button.
  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin
    Answer ✓

    Thanks! The profile shows that it took 5.16S for Chrome on my machine to output the Excel file. There doesn't appear to be any one individual function that took a shockingly long time - rather it is the iteration of the data that appears to be taking such a long time.

    1.2.3 and 1.2.4 will take a bit longer because they add the formatting options, and there need to be a conditional check to see if that formatting should be applied or not, which of course takes more clock cycles.

    This function is the one that is taking the longest (~800mS) - most likely the decodeEntities option.

    Yes, disabling it helps, although you'd only do that if you have no HTML entities (which your data doesn't appear to). That will probably really help in IE.

    Then the majority of the time is in the createElement statement.

    JSPerf is down at the moment (which is frustrating because there was other stuff I needed it for!) so I'll come back and experiment a bit with this.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0

    Hi Allan, thank you for the pointers. I tried with disabling decodeEntities in Edge browser. However, this is still taking more than 7 - 8 minutes to open in Edge. It seems there is still something needs to be done to make this faster in IE/Edge browsers.

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin

    I doubt the performance will ever be acceptable in IE for such a large data set. It its 8 minutes at the moment, even if I improve the performance to make it 10x faster, it would still take almost a minute to complete, which isn't good enough. And I very seriously doubt I can improve it that much I'm afraid.

    Allan

  • saravansaravan Posts: 10Questions: 1Answers: 0
    edited December 2016

    Hi Allan, for our customers we don't have the option of using Chrome for now. So, it would be really great if this works in IE/Edge in a minutes time for 15k records export [with number, currency formatting of the data].

  • allanallan Posts: 63,836Questions: 1Answers: 10,518 Site admin

    I just don't see how I can improve it that much I'm afraid. Even if the cloneNode is twice as fast as the createElement, it would still a minimum of 4 minutes, based on the numbers you give.

    I think the only way you'll get acceptable performance is by creating the file on the server-side.

    Allan

This discussion has been closed.