Record Export Size Limitation - solution?
tharris
Posts: 15 · Questions: 0 · Answers: 0
I've got a web app that has around 2000 records and a dozen columns. After trying to save using TableTools, it's pretty obvious I've hit the Flash limitation on the amount of data you can process (as discussed in the "TableTools v1.0.2 - Save as Excel, CSV, copy and print!" announcement thread). I've been looking on the forums for solutions (edit: for keeping things client-side, though Allan's suggestion on the aforementioned thread about hacking TableTools to post to a server is potentially doable), but haven't seen anything yet. Are there any known workarounds for this issue?
Replies
Edit - It's looking increasingly like IE7 is just slow as dirt going through the fnGetDataTablesData function in TableTools. Still, that doesn't explain why it saves empty Excel sheets for larger datasets. I've noticed that sometimes it will output the previous set of rows if I filter, export, remove the filter, and export again, or just switch from a smaller output to a larger output. Also, if I just restart IE7 and go straight for the output, it has one cell populated - the contents are "ÿþ" for some reason. Ideas, anyone?
Are you using 1.1.4 now? UTF support wasn't present in 1.0.x.
One thing you will find with saving to XLS (and CSV, but to a much lesser extent) is that the Flash processing is _VERY_ slow. This is because it needs to deal with the strings bit by bit - since this isn't a built-in function for Flash to do what is needed. So 2000 records is likely to crush most computers when doing an XLS export - sorry to say this!
So I think a server-side option might be quite a good answer for dealing with so many records. I'm working on TableTools 2 at the moment, which will allow for a lot more flexibility with buttons, including an option to send the DataTables data to the server-side for processing there (PDF for example, or proper XLS creation, rather than a UTF-16LE CSV file with an XLS extension!). These scripts will need to be created separately (and won't be part of the TableTools distribution), although I am thinking about setting up a site just for TableTools to promote its flexibility. If you fancy having a look at an early development version, please let me know: http://datatables.net/contact .
Regards,
Allan
The interesting thing that I've noticed so far is that at the end of fnGetDataTablesData the function returns sData, which, when sent to an alert box, has all the data for Excel - but when you open the saved file you get "" in A1 as the only saved value for .csv files, and "ÿþ" for .xls files.
The 'odd' characters that you are seeing are the Byte Order Mark (BOM), which indicates the file's character encoding. CSV is saved as UTF-8 and XLS as UTF-16LE. The text editor you are using isn't UTF-aware, I guess - or the file has been corrupted - it might be something funny going on in Flash when using large data sets.
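As an aside (just for reference, not something TableTools does), the BOM byte sequences involved are easy to check from .NET, and 0xFF 0xFE read as Latin-1 is exactly the "ÿþ" seen above:
[code]
// BOM byte sequences as .NET reports them:
//   UTF-8     -> 0xEF 0xBB 0xBF (renders as "" when read as Windows-1252/Latin-1)
//   UTF-16 LE -> 0xFF 0xFE      (renders as "ÿþ" when read as Windows-1252/Latin-1)
byte[] utf8Bom = System.Text.Encoding.UTF8.GetPreamble();       // { 0xEF, 0xBB, 0xBF }
byte[] utf16LeBom = System.Text.Encoding.Unicode.GetPreamble(); // { 0xFF, 0xFE }
[/code]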
The approach I've taken with TableTools 2 for the server-side option is that the data will be generated and collected on the client-side from DataTables (for taking account of filtering etc), but instead of sending it to an SWF file, it sends it to the server using Ajax. It might mean you've got a large POST request - but hopefully it would be worth it.
Regarding the speed-up question - for large data sets, server-side processing really is the way to go. Although how big a 'large' data set is depends on the browser. I have no problem dealing with many thousands of rows in Safari etc., but IE gets crunched. Processor speed on each machine makes a difference as well. If it's unacceptable in IE7 and that's a key browser for you, I'd say look at server-side processing. DataTables is fairly optimised in how it works already - although if anyone has any suggestions they are most welcome, and I'm always looking for ways to improve it myself.
The way DataTables works at the moment is that it reads in the HTML from the page, putting it into its cache, and manipulates the DOM on each draw - adding and removing elements as required. It's sort of 'just in time' for each draw - although the initial read does happen 'online'. It might be possible to take that offline - an interesting thought for the future.
Regards,
Allan
If I had to guess, the data is somehow being corrupted in Flash, though I haven't looked at what TableTools does outside of TableTools.js for 1.1.4. I'm using Excel 2007 to open the saved files. Curiously, the save feature works fine in Firefox even for larger (2000-ish rows with around 12-18 columns) datasets, so it may be a quirk unique to IE7 and/or its version of Flash 10. Since I have no development experience with Flash, I'll leave the speculation as to how, why, or even if this may be to others.
It does look like server-side processing is the way to go. I'm planning on setting up some test code to work out the how-to for getting TableTools to post back to the server, the server to generate an Excel or CSV file, and then send it back to the user, which hopefully shouldn't be too bad. I'm not sure how I'll handle the print and copy-to-clipboard options, though, since they run into the same problem in fnGetDataTablesData. I might just disable everything but send to Excel.
I'm loading the page using AJAX and JSON, so there isn't any HTML beyond the table headers, though I may try generating some and then initialising the DataTable to see how the performance differs at page load. I doubt there will be a big difference though. Thanks for your input!
[code]
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="testTableExcel.aspx.cs" Inherits="testTableExcel" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Test Table - Excel
var oTable;
$(document).ready(function(){
populateTable(10);
TableToolsInit.sSwfPath = "swf/ZeroClipboard.swf";
oTable = $('#tblData').dataTable({
"bJQueryUI": true,
"bAutoWidth": false,
//Make it look decent and change the default sort to column 1
"sPaginationType": "full_numbers",
"iDisplayLength": 25,
"aaSorting": [[ 1, 'asc' ]],
/* Fix formatting issues related to TableTools plugin */
"sDom": 'T<"clear"><"H"lfr>t<"F"ip>'
});});
//Populate #tblData with generic example data from 1 to numRows
function populateTable(numRows){
var rows = "";
for(var i = 1; i <= numRows; i++){
//generic example data - the two columns here are illustrative only
rows += "<tr><td>Record " + i + "</td><td>" + i + "</td></tr>";
}
//assumes the page markup contains a <table id="tblData"> with a thead and an empty tbody
$('#tblData tbody').html(rows);
}
[/code]
[code]
//server side requires a webmethod "SendToExcel" function at testTableExcel.aspx code behind as well as
//a serializeable Employee object with properties matching the JSON field names. See below for an example.
function sendToServer(){
var sData = [];
var oSettings = oTable.fnSettings();
//Generate JSON string representation
for ( var i=0, iLen=oSettings.aiDisplay.length ; i<iLen ; i++ ){
var aRow = oTable.fnGetData( oSettings.aiDisplay[i] );
//field names here are illustrative - they must match the Employee properties on the server
sData.push( '{"Name":"' + aRow[0] + '","Position":"' + aRow[1] + '"}' );
}
//POST the rows as a JSON string to the page method, then open the handler that streams the XLS back
$.ajax({ type: "POST", url: "testTableExcel.aspx/SendToExcel", contentType: "application/json; charset=utf-8",
data: '{"containers":"' + ('[' + sData.join(',') + ']').replace(/"/g, '\\"') + '"}',
success: function(){ window.open("XMLHandler.ashx"); } });
}
[/code]
The aspx page formats the data associated with the filtered table rows into JSON and makes an AJAX post back to its code-behind on the server to turn it into standard XML, which is saved as a session variable.
When the AJAX call completes successfully, the success function opens a new window pointed at XMLHandler.ashx, which runs the generic request handler's ProcessRequest method.
The ProcessRequest function inside XMLHandler.ashx takes the session variable and attaches the various headers before returning the XML as an XLS attachment for Excel.
A few potential weaknesses here: you could probably run into problems if you try outputting more than 65,536 rows and opening the file in Office 2003 (the limit is 1,048,576 in Office 2007). Also, since the current setup is just a generic XML file, Office throws up a few messages while it figures out what to do with the data. Finally, you lose the plug-and-play nature of TableTools 1.1.4. On the plus side, Excel 2007 adds column filters and row highlighting automatically, which is nice.
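If the row limit is a real concern, a simple guard in SendToExcel would catch it early. This is purely a sketch - the constant name and the exception choice are mine, and "items" is the deserialized list from the listing below:
[code]
//hypothetical guard against exceeding the Excel 2003 sheet size
const int iExcel2003MaxRows = 65536; //Excel 2007 allows 1,048,576
if (items.Count > iExcel2003MaxRows)
{
throw new InvalidOperationException("Export exceeds the Excel 2003 row limit - split the export or target 2007.");
}
[/code]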
I'm sure there are ways to make this considerably more elegant, but until I see someone post such a solution, this works for now. Load times on IE7 (after the initial table load, which is painful but workable up to a few thousand records) are great - TableTools took a few minutes and failed to output the file, whereas this takes maybe 5 seconds. Still, on browsers with faster JavaScript engines I'd say stick with TableTools and save yourself the trouble - the latest version of Firefox (3.6.4, as I recall) takes about the same amount of time with TableTools, and you don't have to muck about getting everything set up. Also, for small datasets the overhead far outweighs the benefits, so if an export doesn't take at least 3-5 seconds with TableTools, this will just slow you down.
Edit - Just a side note: keep in mind that this doesn't cover all the cases, such as nested arrays or objects (i.e. an Employee with another object inside - ProjectHistory, for example). I'll post a more general case below.
[code]
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="testTableExcel.aspx.cs" Inherits="testTableExcel" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Test Table - Excel
var oTable;
$(document).ready(function(){
populateTable(6000);
TableToolsInit.sSwfPath = "swf/ZeroClipboard.swf";
oTable = $('#tblData').dataTable({
"bJQueryUI": true,
"bAutoWidth": false,
//Make it look decent and change the default sort to column 1
"sPaginationType": "full_numbers",
"iDisplayLength": 25,
"aaSorting": [[ 1, 'asc' ]],
/* Fix formatting issues related to TableTools plugin */
"sDom": 'T<"clear"><"H"lfr>t<"F"ip>'
});
}
);
//Populate #tblData with generic example data from 1 to numRows
function populateTable(numRows){
var rows = "";
for(var i = 1; i <= numRows; i++){
//generic example data - the two columns here are illustrative only
rows += "<tr><td>Record " + i + "</td><td>" + i + "</td></tr>";
}
//assumes the page markup contains a <table id="tblData"> with a thead and an empty tbody
$('#tblData tbody').html(rows);
}
[/code]
Edit - This is the final version. generateXML is generic - pass in a list of a class and it will generate an XML representation of the class properties for each item in the list (only for initialized properties, which is nice if you only occasionally use some of the properties or want to exclude them by not passing them back in the JSON) and return it as a string. From here just extract generateXML into a shared class and you're good to go.
[code]
using System;
using System.Web;
using System.Web.Services;
using System.Collections.Generic;
using System.Xml;
using System.IO;
using System.Web.Script.Serialization;
using System.Reflection;
public partial class testTableExcel : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
}
[WebMethod(true)]
public static void SendToExcel(object containers)
{
//extract JSON objects - Employee is a container class with properties matching the JSON
JavaScriptSerializer ser = new JavaScriptSerializer();
List<Employee> items = ser.Deserialize<List<Employee>>(containers.ToString());
//convert the JSON into an XML file, then save it to a session variable
HttpContext.Current.Session["XLSOutput"] = generateXML(items);
}
//Takes a List of a generic class and extracts an XML representation of its initialized properties
//Returns XML as a string
public static string generateXML<T>(IEnumerable<T> inputList)
{
//originally based on the example from http://www.primaryobjects.com/CMS/Article76.aspx
using (MemoryStream stream = new MemoryStream())
{
// Create an XML document. Write our specific values into the document.
XmlTextWriter xmlWriter = new XmlTextWriter(stream, System.Text.Encoding.UTF8);
xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("Output");
//Extract the list of properties
Type t = typeof(T);
PropertyInfo[] pi = t.GetProperties();
//output XML representation of each class
foreach (T item in inputList)
{
//if the value doesn't exist, skip it - might not catch all edge cases though...
xmlWriter.WriteStartElement(t.Name);
foreach (PropertyInfo p in pi)
{
if (p.GetValue(item, null) != null)
{
xmlWriter.WriteElementString(p.Name, p.GetValue(item, null).ToString());
}
}
xmlWriter.WriteEndElement();
}
xmlWriter.WriteEndElement();
xmlWriter.WriteEndDocument();
// To be safe, flush the document to the memory stream.
xmlWriter.Flush();
//extract the xml from the stream
stream.Position = 0;
StreamReader sr = new StreamReader(stream);
return sr.ReadToEnd();
}
}
}
[/code]
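To illustrate what ends up in Session["XLSOutput"], here is roughly the sort of document the code above produces, assuming a hypothetical Employee class with Name and Office properties (indentation added here for readability - the real output is a single line):
[code]
<?xml version="1.0" encoding="utf-8"?>
<Output>
  <Employee>
    <Name>Smith</Name>
    <Office>Edinburgh</Office>
  </Employee>
  <Employee>
    <Name>Jones</Name>
    <Office>London</Office>
  </Employee>
</Output>
[/code]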
[code]
<%@ WebHandler Language="C#" Class="XMLHandler" %>
using System;
using System.Web;
using System.Web.Script.Serialization;
using System.Collections.Generic;
using System.Xml;
using System.IO;
using System.Web.SessionState;
public class XMLHandler : IHttpHandler, IRequiresSessionState
{
//Output the contents of the session variable XLSOutput as an xls attachment for downloading
//XLSOutput should be updated to match table contents BEFORE this request is made!!!
public void ProcessRequest(HttpContext context)
{
context.Response.ContentType = "xml";
context.Response.AppendHeader("Content-Disposition", "attachment; filename=MyExportedFile.xls");
context.Response.ContentType = "application/octet-stream";
context.Response.ContentEncoding = System.Text.Encoding.UTF8;
context.Response.Write(context.Session["XLSOutput"]);
}
public bool IsReusable
{
get
{
return false;
}
}
}
[/code]
[code]
public static string GenerateXML<T>(IList<T> inputList)
{
//originally based on the example from http://www.primaryobjects.com/CMS/Article76.aspx
using (MemoryStream stream = new MemoryStream())
{
// Create an XML document. Write our specific values into the document.
XmlTextWriter xmlWriter = new XmlTextWriter(stream, System.Text.Encoding.UTF8);
xmlWriter.WriteStartDocument();
xmlWriter.WriteStartElement("Output");
WriteElementList(inputList, xmlWriter);
xmlWriter.WriteEndElement();
xmlWriter.WriteEndDocument();
// To be safe, flush the document to the memory stream.
xmlWriter.Flush();
//extract the xml from the stream
stream.Position = 0;
StreamReader sr = new StreamReader(stream);
return sr.ReadToEnd();
}
}
private static void WriteElementList<T>(IList<T> itemList, XmlTextWriter xmlWriter)
{
Type t = typeof(T);
foreach (T item in itemList)
{
xmlWriter.WriteStartElement(t.Name);
//may not handle arrays. In theory, treat them as lists...
if (typeof(System.Collections.IList).IsAssignableFrom(t))
{
foreach (Type interfaceType in t.GetInterfaces())
{
if (interfaceType.IsGenericType && (interfaceType.GetGenericTypeDefinition() == typeof(IList<>)))
{
//the item is itself a generic list - write each of its elements in turn
foreach (object subItem in (System.Collections.IEnumerable)(object)item)
{
xmlWriter.WriteStartElement(subItem.GetType().Name);
WriteElement(subItem, subItem.GetType().GetProperties(), xmlWriter);
xmlWriter.WriteEndElement();
}
}
}
}
else
{
PropertyInfo[] pi = t.GetProperties();
WriteElement(item, pi, xmlWriter);
}
xmlWriter.WriteEndElement();
}
}
private static void WriteElement(object item, PropertyInfo[] pi, XmlTextWriter xmlWriter)
{
//Walk the list of properties
foreach (PropertyInfo p in pi)
{
Type t2 = p.PropertyType;
if (p.CanRead && p.GetValue(item, null) != null)
{
//either output a primitive/string/datetime or handle it as an object/list of objects
if (t2.IsPrimitive || t2.UnderlyingSystemType.Name == "String" || t2.UnderlyingSystemType.Name == "DateTime")
{
//don't write if the content is String.Empty
if (!(t2.UnderlyingSystemType.Name == "String" && p.GetValue(item, null).Equals(String.Empty)))
{
xmlWriter.WriteElementString(p.Name, p.GetValue(item, null).ToString());
}
}
else
{
//wrap in a new element tag
xmlWriter.WriteStartElement(t2.Name);
//arrays can be treated similarly, though probably with fewer safety checks
if (typeof(System.Collections.IList).IsAssignableFrom(t2))
{
foreach (Type interfaceType in t2.GetInterfaces())
{
if (interfaceType.IsGenericType && (interfaceType.GetGenericTypeDefinition() == typeof(IList<>)))
{
//the property value is a generic list - write each of its elements in turn
foreach (object subItem in (System.Collections.IEnumerable)p.GetValue(item, null))
{
xmlWriter.WriteStartElement(subItem.GetType().Name);
WriteElement(subItem, subItem.GetType().GetProperties(), xmlWriter);
xmlWriter.WriteEndElement();
}
}
}
}
else
{
PropertyInfo[] pi2 = t2.GetProperties();
WriteElement(p.GetValue(item, null), pi2, xmlWriter);
}
xmlWriter.WriteEndElement();
}
}
}
}
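Once GenerateXML has been moved into a shared class, the page method from the first listing reduces to something like the sketch below. The helper class name "ExportHelper" is hypothetical - substitute whatever class you extract the method into:
[code]
[WebMethod(true)]
public static void SendToExcel(object containers)
{
//"ExportHelper" is a hypothetical name for the shared class holding GenerateXML
JavaScriptSerializer ser = new JavaScriptSerializer();
List<Employee> items = ser.Deserialize<List<Employee>>(containers.ToString());
HttpContext.Current.Session["XLSOutput"] = ExportHelper.GenerateXML(items);
}
[/code]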
Regards,
Allan
[code]
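//build the export string by pushing each piece onto an array and joining once at the end - far faster in IE than repeated concatenation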
var sData = [];
//...
sData.push(string);
//...
return sData.join('');
[/code]
instead of:
[code]
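//the original approach - growing the string by concatenation for every piece, which is what crawls in IE for large tables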
var sData = '';
//...
//...
sData = sData + string;
//...
return sData;
[/code]