One or more valid search results omitted

uapbarabe (Posts: 8, Questions: 1, Answers: 0)
edited June 2015 in Free community support

Hello everyone,

I work for The University of Arizona Graduate College. We've been using DataTables as a core component of our Graduate Program Descriptions website for more than a year now and have been really pleased with the positive feedback we've received.

http://grad.arizona.edu/programs/

It recently came to my attention that at least one program is omitted from search results for no obvious reason. "East Asian Studies (MA)" appears on page 5 when the default page length is set to 25. However, a search for "East Asian" returns only the PhD program and one other program (which has that phrase in some data hidden using CSS). The MA program also does not appear when searching for just "East" or just "Asian". I don't see any errors or warnings in the JavaScript console (I have tried Firefox and Chromium, with both the native and Firebug consoles), and the markup validates cleanly with the W3C HTML validator.

In case it helps, I've created a fiddle that weeds out a lot of superfluous markup and JavaScript:
https://jsfiddle.net/uapbarabe/j6ouj8e9/13/

I've also tried to use the DataTables debugger, but get a message that "A JSON parsing error occurred. Please report to support with a link."

Any help in the right direction would be greatly appreciated.

Thanks,
Patrick

Answers

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)
    edited June 2015

    On a whim, I tried rendering the page with the hidden "Description" column completely omitted from the markup. For some reason, this allows the record in question to appear in search results. So the issue may be related to the markup, but I'm still having no luck debugging or identifying the actual problem.

    Here's a fork of the original JsFiddle to demonstrate:
    https://jsfiddle.net/uapbarabe/73pkwras/4/

  • tangerine (Posts: 3,365, Questions: 39, Answers: 395)
    edited June 2015

    I may be misunderstanding something here, but the nature of your data makes unexpected search results inevitable - unless you and your users have additional information which is not obvious to me.

    For instance, your table has a hidden "keywords" column; due to that, a search for "east" returns (among others) a row comprising the following:

    International Security (MA) | MA | School of Government and Public Policy | College of Social & Behavioral Sciences | Main UA Online

    in which "east" does not appear anywhere. Is that really the desired result? I see your placeholder includes "keyword", so perhaps that result is intended. But personally I would be expecting my search term to be visible in the filtered result.

    Also, DataTables filter can be explicitly told to ignore HTML markup in td cells:

      { "type": "html", "targets": 0 }
    

    which may be worth trying (although I think that's the default position anyway).

    Finally, you do realise that hidden columns are searchable unless specified otherwise?
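
    As a rough sketch of how the points above could be combined in one configuration: `columnDefs`, `type`, `visible`, `searchable` and `pageLength` are standard DataTables options, but the column indexes and the `#programs` table id below are assumptions for illustration and are not taken from the live site.

    ```javascript
    // Hypothetical column layout: 0 = Program (cells contain HTML links),
    // 5 = Keywords, 6 = Description. Adjust the indexes to the real table.
    const tableOptions = {
      pageLength: 25,
      columnDefs: [
        // Treat column 0 as HTML so markup is ignored when sorting and searching.
        { type: "html", targets: 0 },
        // Hide the keyword and description columns but keep them searchable.
        // Set searchable to false instead to exclude them from the search box.
        { visible: false, searchable: true, targets: [5, 6] }
      ]
    };

    // Only initialise when jQuery and DataTables are actually loaded (browser).
    if (typeof $ !== "undefined" && $.fn && $.fn.DataTable) {
      $("#programs").DataTable(tableOptions);
    }
    ```

    With `searchable: true` on the hidden columns, a row can match a search term that is not visible anywhere in the rendered table, which is the behaviour described above.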

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)

    Hi tangerine,

    Thanks for the reply. What you describe is deliberate. The hidden columns are intended to be searchable: searching the keyword and description columns lets users discover programs by topics or subjects that are not in the actual program name, without cluttering the visible table.

    If it helps to put this in the broader context, one of the key audiences of this website is prospective students who are shopping for a graduate program. We want to help them discover programs of study not just by program name, but also by topical or research interests that may not be explicit in a program's name.

  • tangerine (Posts: 3,365, Questions: 39, Answers: 395)

    Hi uapbarabe.
    I began to suspect I wasn't much help. :-)

    I realised that nothing I said was relevant to your not finding data in some circumstances. I have no answer for that, I'm afraid. However, my own search for "East Asian" yields these results ("Program" column only):

    East Asian Studies (MA)

    East Asian Studies (PHD)

    which seems not to tally with your own result. This was with Firefox, if that's relevant.

    Thanks for the broader context. It makes more sense now.

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)
    edited June 2015

    Hi tangerine - did you see "East Asian Studies (MA)" in your search results using the second JsFiddle example, or was it in the first JsFiddle and/or the live website? If it was either the live site or the first fiddle, then that's really perplexing. I've confirmed the problem behavior in Firefox and Google Chrome on both Ubuntu Linux and Windows 8.1, as well as in Internet Explorer 11.

    The second JsFiddle example has the hidden "description" column completely removed from the markup, and does show the EAS MA in search results. I'm just not sure why, as I don't see anything unusual about that particular cell's content in the live site, and other programs that have multiple degree options (e.g., an MA/MS and a PhD) work as expected. Try searching for "psychology", for example: you'll see master's and PhD records for several programs that otherwise have the same name.

  • tangerine (Posts: 3,365, Questions: 39, Answers: 395)

    Further apologies - I'm not getting much right today!

    It was actually your second fiddle which produced the result I mentioned above. So no help there then.

    However, on a tangential issue - you might want to look at your keywords again. I thought I was getting different results between fiddle 1 and your live site, but it turned out I searched for "east asian" in one and "east asia" in the other. I would suggest that you want "asia" and "asian" to be synonymous.

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)

    Thanks for the clarification - and darn, I thought maybe you'd found a clue :)

    I agree with you regarding keywords. Unfortunately we have to delegate authorship of these program descriptions to their respective academic units (all 140ish of them), so we inevitably end up with descriptions that are rather variable in their thoroughness. If I ever have a budget for more staff, I may be able to consider having this content curated a bit more in-house, but for the time being that's out of scope.

    Thanks again for your input!

  • tangerine (Posts: 3,365, Questions: 39, Answers: 395)

    You're welcome - sorry I wasn't more use.

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)

    No worries. I've been scratching my head on this problem in my "spare" time for a while. It's pretty low on a long list of competing priorities, but I thought I'd take advantage of a lull in the storm to see if anyone in the DT community might notice something I haven't.

  • uapbarabe (Posts: 8, Questions: 1, Answers: 0)
    edited December 2015

    We recently discovered the source of this issue, so I thought I'd follow up here. It turns out there were some U+2028 (line separator) characters in the contributed markup for the East Asian Studies Master's program description. These show up as little red "gremlins" on line 4629 of the HTML source of my JsFiddle. We don't know the exact origin of those characters, but they escaped our fairly standard sanitization routines and are apparently problematic when combined with JavaScript. There are several threads on this topic in the top Google search results for the U+2028 character. So in the end this was essentially a markup + JavaScript issue and not specifically related to DataTables.
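
    For anyone who runs into the same thing, here is a minimal sketch of how such characters could be located and replaced before the markup reaches the table. The function names and the sample string are hypothetical, and this is not the sanitization routine used on the site. The last line illustrates one plausible (unconfirmed) reason a search can miss such a row: in a JavaScript regular expression without the `s` flag, `.` does not match U+2028 or U+2029.

    ```javascript
    // Matches U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR).
    const UNICODE_LINE_BREAKS = /[\u2028\u2029]/g;

    // Report where the characters occur, e.g. to trace them back to an author's submission.
    function findLineSeparators(text) {
      return Array.from(text.matchAll(UNICODE_LINE_BREAKS), (match) => ({
        index: match.index,
        codePoint: "U+" + match[0].codePointAt(0).toString(16).toUpperCase()
      }));
    }

    // Replace each occurrence with a plain space (or any other replacement string).
    function replaceLineSeparators(text, replacement = " ") {
      return text.replace(UNICODE_LINE_BREAKS, replacement);
    }

    // Hypothetical sample standing in for a contributed program description.
    const sample = "East Asian\u2028Studies";

    console.log(findLineSeparators(sample));    // [ { index: 10, codePoint: 'U+2028' } ]
    console.log(replaceLineSeparators(sample)); // "East Asian Studies"

    // "." stops at U+2028, so a pattern that must span the whole string fails.
    console.log(/^.*Studies/.test(sample));                         // false
    console.log(/^.*Studies/.test(replaceLineSeparators(sample)));  // true
    ```

    Running `findLineSeparators` over each description before publishing would flag affected records, and `replaceLineSeparators` could be added to an existing sanitization step.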

This discussion has been closed.