hints for a better "search experience" with server side processing

hints for a better "search experience" with server side processing

tefdattefdat Posts: 42Questions: 7Answers: 3
edited April 2020 in Free community support

Dear community,
i benefit so much from this forum. Time to give something back.

With the default settings in the SQL helper functions (ssp.class.php) the search experience is pretty poor compared
with the great smart search on client side processing.

Just want to share some hints:
* create a fulltext index in your SQL table. Edit the SQL executions and add boolean mode in ssp.class.php. Tried to explain it here roughly.
* modificate the search string in ssp.class.php per Regex

            $str = $request['search']['value'];
            //remove leading boolean operators and the trailing last character, if they are just typed in and "alone" with any trailing any other characters
            //this is primary for avoiding SQL/json fault messages
            //e.g. "+search1 +" --> "+search" 
            //e.g. "-search1 + > ( +search2 <" --> "-search1 +search2" 
            //https://stackoverflow.com/questions/7660545/delimiter-must-not-be-alphanumeric-or-backslash-and-preg-match#comment104798716_7660545
            //https://www.phpliveregex.com/p/vzy#tab-preg-replace
            $str = preg_replace('/[\+\-\>\<\(\)\~][^\w]|\+$/', '', $str);
            //$str2 = preg_replace('/\b.{1,3}\b\s?/', '', $str);

            //add a wildcard to the end of each word with a length of more than 3
            //e.g. "it is more than a search" -> "it is more* than* a search*"
            //https://stackoverflow.com/a/40614597
            //https://www.phpliveregex.com/p/vzz#tab-preg-replace
            $str = preg_replace('/\b\w{3,}\b/', '$0*', $str);

            //add a leading '+' to  all search terms without a leading boolean
            //e.g. "+search1 -search2 search3 ~search4" -> "+search1 -search2 +search3 ~search4"
            //remark; this will convert the search into a boolean AND - a boolean OR will not anymore
            //https://www.phpliveregex.com/p/vzF
            $str = preg_replace('/([^\+\-\>\<\(\)\~])(\b\w+\b)/', ' +$2 ', $str);

            $binding = self::bind( $bindings, '%'.$str.'%', PDO::PARAM_STR );
            $globalSearch[] = "match(myFullTextColumn) against (".$binding." in boolean mode)"; 

With this adaptation you will almost get the same (smart) search as like with client side processing.
The only thing i was not able to was figuring out how to operate with a leading wildcard *mysearch.
I am afraid, this is not been supported by the fulltext search in SQL.

Regards..

Replies

  • colincolin Posts: 15,240Questions: 1Answers: 2,599

    Nice, thanks for sharing!

    Colin

  • tefdattefdat Posts: 42Questions: 7Answers: 3

    Hi,
    i spent a lot of time to get a similar smart search with server side includes.
    I believe, that I found a pretty good workaround for using with SQL databases.

    At first:
    * I join all columns into one additional "FTS_Col" (Powershell Script)
    * I reverse the content with a simple SQL command into another column "FTS_REV'

    update s2l.brs set fts_reverse = Reverse(fts_obverse); 
    
    • i build a full text index with this two columns
    Alter table s2l.brs add FULLTEXT Index fts(fts_obverse, fts_reverse);  
    

    With the reverse hack and the concenated columns fooba% and %oobar will be found. Search is pretty fast. My DB do has approx 1 Mio Rows with about 20 columns. Search takes <1sec.
    Hence there is sometimes the need to search for %ooba% i add the the fulltext search, which can be enabled by an leading '|' in the search pattern. This is a real fulltext search and takes up to 10seconds and only necessary for very special searchs.
    I colorize as the input field appropiately:

    $(".dataTables_filter input")
                    .unbind() // Unbind previous default bindings
                    .bind("keyup input", function(e) { // Bind our desired behavior
                        // define the beginning string for a non-fts search through the database
                        var SqlLikeString = "|";
                        var str_detected;
                        // check, if the entered search string begins with this string
                        if(this.value.lastIndexOf(SqlLikeString, 0) === 0){ //https://stackoverflow.com/a/4579228
                            //if yes, save this value and make the BG red
                            str_detected = true;
                            $(this).css('background-color', '#C2561A'); //https://stackoverflow.com/a/2001383
                        }
                        else {
                            str_detected = false;
                            $(this).css('background-color', '');
                        }
    

    The rest is done with the function "filter" from the ssp.class.php:


    static function filter ( $request, $columns, &$bindings ) { $globalSearch = array(); $columnSearch = array(); $dtColumns = self::pluck( $columns, 'dt' ); if ( isset($request['search']) && $request['search']['value'] != '' ) { //RegEx Tests: http://sandbox.onlinephpfunctions.com/code/f35ac9bf7f7bc83cdd6e4167c305c6d7a06b86c9 //http://sandbox.onlinephpfunctions.com/code/7f4f4306f00b25c6f166aa0ac3a321b4d20fb569 $col_fts_nor = "fts_obverse"; $col_fts_rev = "fts_reverse"; $col_fts = "fts"; $str = $request['search']['value']; //check which search method is been requested if (strpos($str, '|') === 0) { $withlike = true; } else { $withlike = false; } //remove all '*', '+', '(', ')' as first step //$str = preg_replace('/[\*\+\(\)]{1,}/', '', $str); //remove '-' if its the last char //$str = preg_replace('/\-$/', '', $str); //remove standalone boolean operators //remove last character, if they are just typed in and "alone" with any trailing any other characters //e.g. "+search1 +" --> "+search" //e.g. "-search1 + > ( +search2 <" --> "-search1 +search2" //https://stackoverflow.com/a/7660574 https://www.phpliveregex.com/p/vzy#tab-preg-replace //$str = preg_replace('/[\+\-\>\<\(\)\~][^\w]|\+$/', ' ', $str); //https://stackoverflow.com/questions/26507087/escape-string-for-use-in-mysql-fulltext-search //remove everythin except alphanumeric $str = preg_replace('/[^\p{L}\p{N}_]+/u', ' ', $str); //remove trailing and leading \s $str = preg_replace('/^\s|\s$/', '', $str); //if containt the like string $counter = 0; $str_db = ""; $str_array = explode(" ", $str); //$message = "wrong answer"; //echo "<script type='text/javascript'>alert('$message');</script>"; if ($withlike) { //convert string to array //loop through the array foreach($str_array AS $for_each) { //ignore first - its the '|' char //the first search is different, we dont need an AND on the beginning if ($counter == 0){ //create the select command "(COLUMN like '%SEARCHSTRING%')" $str_db = "(" . $col_fts_nor . " like '%" . $for_each . "%') "; } elseif ($counter >= 1) { //continue buildind the SQL query $str_db = $str_db . "AND " . "(" . $col_fts_nor . " like '%" . $for_each . "%') "; } $counter++; } } else { //add a wildcard to the end of each word with a length of more than 3 //e.g. "it is more than a search" -> "it is more* than* a search*" //https://stackoverflow.com/a/40614597 https://www.phpliveregex.com/p/vzz#tab-preg-replace //$str = preg_replace('/\b\w{3,}\b/', '$0*', $str); //add a leading '+' to all search terms without a leading boolean //e.g. "+search1 -search2 search3 ~search4" -> "+search1 -search2 +search3 ~search4" //https://www.phpliveregex.com/p/vzF //$str = preg_replace('/([^\+\-\>\<\(\)\~])(\b\w+\b)/', ' +$2', $str); //replace '*-' - this is been made in the lines above //$str = preg_replace('/\*\-/', '-', $str); for($i=0; $i<count($str_array); $i++) { $str_nor = $str_array[$i]; $str_rev = strrev($str_nor); //create the reversed search string //first scan if ($i == 0){ $str_db = "Match(" . $col_fts_nor . ", " . $col_fts_rev . ") Against (\"+(" . $str_nor . "* " . $str_rev . "*)"; } else{ $str_db = $str_db . " +(" . $str_nor . "* " . $str_rev . "*)"; } //last scan if ($i+1 == count($str_array)) { $str_db = $str_db . "\" in boolean mode)"; } } } $globalSearch[] = $str_db;

    Today I purchased the Editor (for supporting Allen & respectively using SearchPanes) and figured out unfortunately, that the whole lib is different.
    Any hints where to begin and add this hack are mostly welcome :)

    Regards :smile:

  • allanallan Posts: 63,761Questions: 1Answers: 10,510 Site admin

    Hi,

    This is where Editor's PHP libraries do its filtering for server-side processing. That would be the place to customise how it does it.

    Allan

  • tefdattefdat Posts: 42Questions: 7Answers: 3

    Hi,
    thank you very much.
    Struggled the whole day (i am such a lamer with PHP and objects)..
    Finally found a workaround. If I would be better, I would make a customized where, where I would give the parameters (FTS index name) etc as parameter..
    Its working right now .. jippie :)


  • colincolin Posts: 15,240Questions: 1Answers: 2,599

    Excellent, thanks for reporting back,

    Colin

  • tefdattefdat Posts: 42Questions: 7Answers: 3

    Oh yes - It takes a load off my mind.
    I build a kind of (leisure time) knowledge base based with the content of our CRM (working for pharmaceutical company, CRM is salesforce). Got the data from the API. Currently my DB contains "just" 1Mio cases for 7 countries.
    Whole (smart) search takes 600ms. Going to expand it to global cases with approx 10Mio cases. Im am curios, how good this will runs..by the way: I love datatables :)

  • tefdattefdat Posts: 42Questions: 7Answers: 3
    edited August 2020

    So, may I share the latest workingedition, where boolean conditions as like AND, OR and NOT has been added to both search styles. This is a enormous enhancement and improvement of the default search; you can use indexed tables (if you made some little preparation) with a very fast search - where the result is comparable as like with the smartsearch of the non-serverside datatables search.

    Can be tested in:
    https://phpsandbox.io/n/raspy-heart-3rgr (see line 30/31)

    search for: 'dog cat mouse NOT house NOT car OR computer'
    result:

    Match(fts_obverse, fts_reverse) Against (" +(dog* god*) +(cat* tac*) +(mouse* esuom*) -(house* esuoh*) -(car* rac*) (computer* retupmoc*)" in boolean mode)
    

    FTS search:
    search for: '|dog cat mouse NOT house NOT car OR computer'
    result:

    (fts_obverse like '%dog%') AND (fts_obverse like '%cat%') AND (fts_obverse like '%mouse%') AND (fts_obverse not like 'house%') AND (fts_obverse not like 'car%') OR (fts_obverse like '%computer%')
    

    (the rest of sql query is been added by the DT PHP libs which ends with 'WHERE')

    <?php
    /**
     * builds an sql query for datatables.
     *
     * custom routine for building a SQL like / match sql querie based on the search string
     *
     *  @author     tefdat
     *  @link       https://datatables.net/forums/discussion/61483/hints-for-a-better-search-experience-with-server-side-processing#latest
     */
    
    //define FTS columns
    $fts_obverse = 'fts_obverse';
    $fts_reverse = 'fts_reverse';
    
    //declartion and initialisation variables
    $counter = 0;
    $realFTS = $was_or = $was_not = false;
    $str_db = "";
    
    
    /***
     *      _____   ___   ____  ____      __  __ __ 
     *     / ___/  /  _] /    ||    \    /  ]|  |  |
     *    (   \_  /  [_ |  o  ||  D  )  /  / |  |  |
     *     \__  ||    _]|     ||    /  /  /  |  _  |
     *     /  \ ||   [_ |  _  ||    \ /   \_ |  |  |
     *     \    ||     ||  |  ||  .  \\     ||  |  |
     *      \___||_____||__|__||__|\_| \____||__|__|
     */
    //$query_str = $http['search']['value'];
    $query_str = 'OR a AND b OR c NOT d OR';
    
    //check which search method is been requested
    $realFTS = (strpos($query_str, '|') === 0) ? true : false;
    
    //remove everythin except alphanumeric
    $query_str = preg_replace('/[^\p{L}\p{N}_]+/u', ' ', $query_str);
    //remove trailing and leading \s
    $query_str = preg_replace('/^\s|\s$/', '', $query_str);
    
    //convert string to array
    $str_array = explode(" ", $query_str);
    
    //if we do have a FTS search (search in FTS column instead of in the index)
    if ($realFTS)
    {
        //loop through the array
        foreach ($str_array AS $for_each)
        {
            //ignore first - its the '|' char - this has been stripped with the former regex
            //we dont need an AND on the beginning
            if ($counter == 0)
            {
                //create the select command "(COLUMN like '%SEARCHSTRING%')"
                $str_db = "(" . $fts_obverse . " like '%" . $for_each . "%') ";
            }
            //continue building search pattern
            elseif ($counter >= 1)
            {
                //check if there are special boolean operators e.g. OR
                if ($for_each === 'OR')
                {
                    $str_db = $str_db . "OR ";
                    $was_or = true;
                    continue; //save & skip, if there was an OR
                }
                //check if there is a NOT
                elseif ($for_each === 'NOT')
                {
                    $was_not = true;
                    continue; //save & skip, if there was an OR
                }
                elseif (!$was_or && !$was_not)
                { //dont add an AND if there was an OR before
                    $str_db = $str_db . "AND ";
                }
                //continue buildind the SQL query
                if ($was_not)
                {
                    $str_db = $str_db . "AND " . "(" . $fts_obverse . " not like '%" . $for_each . "%') ";
                }
                else
                {
                    $str_db = $str_db . "(" . $fts_obverse . " like '%" . $for_each . "%') ";
                }
                //reset values
                $was_or = false;
                $was_not = false;
            }
            $counter++;
        }
    }
    else //default; for the search the FTS index will be used
    {
        for ($i = 0;$i < count($str_array);$i++) //loop through the array
        {
            $str_nor = $str_array[$i]; //store value for later usage
            $str_rev = strrev($str_array[$i]); //create the reversed search string
            $found_op = false; //local variable/flag
            $last = ($i + 1 == count($str_array)) ? true:false; //eval. if its the last scan/run in the loop
    
            if ($i == 0) //first scan
            {
                //create the leading part of the SQL query
                $str_db = 'Match(' . $fts_obverse . ', ' . $fts_reverse . ') Against ("';
            }
    
            if (!$last) //if it's not the last scan
            {
                //local variables with the (reversed) content of the next array element
                $str_nor_next = $str_array[$i+1];
                $str_rev_next = strrev($str_array[$i+1]);
                switch($str_nor){
                    case 'AND': //if it's a AND, ignore AND and build corresponding string with next variable
                        $str_db = $str_db . " +(" . $str_nor_next . "* " . $str_rev_next . "*)";
                        $found_op = true;
                        break;
                    case 'NOT': //if it's a NOT, ignore NOT and build corresponding string with next variable
                        $str_db = $str_db . " -(" . $str_nor_next . "* " . $str_rev_next . "*)";
                        $found_op = true;
                        break;
                    case 'OR': //if it's a OR, ignore OR and build corresponding string with next variable
                        $str_db = $str_db . " (" . $str_nor_next . "* " . $str_rev_next . "*)";
                        $found_op = true;
                        break;
                    default: //if it's non of the above - build the default string (=AND)
                        $str_db = $str_db . " +(" . $str_nor . "* " . $str_rev . "*)";
                        break;
                }
            } // ignore last OPERATOR - if it's the last and ending with operator, do nothing
            elseif ($last && ($str_nor == 'AND' || $str_nor == 'NOT' || $str_nor == 'OR'))
            {
                null;
            }
            else //this is the last search word - without trailing operator, so create a default AND query
            {
                $str_db = $str_db . " +(" . $str_nor . "* " . $str_rev . "*)";
            }
        
            if ($found_op){
                $i++; //for skipping next scan in the loop
                if (count($str_array) - $i == 1 ) {
                    $last = true;
                }
            }
    
            if ($last) //last scan - add the trailing end to the query
            {
                $str_db = $str_db . '" in boolean mode)';
                break;
            }
        }
    }
    echo $str_db;
    
  • allanallan Posts: 63,761Questions: 1Answers: 10,510 Site admin

    That's outstanding - I can see this being used for an "advanced search" similar to what JIRA does for its JQL.

    Thanks for sharing it!

    Allan

  • tefdattefdat Posts: 42Questions: 7Answers: 3

    :smile:
    The quoted edited in this thread is not perfect and does not handle last/first Operators well with the "real_fts" search. The sandbox-edition is up2date.
    Not very well programmed, but i tested all combinations on my installation and with the query-log of MariaDB.

    The capabilities of the default search of the PHP libs are compared to the client-side version not that well. The result for every search pattern with more than words is not good and pretty slow on bigger tables.
    I see here a big potential :smiley:

    I really suggest to everyone to just spare another column where you concatenate your columns with SQL or preferably with VBA, Powershell and Index this column. Even on smaller tables. Reversed column are not really a must-have.
    But with it, the search is almost perfect and really fast.
    With the upper described routines pretty close to the real amazing search of the client side edition.


    So - I am glad, if I was able to inspire you to offer an improved search in one of the next releases :)

This discussion has been closed.