SQL character encoding issue:if target column is encoded with ut8mb4_unicode_ci

SQL character encoding issue:if target column is encoded with ut8mb4_unicode_ci

tefdattefdat Posts: 42Questions: 7Answers: 3

Hi,
I was using encoding utf8_general_ci in the database previously.
Switched to another table, where I decided to select for encoding utf8mb4_unicode_ci
(for maximum coverage of smilies and asian characters).

This produces headaches currently. I am getting following alert -
if I use special character (ü, ö, ä, ß etc.) in the search pattern:
DataTables warning: table id=example - JSON encoding error: Malformed UTF-8 characters, possibly incorrectly encoded
In Chrome debugger:
error: "JSON encoding error: Malformed UTF-8 characters, possibly incorrectly encoded"

The logged (shortened here) datatables query (MySQL log) is [Remark: I am using an FTS Index]:
SELECT id, CreateDate, QualificationDate,..... FROM mytable WHERE (Match(fts_obverse, fts_reverse) Against ("+(w眉nscht* thcsn济w*)" in boolean mode)) ORDER BY id DESC LIMIT 0, 250
The special character 'ü' is encoded strangly, but its working - if I switch back the two columns back to **utf8_general_ci.**

I used the search here in the forum and in stackoverflow and tried the hints / workarounds. No fix so far.
What I tried:
* issue is happening with Editor-SQL Libs (which I am currently using) as well with the free ssp.class.php.
* ssp_class_php: i tried to add the $sql_details array 'dsn' => 'charset=utf8mb4_unicode_ci'
* Editor-Libs: i tried to add to the $db->sql("SET character_set_client=utf8"); $db->sql("SET character_set_connection=utf8"); $db->sql("SET character_set_results=utf8");

What I can do else - any hints?
Thx :)

This question has an accepted answers - jump to answer

Answers

  • allanallan Posts: 63,214Questions: 1Answers: 10,415 Site admin
    Answer ✓

    Try:

    $db->sql('set names utf8mb4 collate utf8mb4_unicode_ci');
    

    Allan

  • tefdattefdat Posts: 42Questions: 7Answers: 3
    edited October 2020

    Thank you Allen. Adding this line is adding following line before starting the SQL transaction, right?
    Query set names utf8mb4 collate utf8mb4_unicode_ci
    Query START TRANSACTION

    This was not the issue, but found it - my bad :p
    There was a bug made by me, which where present the whole time but did not bring out any fault with utf8 encoding. Just now with utfmb4.

    After I change in Notepad++ the proper character encoding UTF8 i figured out:
    WHERE (Match(fts_obverse, fts_reverse) Against (" +(wünsch* hcsn¼÷*)" in boolean mode)

    The PHP strrev function does not support UTF8 :s
    This code from php.net does do the trick:
    $chars = mb_str_split($str_array[$i], 1, null ?: mb_internal_encoding());
    $str_rev = implode('', array_reverse($chars));

    Now everything is good:
    WHERE (Match(fts_obverse, fts_reverse) Against (" +(wünsch* hcsnüw*)" in boolean mode)

    Thank you again!

This discussion has been closed.