Massive slow down with more than one leftjoin

Massive slow down with more than one leftjoin

globalplaneglobalplane Posts: 70Questions: 12Answers: 0

Link to test case:
Debugger code (debug.datatables.net):
Error messages shown:
Description of problem:

Possibly similar to https://datatables.net/forums/discussion/comment/191112

I'm using the Editor PHP library and trying to add multiple LEFT JOINS. With one, it seems to work fine. But if I add a second, then almost every time I try to load the page, it times out with a 504 error. It's worked a few times, I think without me changing anything.

Code that works:

->leftJoin(
        'mc_revs_language_id',
        'mc_revs_language_id.row_index = main_collection.main_index AND mc_revs_language_id.timestamp = (SELECT MAX(LIZ.timestamp) FROM mc_revs_language_id LIZ WHERE LIZ.row_index = main_collection.main_index)'
    )

Code that times out:

->leftJoin(
        'mc_revs_language_id',
        'mc_revs_language_id.row_index = main_collection.main_index AND mc_revs_language_id.timestamp = (SELECT MAX(LIZ.timestamp) FROM mc_revs_language_id LIZ WHERE LIZ.row_index = main_collection.main_index)'
    )
->leftJoin(
        'mc_revs_level',
        'mc_revs_level.row_index = main_collection.main_index AND mc_revs_level.timestamp = (SELECT MAX(LZ.timestamp) FROM mc_revs_level LZ WHERE LZ.row_index = main_collection.main_index)'
    )

This question has an accepted answers - jump to answer

Answers

  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0
    edited May 2023

    It varies too. Now it seems that two leftjoin()'s works fine, but four of them times out. (I need six total.)

    Here's the full code of the server-side script, with only one leftjoin active, the rest commented out:

    <?php
        ini_set('display_errors', 1);
        ini_set('display_startup_errors', 1);
        error_reporting(E_ALL);
    
        
    
    include( "/var/www/comprehensibleinputwiki.com/html/editor-php/lib/DataTables.php" );
    
    
     
    use 
        DataTables\Editor,
        DataTables\Editor\Field,
        DataTables\Editor\Format,
        DataTables\Editor\Mjoin,
        DataTables\Editor\Options,
        DataTables\Editor\Upload,
        DataTables\Editor\Validate,
        DataTables\Editor\SearchPaneOptions;
    
    
    
    $editor = Editor::inst( $db, 'main_collection', 'main_index' )
        ->fields(
            Field::inst( 'main_collection.main_index' ),
            Field::inst( 'main_collection.id' , 'vid')
                ->getFormatter( function ( $val, $data ) {
                return '<iframe height="200" src="https://www.youtube-nocookie.com/embed/' . $val . '" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>';
                } ),
            Field::inst( 'main_collection.channel', 'channelTitle' )
                ->searchPaneOptions( SearchPaneOptions::inst() ),
            Field::inst( 'main_collection.language_id', 'language' ),
            Field::inst( 'main_collection.title', 'title' ),
            Field::inst( 'main_collection.level', 'level' ),
            Field::inst( 'main_collection.enabled', 'enabled' )
    
        )
        ->leftJoin(
            'mc_revs_language_id',
            'mc_revs_language_id.row_index = main_collection.main_index AND mc_revs_language_id.timestamp = (SELECT MAX(LIZ.timestamp) FROM mc_revs_language_id LIZ WHERE LIZ.row_index = main_collection.main_index)'
        )/*
        ->leftJoin(
            'mc_revs_level',
            'mc_revs_level.row_index = main_collection.main_index AND mc_revs_level.timestamp = (SELECT MAX(LZ.timestamp) FROM mc_revs_level LZ WHERE LZ.row_index = main_collection.main_index)'
        )
        ->leftJoin(
            'mc_revs_enabled',
            'mc_revs_enabled.row_index = main_collection.main_index AND mc_revs_enabled.timestamp = (SELECT MAX(EZ.timestamp) FROM mc_revs_enabled EZ WHERE EZ.row_index = main_collection.main_index)'
        )
        ->leftJoin(
            'yt_channel_ids',
            'yt_channel_ids.channelId = main_collection.channel'
        )*/
        
        ->write( false )
        ->process( $_GET )
        ->json();
    
  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0

    Test case: https://comprehensibleinputwiki.com/ciwlibrary/test_case.php

    Currently working since it's only one left join, but if I uncommented the rest, it would just timeout and error trying to get the serverside script.

    Is there some sort of verbose log for the PHP library that might help me narrow down the issue?

  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0

    Ok, so it might actually be really slow SQL. I turned on debug (and it started actually retrieving the page for some reason, but now a new deprecated warning appears, so the JSON is broken: Deprecated: strcmp(): Passing null to parameter #1 ($string1) of type string is deprecated in /var/www/comprehensibleinputwiki.com/html/editor-php/lib/Editor/SearchPaneOptions.php on line 449

    Anyway, I looked at the SQL in the debug file, and I tried running just one of the five or so queries directly in mysql, when I have two left joins. It took about 4.5 seconds.

    SELECT 
      `main_collection`.`main_index` as 'main_collection.main_index', 
      `main_collection`.`id` as 'main_collection.id', 
      `main_collection`.`channel` as 'main_collection.channel', 
      `mc_revs_language_id`.`value` as 'mc_revs_language_id.value', 
      `main_collection`.`title` as 'main_collection.title', 
      `mc_revs_level`.`value` as 'mc_revs_level.value', 
      `main_collection`.`enabled` as 'main_collection.enabled' 
    FROM 
      `main_collection` 
      LEFT JOIN `mc_revs_language_id` ON mc_revs_language_id.row_index = main_collection.main_index 
      AND mc_revs_language_id.timestamp = (
        SELECT 
          MAX(LIZ.timestamp) 
        FROM 
          mc_revs_language_id LIZ 
        WHERE 
          LIZ.row_index = main_collection.main_index
      ) 
      LEFT JOIN `mc_revs_level` ON mc_revs_level.row_index = main_collection.main_index 
      AND mc_revs_level.timestamp = (
        SELECT 
          MAX(LZ.timestamp) 
        FROM 
          mc_revs_level LZ 
        WHERE 
          LZ.row_index = main_collection.main_index
      ) 
    ORDER BY 
      `main_collection`.`id` asc 
    LIMIT 
      10
    

    But if I run SQL that I had written previously that does almost the same thing, except has all six desired joins, it takes 0.030 seconds:

    WITH cte_enabled AS (
      SELECT 
        row_index, 
        value AS enabled_value, 
        (
          ROW_NUMBER() OVER (
            PARTITION BY mc_revs_enabled.row_index 
            ORDER BY 
              mc_revs_enabled.timestamp DESC
          )
        ) as enabled_seq 
      from 
        mc_revs_enabled
    ), 
    cte_level AS (
      SELECT 
        row_index, 
        value AS level_value, 
        (
          ROW_NUMBER() OVER (
            PARTITION BY mc_revs_level.row_index 
            ORDER BY 
              mc_revs_level.timestamp DESC
          )
        ) as level_seq 
      from 
        mc_revs_level
    ), 
    cte_language_id AS (
      SELECT 
        row_index, 
        value AS language_id_value, 
        (
          ROW_NUMBER() OVER (
            PARTITION BY mc_revs_language_id.row_index 
            ORDER BY 
              mc_revs_language_id.timestamp DESC
          )
        ) as language_id_seq 
      from 
        mc_revs_language_id
    ) 
    SELECT 
      main_index, 
      id, 
      title, 
      difficulty.full as difficulty, 
      yt_channel_ids.channelTitle as channel, 
      difficulty.list_order as difficulty_ordered 
    FROM 
      main_collection 
      INNER JOIN cte_enabled ON main_collection.main_index = cte_enabled.row_index 
      LEFT JOIN cte_level ON main_collection.main_index = cte_level.row_index 
      LEFT JOIN cte_language_id ON main_collection.main_index = cte_language_id.row_index 
      LEFT JOIN difficulty ON cte_level.level_value = difficulty.difficulty_index 
      LEFT JOIN languages ON cte_language_id.language_id_value = languages.language_id 
      LEFT JOIN yt_channel_ids ON channel = yt_channel_ids.channelId 
    WHERE 
      enabled_value = 1 
      AND (
        enabled_seq = 1 
        OR enabled_seq IS NULL
      ) 
      AND (
        language_id_seq = 1 
        OR language_id_seq IS NULL
      ) 
      AND (
        level_seq = 1 
        OR level_seq IS NULL
      ) 
    LIMIT 
      10;
    

    It looks like DataTable->sql() might let me use my own SQL to speed this up, so I'll try that.

  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0

    Ooh ok, so I just had to create a VIEW using my faster SQL, then reference that from Datatables. The data seems to be returning quickly.

    The only issue now is that the server is returning multiple Deprecated messages before the JSON, as I quoted in a previous comment, so the JSON can't be used for the table. So if anyone could tell me how to fix the issue or turn off the message, that'd be great, thanks.

  • allanallan Posts: 63,799Questions: 1Answers: 10,514 Site admin
    Answer ✓

    Hi,

    Good to hear you found a VIEW to make things afters. One thing about the left joins - the additional left joins weren't selecting anything from the database - i.e. mc_revs_enabled didn't have any data pulled from it. That might well be what was hurting performance.

    Deprecated: strcmp()

    Thanks for letting me know about that. We've been testing with PHP 8.2 which should have caught this, but we don't have null data in the SearchPanes example - so never saw it! I've committed the fix here.

    You can either update your SearchPaneOptions.php file with that change, or add ->order('channelTitle') to your SearchPaneOptions instance to get the SQL server to do the sorting.

    Allan

  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0
    edited May 2023

    One thing about the left joins - the additional left joins weren't selecting anything from the database - i.e. mc_revs_enabled didn't have any data pulled from it. That might well be what was hurting performance.

    I don't think that's it, because I was trying many combinations of SQL, including using data from every table that was joined. I think it's an issue with my database structure, which I'm investigating.

    You can either update your SearchPaneOptions.php file with that change

    Updated and works, thank you!

  • globalplaneglobalplane Posts: 70Questions: 12Answers: 0
    edited May 2023

    Looks like the issue was not having an INDEX on the row_index columns of the tables I was joining. It seems to work fast with the old code now.

  • allanallan Posts: 63,799Questions: 1Answers: 10,514 Site admin

    Ah - yes, no index on the joined column would do it.

    Thanks for letting me know it is working now.

    Allan

This discussion has been closed.