Left Join - Aggregate Function - Dynamic Variable Computing Aggregate

Left Join - Aggregate Function - Dynamic Variable Computing Aggregate

iqvctiqvct Posts: 45Questions: 7Answers: 0
edited May 2022 in Free community support

I'm having a hard time wrapping my mind around the best way to accomplish this.

I will simplify the scenario as much as possible to focus on the end goal.

Below are the two tables which are stored in a SQL Server database:

UserBalance

UserId FirstName LastName Subtotal
1 Airi Satou 500
2 Ashton Cox 500
3 Bradley Greer 500

Payments

UserId Month Payment
1 1 100
2 1 200
3 1 200
1 2 200
2 2 100
3 2 300
1 3 400
2 3 100
3 3 200

The plan is for the front end to allow the user to select a month. If month 3 is selected, all months 1-3 will be summed up by UserId. This summed up total is then joined back to the UserBalance table.

From a SQL syntax perspective, we'd be looking for something like the following to compute the aggregate field:
(Month would be derived from a select box on the front end)

SELECT UserId, SUM(Payment) AS TotalPayments FROM Payments WHERE Month <= 3 GROUP BY UserId

That would generate the below output:

TempTable

UserId TotalPayments
1 700
2 400
3 700

The final table would .LeftJoin the UserBalance table with the TempTable on UserId and output as such:
(Total field rendered in JS)

UserLiveBalance

UserId FirstName LastName Subtotal TotalPayments Total
1 Airi Satou 500 700 1200
2 Ashton Cox 500 400 900
3 Bradley Greer 500 700 1200

I understand the basics of .LeftJoin(). I have also worked with bringing an HTML Select box field value into the controller with request.Form["month"]. I'm not starting completely blind, but it seems the issue is the way my approach is sequencing the process. I'm making some type of temporary table that I'm trying to join to which I'm just not sure how to accomplish.

Any suggestions on the best logic to go about accomplishing this?

This question has an accepted answers - jump to answer

Answers

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    edited May 2022

    You can't do SELECT SUM and GROUP BY with Editor - and you don't need to either!

    Just make suitable SQL views and left join them. There's only one disadvantage: Usually you can't update those views. But that is probably not required in your case, I guess.

    I'm making some type of temporary table that I'm trying to join to which I'm just not sure how to accomplish.

    What you call "temporary table" is just a different word for "view" :smile:

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422

    Here is an example involving two views. Here I leftJoin with a view that references another view because I couldn't get my "type of temporary table" in one step. I use "->set( false )" because I don't want to update the view which I couldn't anyway.

    The table is called "contract" and the left joined table which is the view is called "contract_exp_date".

    ....
    Field::inst( 'contract_exp_date.exp_date' )->set( false )
                ->getFormatter( function ( $val, $data, $opts ) {
                    return getFormatterDate($val);                   
                } ),
    .....
    ->leftJoin( 'contract_exp_date', 'contract.id', '=', 'contract_exp_date.contract_id')
    

    The view:

    CREATE VIEW `contract_exp_date` AS
      SELECT max(max_exp_date) AS `exp_date`, contract_id
        FROM contract_exp_dates_input 
    GROUP BY 2
    

    The view referenced in the view above:

    CREATE VIEW `contract_exp_dates_input` AS
       SELECT MAX(a.end_date) AS `max_exp_date`, c.id AS `contract_id`
         FROM `fixed` a
    LEFT JOIN contract_has_fixed b ON b.fixed_id = a.id
    LEFT JOIN `contract` c ON b.contract_id = c.id
         WHERE a.end_date IS NOT NULL 
           AND a.plan_element < 1 
      GROUP BY 2
               UNION ALL
       SELECT MAX(a.end_date) AS `max_exp_date`, c.id AS `contract_id`
         FROM `variable` a
    LEFT JOIN contract_has_variable b ON b.variable_id = a.id
    LEFT JOIN `contract` c ON b.contract_id = c.id
        WHERE a.end_date IS NOT NULL 
          AND a.plan_element < 1 
     GROUP BY 2
    
  • iqvctiqvct Posts: 45Questions: 7Answers: 0
    edited May 2022

    @rf1234 Thank you for the thorough response!

    I actually tend to use views in almost all of my DataTables instances rather than doing joins in the controller. I create what I want the user to see on the backend in SQL and then simply reference that view.

    Unfortunately, unless I'm missing something, I'm not seeing how the example you're providing would solve this critical part of the issue:

    The plan is for the front end to allow the user to select a month. If month 3 is selected, all months 1-3 will be summed up by UserId. This summed up total is then joined back to the UserBalance table.

    I somehow need the user to be able to make a selection which will impact the values returned from the view.

    The psuedo code would be something like :

    SELECT UserId, SUM(Payment) AS TotalPayments 
    FROM Payments 
    WHERE Month <= (USER INPUT)
    GROUP BY UserId
    

    In the past I've used the controller and where clauses to modify the data returned from a table (or view), but in this case, I need a View to join to another View where the second view is made on the fly based on the input of a user.

    Let me know if that makes sense.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422

    Unfortunately, unless I'm missing something, I'm not seeing how the example you're providing would solve this critical part of the issue:

    You are right, the example doesn't provide that. And it wasn't meant to :smile:. You didn't mention that you are using views already. Hence I thought just using views could be a solution for you.

    But there is one important thing about views that you might have overlooked: All the values that you select from a view can be used in the where clause of a query on that view!

    What does that mean?
    You would need to create a view that reads the "month" (or a period until the end of that month) so that you can use this in the where clause of a statement querying the view. This query could use e.g. a POST variable in the where clause containing the user input.

    If you need an example (no, not precisely yours, but one that makes my point) just let me know.

  • iqvctiqvct Posts: 45Questions: 7Answers: 0
    edited May 2022

    If you need an example (no, not precisely yours, but one that makes my point) just let me know.

    I'll never say no to an example to provide more clarity, so yes, and example would be tremendously helpful.

    I understand creating a view that houses all of the data. And then a where clause controlled by a user which can filter the data so we're only seeing data that is <= a certain month.

    But I'm still not seeing how to get around the need to ultimately do a SUM/GROUP BY after the data has been trimmed down.

    Sorry, I'm clearly missing your point :#

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    edited May 2022

    Too much overhead in my example, I guess. So I tried with your pseudo code.

    SELECT UserId, month, SUM(Payment) AS TotalPayments
    FROM Payments
    GROUP BY 1, 2
    

    And in the where clause you specifiy what that month is based on the user input. That should make the "group" month to collapse to one - namely the month specified.

    So

    WHERE Month <= (USER INPUT)
    

    is in the where clause and not in the view defintion.

    No I guess it wont collapse to one because there are several months <= user input, but you know what I mean. You'll find a way :smile:

    If "month" was a "date_until" it would work.

    WHERE date_until <= (User Input)
    
    SELECT UserId, date_until, SUM(Payment) AS TotalPayments
    FROM Payments
    GROUP BY 1, 2
    
  • iqvctiqvct Posts: 45Questions: 7Answers: 0
    edited May 2022

    In your SQL code, what is the

    GROUP BY 1, 2
    

    meant to represent?

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422

    column 1 and column 2 of the view. i.e. UserId and date_until.

  • iqvctiqvct Posts: 45Questions: 7Answers: 0

    No I guess it wont collapse to one because there are several months <= user input, but you know what I mean. You'll find a way :smile:

    Hah! I think this is primarily where I'm lost to be honest. I understand it's feasible to get all of that data in front of the user, but in my use case, if there are 300 people with 3 months each I need the result to be 300 lines, not 900 lines.

    I've thought of ways of getting "outside-the-box" creative, like adding a "RunningTotal" field to a new view.

    The new view would look like this:

    Payments

    UserId Month Payment RunningTotal
    1 1 100 100
    1 2 200 300
    1 3 400 700
    2 1 200 200
    2 2 100 300
    2 3 100 300
    3 1 200 200
    3 2 300 500
    3 3 200 700

    Then if a user selected 3 for month, we'd only pull in the single line and not have to do any sums.

    I suppose I could do some sort of SUM/OVER function to accomplish above idea:

    SELECT UserId,Month,
      SUM(Payment) OVER (ORDER BY UserId)
      AS RunningTotal
    FROM Payments
    WHERE Month = 3 --(USER INPUT)
    

    Then I'd get

    UserId Month RunningTotal
    1 3 400
    2 3 500
    3 3 700

    Am I over complicating this and still missing your point or does this seem like a logical solution?

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    edited May 2022

    If you work with a "date_until" it would work. You will get your 300 lines and not 900. How you render "date_until" and what values of date until you allow as user input is also only a question of rendering and the right dropdowns.

    You can have a dropdown for the user with months 1 to 12. And if they choose month 3 you would "translate" this into March 31st for your query etc. Hence you would take all payments made from Jan 1st until March 31st as the "sum".

    If I had your data model it would be a bit easier. But let me try again with pseudo code. Let's assume your table has a "payment_date".

    The where clause would be:

    WHERE payment_date BETWEEN 'year of user input-01-01' AND (User Input)
    
    SELECT UserId, payment_date, SUM(Payment) AS TotalPayments
    FROM Payments
    GROUP BY 1, 2
    

    This means your new view would look differently from what you think :smile:

  • iqvctiqvct Posts: 45Questions: 7Answers: 0
    edited May 2022

    Sorry, I feel like I'm wasting your time at this point, but I still am not following.

    Stepping back, the way I'm seeing this is that there is a View created within my database. The SQL syntax that generates this view is static and is not going to be modified by the controller.

    The user input, which modifies the controller, is filtering data retrieval from the view. It's not going to modify the underlying SQL which generates the view.

    So in my presented example here that calculation of SUM was already done in the new RunningTotal field. Now we're only selecting which calculated value we want by choosing the associated Month.

    ...and as I'm typing this out I'm starting to possibly understand my misunderstanding. Is this disconnect essentially the order of operation in SQL? Is the WHERE clause that I add into my controller going to modify the View before the SELECT statement including the sum occurs? Does the WHERE clause actually modify the underlying table values because of order of operation?

    EDIT: And to clarify the Month field, this alone should work as there is specifically a field in the data called Month with an integer value. A WHERE clause stating WHERE Month <= 3 --(User Input) should be all I need to sum data from months 1-3.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    edited May 2022

    I tested something similar.

    You would need to modify this:

    WHERE payment_date BETWEEN 'year of user input-01-01' AND (User Input)
    
    SELECT UserId, payment_date, SUM(Payment) AS TotalPayments
    FROM Payments
    GROUP BY 1
    

    you only want to group this by user id. Then you get the sum by user for the payment dates between "beginning of the year" and "end of your respective month".

    Depending on your SQL database settings this might cause problems because a full group by on all non-aggregated columns might be enforced. That would break the solution and currently I wouldn't know an alternative

    Is the WHERE clause that I add into my controller going to modify the View before the SELECT statement including the sum occurs? Does the WHERE clause actually modify the underlying table values because of order of operation?

    No the where clause is a parameter for your view.

    Just think of the columns you select in your view as a table. That's all. Not more than that. Then you do what you do with a table. That's it.

  • iqvctiqvct Posts: 45Questions: 7Answers: 0
    edited May 2022

    As I saw your response come through I was literally typing out something very similar. Thought I was finally understanding, but then realized the GROUP BY of Month would create an issue and not allow the data to be aggregated:

    Column 'dbo.Payments.Month' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

    Which seems to reinforce my suspicions earlier. Since we're not modifying the underlying View, we must include Month so we can filter by Month. But if that's the case, the GROUP BY component won't aggregate. Bummer.

    My RunningTotal solution from above may ultimately be the way to go if there's no way to enforce a full aggregate so that we're only grouping by UserId.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422

    Yep, looks like you are having this database problem. You would need to find a work around then.

  • iqvctiqvct Posts: 45Questions: 7Answers: 0

    Certainly appreciate the back and forth. I'll leave this open a little longer to see if there are any other suggestions/potential workarounds.

    Otherwise I'll mark one of your responses as an answer.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    edited May 2022

    Nesting could help here, too. Like in my example above. In a first step with view "solution _input" you would select the "TotalDailyUserPayments" until the respective payment_date. After that you would aggregate those payments to the "TotalUserPayments". In the where clause you specify the relevant dates of "solution_input". You would select "TotalUserPayments" and you would need to leftJoin both views.

    CREATE VIEW `solution_input` AS
    SELECT UserId, payment_date, SUM(Payment) AS TotalDailyUserPayments
    FROM Payments
    GROUP BY 1, 2
    
    CREATE VIEW `solution` AS
    SELECT UserId, SUM(TotalDailyUserPayments) AS TotalUserPayments
    FROM solution_input
    GROUP BY 1
    
    ->leftJoin( 'solution_input', 'Payments.UserId', '=', 'solution_input.UserId')
    ->leftJoin( 'solution', 'solution_input.UserId', '=', 'solution.UserId')
    
    ....
    WHERE solution_input.payment_date BETWEEN 'year of user input-01-01' AND (User Input)
    

    The WHERE clause would of course be in PHP or .NET or whatever you are using with Editor.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422

    As an alternative you can do an Mjoin, return all of the respective user payments to the client and sum them up on the client side. You'll find quite some examples on this in the forum.

  • rf1234rf1234 Posts: 3,024Questions: 88Answers: 422
    Answer ✓

    And this is my last one, I swear :smile:

    I usually find Mjoins a bit unflexible when it comes to more complex joins over multiple tables. They don't work for that. Complex views can be cumbersome, too.

    So in many cases I use solutions like this: I embed my ownSQL statements into Editor using Editor's db-handler.

    This simple statement should do the trick. All you need to return from the server is the user_id and the sum of the user_payments for the respective period. You would need to post the period's start and end dates to the server of course.

    The only downside is that the SQL statement gets executed for each record selected by Editor. But that doesn't matter too much unless you are retrieving 100,000 records at a time.

    Editor::inst( $db, 'payments' )
        ->field(
            Field::inst( 'user_id' ),
            Field::inst( 'user_id AS user_payments' ) ->set( false )
                ->getFormatter( function ( $val, $data, $opts ) use ( $db ) {
                        $row = $db->raw()
                            ->bind( ':date_from',   $_POST["date_from"] )
                            ->bind( ':date_until',  $_POST["date_until"] )
                            ->bind( ':user_id',     $val )    
                            ->exec('SELECT COALESCE(SUM(payment), 0) AS user_payments
                                      FROM `payments`
                                     WHERE user_id  = :user_id
                                       AND payment_date BETWEEN :date_from
                                                            AND :date_until')
                            ->fetch(PDO::FETCH_ASSOC);
                        return $row["user_payments"];
                    } ), ...
        )
    

    If that doesn't work because Editor doesn't do a SELECT DISTINCT on user_id you can still add the column after doing "array_unique" on the user id using the "postGet" event handler. I've done that too when reading the log where I cannot apply a where clause to fields embedded in a JSON string.

    Here I want the server to return the sum of an amount column as the final row. I call it "row_0" to avoid overlap with Editor's DT_RowId. The amount column itself should be read from the database but not returned from the server. That's why I unset it after creating the sum with PHP. $data is passed by reference so I can manipulate it before it gets returned to the client.

    ->on( 'postGet', function ( $e, &$data, $id ) use ( $lang ) { 
        if ( count($data) <= 0 ) {
            return;
        }
        $sum = array_sum(array_column($data, "unformattedAmount"));
        $keys = array_keys($data);
        foreach ( $keys as $key ) {
            unset($data[$key]["unformattedAmount"]);
        }
        if ( $sum != 0 ) {
            $data[] = [ "DT_RowId" => "row_0", //in reality there is no row 0 because it is derived from the SQL id
                        "sub_earmark" => 
                            [ "ctr_id"           => $_POST['ctr_id'], 
                              "earmark_subject"  => "",
                              "earmark_exp_date" => $lang === "de" ? "Summe" : "Total",
                              "earmark_amount"   => getFormatterAmount($sum) ] ];
        }
    })
    
  • iqvctiqvct Posts: 45Questions: 7Answers: 0

    A welcome sight to start the day with, multiple alternatives! I certainly appreciate the effort in writing these up @rf1234

    Nesting could help here too.

    I follow the logic in this one and it makes complete sense. I've been going back and forth on whether or not to go with the nested view, or if I should make that single view with a running total. To that point though mine would still need updating. I left out the PARTITION BY component which is straight forward, but I'd have to think about how the WHERE clause would get integrated since it would throw off the running total as the SUM function would happen after the WHERE clause.

    So in many cases I use solutions like this: I embed my ownSQL statements into Editor using Editor's db-handler.

    I believe I follow the general flow here, but if I'm being honest with myself, it's likely outside the scope of my current understanding to try and translate this to .NET and apply specifically to my exact scenario and not this watered down for simplicity scenarios we have created.

    I'm just a finance guy teaching myself this stuff as I go. I may give it a shot, but that tends to take me down many rabbit holes. A great learning experience! But always time consuming :|

Sign In or Register to comment.