2

Say we have two tables that we combine with a left join:

Table 1

Joint Key || Attribute 1 || Attribute 2 || Attribute 3
A         || 1           || 11          || 21
B         || 2           || 12          || 22
C         || 3           || 13          || 23

Table 2

Joint Key || Attribute 4 || Attribute 5
A         || 31          || 41
A         || 32          || 42
C         || 33          || 43

Performing Table 1 left join Table 2 on "Joint Key" returns two records with Joint Key = 'A':

Joint Key || Attribute 1 || Attribute 2 || Attribute 3 || Attribute 4 || Attribute 5
A         || 1           || 11          || 21          || 31          || 41
A         || 1           || 11          || 21          || 32          || 42

What is the best practice, specifically in Snowflake, for defining a return policy so that the result has the same row count as Table 1?

Taking the above example, I want the record that has MAX(Attribute 4). Two initial ideas come to mind:

Option 1: use a "GROUP BY" clause. This requires listing the columns explicitly, which is cumbersome when the table has many columns.

Option 2: something like

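select * from (
  select
    Table1.*,
    Table2.Attribute_4,
    Table2.Attribute_5,
    -- highest Attribute_4 among the Table2 rows matching each key
    max(Table2.Attribute_4) over (partition by Table1.Joint_Key) as mx_Attribute_4
  from Table1
  left join Table2
    on Table1.Joint_Key = Table2.Joint_Key
) as temp
-- keep the row holding the maximum, plus Table1 rows with no match in Table2
where temp.Attribute_4 = temp.mx_Attribute_4
   or temp.mx_Attribute_4 is null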

It's quite complicated and time-consuming too.

Any other suggestions?

2 Answers

2

You could use QUALIFY.

Something like:

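select
    t1.Joint_key, t1.Attribute_1, t1.Attribute_2, t1.Attribute_3, t2.Attribute_4, t2.Attribute_5
from Table1 t1
left join Table2 t2
    on t1.Joint_key = t2.Joint_key
-- Joint_key exists in both tables, so the window columns are qualified;
-- nulls last keeps a NULL Attribute_4 from outranking real values under desc
qualify row_number() over(partition by t1.Joint_key order by t2.Attribute_4 desc nulls last) = 1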

This is certainly cleaner, and it should be more efficient than a GROUP BY. It still requires the query to sort records by Attribute_4, but I don't see a way of avoiding that unless you are OK with keeping any one of the matching sets of values rather than the one with MAX(Attribute_4). In that case you could be more efficient by using order by 1 in the row_number() window function.
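A variation on the same idea, offered only as a sketch and not as part of the original answer: deduplicate Table2 first and then join, which lets the outer query use t1.* instead of listing every Table1 column. The CTE name t2_top is made up for illustration; the table and column names are the ones from the question.

```sql
-- Sketch: reduce Table2 to one row per key before joining.
-- t2_top is a hypothetical name chosen for this example.
with t2_top as (
    select *
    from Table2
    -- keep the row with the highest Attribute_4 for each key
    qualify row_number() over (
        partition by Joint_key
        order by Attribute_4 desc nulls last
    ) = 1
)
select
    t1.*,
    t2_top.Attribute_4,
    t2_top.Attribute_5
from Table1 t1
left join t2_top
    on t1.Joint_key = t2_top.Joint_key;
```

Because t2_top holds at most one row per key, the left join cannot multiply Table1 rows, so the result has the same row count as Table1. With the sample data, key A comes back with Attribute_4 = 32 and Attribute_5 = 42, and key B comes back with NULL in both Table2 columns.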

1
  • Thank you! I tried something like this with a correlated subquery, but Snowflake doesn't support it. This is the answer I am after!
    – QPeiran
    Commented Sep 28, 2021 at 20:43
0

You seem to have some confused ideas about how joins work. If you have Table1 left join Table2 then it will return all the records from Table1 with any data from matching records in Table2 - so in your case you would normally get the 3 records from Table1.

However, in your case you have 2 records in Table2 that match 1 record in Table1, so this will duplicate your results and you will get 4 records: 2 with key A and then 1 with B and 1 with C.
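To make the four-record result concrete, here is a small self-contained sketch that rebuilds the question's sample data inline and runs the plain left join. The inline VALUES tables and the order by are additions for illustration only.

```sql
-- Sketch: the question's sample data, built inline for illustration.
with Table1 as (
    select * from (values
        ('A', 1, 11, 21),
        ('B', 2, 12, 22),
        ('C', 3, 13, 23)
    ) as t (Joint_key, Attribute_1, Attribute_2, Attribute_3)
),
Table2 as (
    select * from (values
        ('A', 31, 41),
        ('A', 32, 42),
        ('C', 33, 43)
    ) as t (Joint_key, Attribute_4, Attribute_5)
)
select
    t1.Joint_key, t1.Attribute_1, t1.Attribute_2, t1.Attribute_3,
    t2.Attribute_4, t2.Attribute_5
from Table1 t1
left join Table2 t2
    on t1.Joint_key = t2.Joint_key
order by t1.Joint_key, t2.Attribute_4;
-- Rows returned:
--   A  1  11  21  31    41
--   A  1  11  21  32    42
--   B  2  12  22  NULL  NULL
--   C  3  13  23  33    43
```

Key A appears twice because it matches two Table2 rows, and key B still appears once, with NULLs, because a left join keeps unmatched Table1 rows.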

Anyway, given the example data you've provided, please update your question with the result you want to achieve so that someone can help you.

1
  • I interpret the question as having a pretty good understanding of left join. The results in the question are only including the problem rows, which is the "duplicated" row with key 'A'. And the requested output is to have those two output rows consolidated into one row. Commented Sep 28, 2021 at 14:30
