Merge 3 selects into a single select

Asked

Viewed 231 times

0

I’m using the Warehouse Snowflake, I have to make some selects

1° Select

select
    WAREHOUSE_NAME,
    TOTAL_ELAPSED_TIME,
    QUERY_TEXT,
    START_TIME
from
    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
where
    TOTAL_ELAPSED_TIME in (
        select
            max(TOTAL_ELAPSED_TIME)
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            START_TIME BETWEEN $ start_time
            and dateadd(days, 1, $ start_time)
            and WAREHOUSE_ID is not NULL
        group by
            WAREHOUSE_ID
    )
order by
    WAREHOUSE_ID;

2° Select

select
    max (NULLIF((QUEUED_OVERLOAD_TIME), 0)) MAX_Q_O_TIME,
    avg (NULLIF((QUEUED_OVERLOAD_TIME), 0)) AVG_Q_O_TIME,
    count (NULLIF((QUEUED_OVERLOAD_TIME), 0)) RECORDS_Q_O_TIME
from
    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
where
    START_TIME between $start_time
    and dateadd(days, 1, $start_time)
    and WAREHOUSE_ID is not NULL
group by
    WAREHOUSE_ID
order by
    WAREHOUSE_ID;

3° Select

select
    count(*) RECORDS_T_E_T,
    avg (TOTAL_ELAPSED_TIME) AVG_T_E_T
from
    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
where
    START_TIME between $ start_time
    and dateadd(days, 1, $ start_time)
    and WAREHOUSE_ID is not NULL
group by
    WAREHOUSE_ID
order by
    WAREHOUSE_ID;

Example of how to return the first select :

WAREHOUSE_NAME | TOTAL_ELAPSED_TIME | QUERY_TEXT | START_TIME
   WAREHOUSE_1 |               1000 |   EXEMPLE1 | 2019-10-29 01:03:25.548 -0700
   WAREHOUSE_2 |               2000 |   EXEMPLE2 | 2019-10-29 01:00:25.548 -0700

Example of how to return the second select :

MAX_Q_O_TIME | AVG_Q_O_TIME | RECORDS_Q_O_TIME
        1000 |         2000 |               10
        NULL |         NULL |                0

Example of how the third select returns :

RECORDS_T_E_T | AVG_T_E_T
          100 |       200
           50 |       300

They’re just examples, they’re not the same values

I would like to unite all this in one

WAREHOUSE_NAME | TOTAL_ELAPSED_TIME | QUERY_TEXT | START_TIME |  MAX_Q_O_TIME | AVG_Q_O_TIME | RECORDS_Q_O_TIME | RECORDS_T_E_T | AVG_T_E_T
   WAREHOUSE_1 |               1000 |   EXEMPLE1 | 2019-10-29 |          1000 |         2000 |               10 |          100 |        200
   WAREHOUSE_2 |               2000 |   EXEMPLE2 | 2019-10-29 |          NULL |         NULL |                0 |           50 |        300

I renamed some things and in the last start time I cut the hours & minutes&seconds to not get as big the example

I tried to do something like

select
    a.*,
    b.*,
    c.*
from
    (
        select
            WAREHOUSE_NAME,
            TOTAL_ELAPSED_TIME,
            QUERY_TEXT,
            START_TIME,
            WAREHOUSE_ID
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            TOTAL_ELAPSED_TIME in (
                select
                    max(TOTAL_ELAPSED_TIME)
                from
                    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
                where
                    START_TIME between $start_time
                    and dateadd(days, 1, $start_time)
                    and WAREHOUSE_ID is not NULL
             group by
            WAREHOUSE_ID
            )
        order by
            WAREHOUSE_ID
    ) as a,
    (
        select
            max (NULLIF((QUEUED_OVERLOAD_TIME), 0)),
            avg (NULLIF((QUEUED_OVERLOAD_TIME), 0)),
            count (NULLIF((QUEUED_OVERLOAD_TIME), 0))
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            START_TIME between $start_time
            and dateadd(days, 1, $start_time)
            and WAREHOUSE_ID is not NULL
        group by
            WAREHOUSE_ID
        order by
            WAREHOUSE_ID
    ) as b,
    (
        select
            count(*) RECORDS_T_E_T,
            avg (TOTAL_ELAPSED_TIME) AVG_T_E_T
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            START_TIME between $start_time
            and dateadd(days, 1, $start_time)
            and WAREHOUSE_ID is not NULL
        group by
            WAREHOUSE_ID
        order by
            WAREHOUSE_ID
    ) as c;

But it’s returning 64 Rows here ( I think because there are 4 Warehouses ), anyway, I’ve tried using Join, also, same problem, there is some command that I don’t know yet?

If I try to use UNION, something like :

select
    a.*
from
    (
        select
            WAREHOUSE_NAME,
            TOTAL_ELAPSED_TIME,
            QUERY_TEXT,
            START_TIME,
            WAREHOUSE_ID
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            TOTAL_ELAPSED_TIME in (
                select
                    max(TOTAL_ELAPSED_TIME)
                from
                    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
                where
                    START_TIME between $start_time
                    and dateadd(days, 1, $start_time)
                    and WAREHOUSE_ID is not NULL
            )
        group by
            WAREHOUSE_ID
        UNION
        select
            max (NULLIF((QUEUED_OVERLOAD_TIME), 0)),
            avg (NULLIF((QUEUED_OVERLOAD_TIME), 0)),
            count (NULLIF((QUEUED_OVERLOAD_TIME), 0))
        from
            SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
        where
            START_TIME between $start_time
            and dateadd(days, 1, $start_time)
            and WAREHOUSE_ID is not NULL
        group by
            WAREHOUSE_ID
    ) a
order by
    WAREHOUSE_ID;

I get the error :

SQL Compilation error: invalid number of result Columns for set Operator input branches, expected 5, got 3 in branch {2}

I don’t have the same data types in both selects, I can’t use UNION.

  • 1

    Possible duplicate of SQL helps to join 2 selects

  • No, even trying so, only returns error, maybe it is pq the Snowflake does not accept any of these functions, in general it only accepts things from mysql.

  • As far as I know Snowflake does not accept this function, try using Mysql Workbench is more likely to get there by the variety of features

  • 1

    Your Union is wrong. Understand the concept before you apply it. I bet you will succeed!

  • I’ve been trying for 3 days, I used everything I could, in many ways, I couldn’t, so I came here to ask, the example you gave is not equivalent, I tried to apply it, the same problem anyway.

  • Play an example of the data and selects in db-fiddle, that I do Union for you.

Show 1 more comment

1 answer

3


Have you tried using WITH QUERIES to unite this data?

Would look like this:

WITH 
query1 AS 
(
   select
       WAREHOUSE_ID,
       WAREHOUSE_NAME,
       TOTAL_ELAPSED_TIME,
       QUERY_TEXT,
       START_TIME
   from
       SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
   where
       TOTAL_ELAPSED_TIME in (
           select
               max(TOTAL_ELAPSED_TIME)
           from
               SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
           where
               START_TIME BETWEEN $ start_time
               and dateadd(days, 1, $ start_time)
               and WAREHOUSE_ID is not NULL
           group by
               WAREHOUSE_ID
       )
),
query2 AS 
(    
   select
       WAREHOUSE_ID,
       max (NULLIF((QUEUED_OVERLOAD_TIME), 0)) MAX_Q_O_TIME,
       avg (NULLIF((QUEUED_OVERLOAD_TIME), 0)) AVG_Q_O_TIME,
       count (NULLIF((QUEUED_OVERLOAD_TIME), 0)) RECORDS_Q_O_TIME
   from
       SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
   where
       START_TIME between $start_time
       and dateadd(days, 1, $start_time)
       and WAREHOUSE_ID is not NULL
   group by
       WAREHOUSE_ID
),
query3 AS 
(  
   select
       WAREHOUSE_ID,
       count(*) RECORDS_T_E_T,
       avg (TOTAL_ELAPSED_TIME) AVG_T_E_T
   from
       SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
   where
       START_TIME between $ start_time
       and dateadd(days, 1, $ start_time)
       and WAREHOUSE_ID is not NULL
   group by
       WAREHOUSE_ID
)
SELECT 
   query1.WAREHOUSE_NAME,
   query1.TOTAL_ELAPSED_TIME,
   query1.QUERY_TEXT,
   query1.START_TIME,
   query2.MAX_Q_O_TIME,
   query2.AVG_Q_O_TIME,
   query2.RECORDS_Q_O_TIME,  
   query3.RECORDS_T_E_T,
   query3.AVG_T_E_T
FROM query1
INNER JOIN query2 ON query2.WAREHOUSE_ID = query1.WAREHOUSE_ID
INNER JOIN query3 ON query3.WAREHOUSE_ID = query1.WAREHOUSE_ID
ORDER BY 
       queyr1.WAREHOUSE_ID
  • No, nor knew it was possible, I will study about, but the result returned right, thank you! Just one note, you typed queyr in select instead of query'.

  • Really was wrong, thanks for warning. It’s already fixed.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.