How to make a cumulative sum in Bigquery?

Asked

Viewed 282 times

0

Hello. I need to create a Dashboard from a query. I would like to generate this column in the query always adding up to the previous month, ie a cumulative sum.

ex: instead of:

mês 1: 2
mês 2: 3 
mês 3: 5
mês 4: 1

would be:

mês 1: 2
mês 2: 5
mês 3: 10
mês 4: 11

In my case:

SELECT count(id_produto) AS quantidade_ofertas, date_trunc(cast(datahora as date), month) as mes
FROM `TABELA1` 
WHERE DATE(process_time) < CURRENT_DATE()
and cast(datahora as date) < CURRENT_DATE()
group by date_trunc(cast(datahora as date), month)

Being that in this code above, generates a column with values for each month, and not the sum to the previous one (cumulative).

The idea of this query is to generate a Dashboard where I can filter by date without the 'accumulative' that the ferally provides, disturb.

Thank you in advance.

  • wouldn’t just be changing the count() for sum()?

  • No. The sum would add up the product ids. What I tried to do was with: SELECT Count(id_product) AS quantity_ofertas, date_trunc(cast(datahora as date), Month) as mes FROM TABELA1 &#xA;WHERE DATE(process_time) < CURRENT_DATE()&#xA;and cast(datahora as date) < CURRENT_DATE()&#xA;group by date_trunc(cast(datahora as date), month)&#xA;SELECT sum(quantidade_ofertas) as cumulativo, mes&#xA;from TAB1&#xA;group by quantidade_ofertas, mes&#xA;order by mes asc&#xA;Only it gives the same result, because it adds up the same months. I want it gone the month before.

  • I could do function composition, I’m without bigquery to test, but then you would do a sum(Count(id_product))

  • Retorno: Aggregations of aggregations are not allowed at [1:8]

  • It looks like you want to use a window Function: https://cloud.google.com/dataprep/docs/html/Window-Functions_57344749

1 answer

0

I think for your problem, you can divide it into 2 parts, first using groupby to find the ids in each month, and then using over to make the cumulative sum. Follow an example

with t1 as (
  select '2020-01-01' as data, 0101 as id_produto
  union all
  select '2020-01-01', 0102
  union all
  select '2020-02-01', 0201
  union all
  select '2020-02-01', 0202
  union all
  select '2020-02-01', 0203
  union all
  select '2020-03-01', 0301
  union all
  select '2020-03-01', 0302
  union all
  select '2020-03-01', 0303
),

t2 as (
  select 
    date_trunc(cast(data as date),  month) as data,
    count(id_produto) as id_produto
  from t1 
  group by 1
)

select
  data,
  sum(id_produto) over (order by data rows between unbounded preceding and current row)
from t2
--saida:
Linha   data       f0_  
1       2020-01-01 2
2       2020-02-01 5
3       2020-03-01 8

Browser other questions tagged

You are not signed in. Login or sign up in order to post.