Group by records by date and user id

Asked

Viewed 35 times

1

Good afternoon!

I have a table like this:

inserir a descrição da imagem aqui

I need to perform a SELECT that counts how many records I have per month and year, but grouping together the "repeated" records...(Records repeated in this case are the records as id 1, 2 and 3 that have the same user id and a time close to 30 seconds maximum, as these 3 records).

I was able to perform a SELECT, but I can’t remove those repeats.

SELECT count(id), YEAR(data_hora), MONTH(data_hora) FROM teste GROUP BY MONTH(data_hora), YEAR(data_hora)

inserir a descrição da imagem aqui

In this case, the records of the month 3 of 2021, should be grouped and instead of showing 3 records, should be only one...

I hope I have been able to explain, I accept suggestions to improve my question!

Ah, I’m using mariaDB 10.4.10

1 answer

2

Your query has a problem:

When doing the group per month/year only, will have the variation by id_user, so will not group that 3 as 1. IE, for the same month/year, will bring more than one.

Grouping by id_user will still bring more than one, as they are different id_user.

One solution is, first select the unrepeated records by id_user (here we will use a DISTINCT for this), along with month/year, without using the id as it is different for each record and will spoil the group, so:

SELECT distinct(id_user) id_user,
       YEAR(data_hora) ano, 
       MONTH(data_hora) mes 
  FROM teste;

This will return:

id_user ano     mes
2       2021    3
5       2021    1
6       2021    12
8       2021    2

Now that we have a single record per id_usuario/month/year, we can count and do the group by:

SELECT COUNT(id_user),
       ano,
       mes
 FROM
(SELECT distinct(id_user) id_user,
       YEAR(data_hora) ano, 
       MONTH(data_hora) mes 
  FROM teste) as sub
GROUP BY mes, ano;

You can see it working here: http://sqlfiddle.com/#! 9/3ca8a/8

To demonstrate a case that does not have in your example, I added another user in the same month/year, which will count with 2 in this case.

  • Thank you very much! Great answer, Ricardo. However, I still have a problem. Your SQL "groups" the same id_user if it occurs in the same month, and in fact, I can only group them together, when the difference between your schedules is 30 seconds. Example: if I have 3 user id 2 records, on the 15th, but their timestamps are different, 12/3/2021 10:00:00, 12/3/2021 10:00:05, 12/3/2021 09:00:00, in this case two should be grouped and one should not.

  • look 30 seconds will take a lot of work, need calculations... if it is an interim unit, for example days, hours or minutes, can add in the first query, as well as year and Month,e will work. Already 30 seconds may need a third query. You can try to convert tb to timestamp and see how much is the value of 30s and round, but as I said, need calculation

  • Good idea, I’ll try with 1 minute. Thank you very much, Ricado.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.