Select Distinct is bringing duplicate Mysql data

Asked

Viewed 721 times

0

I need to bring only the last record of each id_face that has in the table imagens, but for some reason it’s bringing id_faces duplicates.

Follow my consultation:

SELECT 
   DISTINCT(i.id_face) as id_face, 
   i.id_imagens as id_imagens, 
   date_format(i.dt_cad, '%d/%m/%Y \à\s %H:%i') as hora_visita, 
   df.nome, 
   df.email, 
   df.cpf,  
   df.celular, 
   date_format(df.data_nasc, '%d/%m/%Y') as idade, 
   CASE 
      WHEN df.idade_api < 15 THEN '14 ou -' 
      WHEN df.idade_api BETWEEN 15 AND 24 THEN '15 a 24' 
      WHEN df.idade_api BETWEEN 25 AND 34 THEN '25 a 34' 
      WHEN df.idade_api BETWEEN 35 AND 44 THEN '35 a 44' 
      WHEN df.idade_api BETWEEN 45 AND 54 THEN '45 a 54' 
      WHEN df.idade_api >= 55 THEN '55 ou +' END as idade_api, 
   CASE 
      WHEN df.genero_api ='f' THEN 'Feminino' ELSE 'Masculino' END as genero, 
   g.nome as grupo_nome, 
   l.nome_loja 
FROM imagens i 
JOIN dados_face df on i.id_face=df.id_face 
JOIN loja l on df.id_loja=l.id_loja 
LEFT JOIN grupo g on g.id_grupo = df.id_grupo 
JOIN iot on iot.id_user='1' WHERE i.dt_cad BETWEEN '2018-09-30 00:00:00' AND '2018-12-04 23:59:59.999999' 
AND df.id_grupo like '%' 
AND l.id_user='1' 
AND i.id_iot in (SELECT id_iot from iot where id_user='1') 
order by i.dt_cad DESC limit 0,24

This consultation brings me this:

inserir a descrição da imagem aqui

Note that the id_face 2355 repeats and so does everyone else. I tried to make a group by id_face but then it gets much worse, because the dates are displayed as disorderly as in the image below:

inserir a descrição da imagem aqui

Could someone assist me in correcting this query?
Below is the structure of the tables:

inserir a descrição da imagem aqui

Here’s another query I tried using MAX

select 
    distinct(df.id_face),
    MAX(i.id_imagens) as id_imagens, 
    MAX(date_format(i.dt_cad, '%d/%m/%Y \à\s %H:%i')) as hora_visita,
    df.nome,
    df.email,
    df.cpf,                                     
    df.celular,
    date_format(df.data_nasc, '%d/%m/%Y') as idade, 
    CASE WHEN df.idade_api < 15 THEN '14 ou -'
        WHEN df.idade_api BETWEEN 15 AND 24 THEN '15 a 24'
        WHEN df.idade_api BETWEEN 25 AND 34 THEN '25 a 34'
        WHEN df.idade_api BETWEEN 35 AND 44 THEN '35 a 44'
        WHEN df.idade_api BETWEEN 45 AND 54 THEN '45 a 54'
        WHEN df.idade_api >= 55 THEN '55 ou +'
    END as idade_api,                                   
    CASE WHEN df.genero_api ='f' THEN 'Feminino' ELSE 'Masculino' END as genero,
    g.nome as grupo_nome,
    loja.nome_loja
    from dados_face df
    join imagens i on i.id_face=df.id_face  
    LEFT JOIN grupo g on g.id_grupo = df.id_grupo
    JOIN iot on i.id_iot = iot.id_iot   
    JOIN loja on iot.id_loja =  loja.id_loja                                    
    where 
    i.dt_cad BETWEEN '".$data_incial.":00' AND '".$data_final.":59.999999'
    $id_grupo
    AND i.id_iot IN (select id_iot from iot where id_user='".$_SESSION['usuario']['id_user']."')
    AND df.id_loja IN (select id_loja from loja where id_user='".$_SESSION['usuario']['id_user']."')
  • 1

    I don’t use mysql, but usually distinct or group by will group tuples when they are completely equal. In the case of id 2355 some information of the two records are different as, for example, id_imagens and hora_visita.

  • I guess you had to use MAX().

  • I tried to use max and still not rolled I will show my query with max in the post ( I will edit )

  • Tried a subselect with MAX ?

3 answers

1

I need to bring only the last registration of each id_face that has on table images but for some reason is bringing duplicate id_faces

Just add to your where a clause that will limit the last record based on the field dt_cad and id_face table imagens.
Simply include the following restriction in your where :

AND i.dt_cad = (Select max(dt_cad) from imagens i2 where i2.id_face = i.id_face)  

Adapting your original query, your command SQL will be complete:

SELECT 
   DISTINCT(i.id_face) as id_face, 
   i.id_imagens as id_imagens, 
   date_format(i.dt_cad, '%d/%m/%Y \à\s %H:%i') as hora_visita, 
   df.nome, 
   df.email, 
   df.cpf,  
   df.celular, 
   date_format(df.data_nasc, '%d/%m/%Y') as idade, 
   CASE 
      WHEN df.idade_api < 15 THEN '14 ou -' 
      WHEN df.idade_api BETWEEN 15 AND 24 THEN '15 a 24' 
      WHEN df.idade_api BETWEEN 25 AND 34 THEN '25 a 34' 
      WHEN df.idade_api BETWEEN 35 AND 44 THEN '35 a 44' 
      WHEN df.idade_api BETWEEN 45 AND 54 THEN '45 a 54' 
      WHEN df.idade_api >= 55 THEN '55 ou +' END as idade_api, 
   CASE 
      WHEN df.genero_api ='f' THEN 'Feminino' ELSE 'Masculino' END as genero, 
   g.nome as grupo_nome, 
   l.nome_loja 
FROM imagens i 
JOIN dados_face df on i.id_face=df.id_face 
JOIN loja l on df.id_loja=l.id_loja 
LEFT JOIN grupo g on g.id_grupo = df.id_grupo 
JOIN iot on iot.id_user='1' WHERE i.dt_cad BETWEEN '2018-09-30 00:00:00' AND '2018-12-04 23:59:59.999999' 
AND df.id_grupo like '%' 
AND l.id_user='1' 
AND i.id_iot in (SELECT id_iot from iot where id_user='1') 
AND i.dt_cad = (Select max(dt_cad) from imagens i2 where i2.id_face = i.id_face)  
order by i.dt_cad DESC limit 0,24  

This way, for each collection record, the result will be limited to the maximum dt_cad (registration date) for each id_face table imagens.

The use of LEFT JOIN grupo g on g.id_grupo = df.id_grupo , check the possibility to exchange it only by Join and in this way may also withdraw the DISTINCT(i.id_face) of its original consultation.

About the use of DISTINCT

The DISTINCT in the original query of the question did not take effect, as this command depends on all fields displayed in the selection and not only the one placed in parentheses, thus to remove duplicate lines, it would be necessary for the contents of all fields to have the same values.
For example, consider a table called tabela_exemplo next:

campo1 peasant 2 peasant 3
1 1 1
1 2 1
1 1 2
1 1 1
2 1 1

When executing the query:

Select distinct(campo1), campo2, campo3 from tabela_exemplo  order by campo1;

Will be equivalent to using the consultation:

Select distinct * from tabela_exemplo order by campo1;

Whose result in both will be:

campo1 peasant 2 peasant 3
1 2 1
1 1 1
1 1 2
2 1 1

Now, consider in the selection only the first two fields, as in the following command:

select distinct(campo1), campo2 from tabela_exemplo order by campo1;

The result would be:

campo1 peasant 2
1 1
1 2
2 1

In this way, it is also clear why DISTINCT does not seem to have, worked. The values to be displayed in the fields need to be equal so that there is the deletion of one of the lines.

0

Distinct will not bring one line per id, so you will need to define which lines of that id you will bring and filter in the Where to come just one line, or, make a cursor. For example, if you only want one line per id with the line with the last visit date, you can use an Inner Join again with the images table (I1 and I2) where I1.id = I2.id and I1.dt_cad = max(I2.dt_cad), so it will only search for the highest last-access date line.

  • Edson thanks for the answer. I kind of understood what you meant but can you give me an ex from Inner Join? I couldn’t do it here. I appreciate

0

And if you do so:

SELECT 
   **i.id_face,** 
   i.id_imagens as id_imagens, 
   date_format(i.dt_cad, '%d/%m/%Y \à\s %H:%i') as hora_visita, 
   df.nome, 
   df.email, 
   df.cpf,  
   df.celular, 
   date_format(df.data_nasc, '%d/%m/%Y') as idade, 
   CASE 
      WHEN df.idade_api < 15 THEN '14 ou -' 
      WHEN df.idade_api BETWEEN 15 AND 24 THEN '15 a 24' 
      WHEN df.idade_api BETWEEN 25 AND 34 THEN '25 a 34' 
      WHEN df.idade_api BETWEEN 35 AND 44 THEN '35 a 44' 
      WHEN df.idade_api BETWEEN 45 AND 54 THEN '45 a 54' 
      WHEN df.idade_api >= 55 THEN '55 ou +' END as idade_api, 
   CASE 
      WHEN df.genero_api ='f' THEN 'Feminino' ELSE 'Masculino' END as genero, 
   g.nome as grupo_nome, 
   l.nome_loja 
FROM imagens i 
JOIN dados_face df on i.id_face=df.id_face 
JOIN loja l on df.id_loja=l.id_loja 
LEFT JOIN grupo g on g.id_grupo = df.id_grupo 
JOIN iot on iot.id_user='1' WHERE i.dt_cad BETWEEN '2018-09-30 00:00:00' AND '2018-12-04 23:59:59.999999' 
AND df.id_grupo like '%' 
AND l.id_user='1' 
AND i.id_iot in (SELECT id_iot from iot where id_user='1') 
**GROUP BY 
   i.id_face, 
   i.id_imagens, 
   date_format(i.dt_cad, '%d/%m/%Y \à\s %H:%i'), 
   df.nome, 
   df.email, 
   df.cpf,  
   df.celular, 
   date_format(df.data_nasc, '%d/%m/%Y'),    
   l.nome_loja** 
order by i.dt_cad DESC limit 0,24
  • still repeats: https://prnt.sc/lqki5c

Browser other questions tagged

You are not signed in. Login or sign up in order to post.