Looking for a group of duplicate records

Asked

Viewed 75 times

2

I have the table below:

declare @order table (idOrder int, altcode varchar(17) )

insert into @order values (1,'11111111111')   
insert into @order values (1,'22222222222')    
insert into @order values (1,'33333333333')  
insert into @order values (1,'44444444444')  

insert into @order values (2,'72374949491')  
insert into @order values (2, '11111111111')  
insert into @order values (2,'66194701273')   
insert into @order values (2,'22222222222')  
insert into @order values (2,'33333333333') 

insert into @order values (3,'87548758844')  
insert into @order values (3, '11111111111')  
insert into @order values (3,'734821919192')  
insert into @order values (3,'22222222222')  
insert into @order values (3,'33333333333')       
insert into @order values (3,'74912817837')  
insert into @order values (3,'67198191782') 

insert into @order values (4,'77777777777')  
insert into @order values (4, '11111111111')  
insert into @order values (4,'771904749189')  
insert into @order values (4,'33333333333')  
insert into @order values (4,'88888888888')    
insert into @order values (4,'99999999999')  

insert into @order values (5,'88888888888')  
insert into @order values (5,'781843741091')  
insert into @order values (5,'33333333333')  
insert into @order values (5,'99999999999')  
insert into @order values (5,'77777777777')      

I need to write a query to find all the orders and altcodes that any combination of 3 altcodes repeat in different orders.

For example: A query that works with the above table should find this:

O grupo de altcodes:
'11111111111'
'22222222222'
'33333333333'
são encontrados nas orders: 1, 2, 3

O grupo de altcodes:
'77777777777'
'88888888888'
'99999999999'
são encontrados nas orders: 4,5 

O grupo de altcodes:
'33333333333'
'88888888888'
'99999999999'
são encontrados nas orders: 4,5 

O grupo de altcodes:
'33333333333',
'77777777777',
'88888888888'
são encontrados nas orders: 4,5 

O grupo de altcodes:
'33333333333',
'77777777777',
'99999999999'
são encontrados nas orders: 4,5 

I’m having a hard time thinking of a query or algorithm with a cursor to solve this. I think there must be some easy command to solve this, but I can’t see it.

  • 1

    Welcome to the Stackoverflow in Portuguese. As the name suggests, the official language used here is Portuguese. So, could you please translate your question? If you prefer, you can also ask the same question on Stackoverflow website in English.

2 answers

3


You can make a query which counts occurrences in the table where the reference is within the proposed values, using the clause COUNT. After that just use the clause HAVING to check if the number of occurrences is equal to the number of references, for example:

SELECT o.idOrder
  FROM @order o
 WHERE o.altcode IN ('77777777777', '88888888888', '99999999999') 
 GROUP BY o.idOrder
HAVING COUNT(1) = 3

Which will result in 4 and 5, or:

SELECT o.idOrder
  FROM @order o
 WHERE o.altcode IN ('11111111111', '22222222222', '33333333333') 
 GROUP BY o.idOrder
HAVING COUNT(1) = 3

Which will result in 1, 2 and 3.


In the example below I created the table ocorrencias to replace @order. Create the function based on query above:

CREATE FUNCTION buscar_ocorrencias(@altcode1 VARCHAR(17),
                                   @altcode2 VARCHAR(17),
                                   @altcode3 VARCHAR(17))
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @lista NVARCHAR(MAX);

    SELECT @lista = ISNULL(@lista + ', ', '') +  CAST(o.idOrder AS VARCHAR)
      FROM ocorrencia o
     WHERE o.altcode IN (@altcode1, @altcode2, @altcode3) 
     GROUP BY o.idOrder
     HAVING COUNT(1) = 3;

    RETURN @lista;
END;

And then use it as follows:

WITH posicao AS (
  SELECT o.altcode,
         ROW_NUMBER() OVER(ORDER BY o.altcode) AS posicao
    FROM ocorrencia o
   GROUP BY o.altcode
), combinacao AS (
  SELECT p1.altcode AS altcode1,
         p2.altcode AS altcode2,
         p3.altcode AS altcode3,
         dbo.buscar_ocorrencias(p1.altcode, p2.altcode, p3.altcode) AS ocorrencias
    FROM posicao p1
   INNER JOIN posicao p2
      ON p2.posicao > p1.posicao
   INNER JOIN posicao p3
      ON p3.posicao > p2.posicao
)
SELECT c.altcode1,
       c.altcode2,
       c.altcode3,
       ocorrencias
  FROM combinacao c
 WHERE ocorrencias IS NOT NULL

At first cte we assign a position to each single record, while in the second we link each record to its subsequent ones, thus generating the combination (without repeating) of all lines. Within this combination we perform the consultation in the function cited above. With this result we just need to filter the non-zero results, that is, we have groups in common.

Observing: As of 2017 version of SQL Server you can use the function STRING_AGG instead of creating the function buscar_ocorrencias.

STRING_AGG

Concatenate the values of the string expressions and place the separator values between them. Separator is not added at the end of the string.

  • The altcodes are dynamic, I cannot specify them, they are thousands. The solution needs to be dynamic. The problem is just this, find any combination of 3 altcodes that repeat in different orders.

  • @skyzoserver you will not inform which you want to know the groups?

  • No, groups are dynamic. Any occurrence of 3 altcodes in different orders must be returned in the search, this is the difficulty.

  • @skyzoserver changed to work the way you asked, but I must warn you that generating combinations in the conditions you mentioned (thousands of altcodes) should be uncomplicated and will take some time for the query be resolved.

  • Sensational! You’re a myth! It was exactly what I needed!

  • I don’t think I can accept it because you’re saying I have no reputation...

Show 1 more comment

1

It is possible so, maybe it is faster and more practical:

SELECT idOrder, altcode from @ORDER where altcode in (
    SELECT ALTCODE FROM (
            SELECT COUNT(*) AS QT, ALTCODE FROM @ORDER
            GROUP BY ALTCODE 
            ) a WHERE QT > 1 )
    GROUP BY idorder, altcode
ORDER BY idorder
  • But there are the groupings of 3 in 3 that the OP requested

Browser other questions tagged

You are not signed in. Login or sign up in order to post.