Do not repeat records in a Join

Asked

Viewed 8,033 times

3

I have two tables: one with 1009 records and another 11949. When I do a Join, I have 11949 records, of which 1009 table records are repeated several times. I need to select with Join, but bring only the amount of records contained in the smallest table, because everything that exists in the smallest table exists in the largest. Down with my select:

select er.id_exm_rea 
from t_cmo_Exame_Realizado er 
inner join t_cmo_planilha_leitura pl on er.id_xfc = pl.id_xfc

t_cmo_worksheet -> Smaller table(1009 records)

t_cmo_Exame_Realized -> Larger Table(11949 records)

  • You want all the information that is on the table t_cmo_worksheet reading plus the information of qnts times it appears in the table t_cmo_Exame_Realizado, that’s it?

  • I’d just like to know why I return 11,000 records and not 1009?

  • Pq qnd vc performs the Join of one table with the other will always bring all the lines that were connected by Join and in your case, all 11949 records met the condition of your Join and so were returned.

3 answers

3

A simple distinct will solve:

SELECT DISTINCT ER.ID_EXM_REA FROM T_CMO_EXAME_REALIZADO ER     
INNER JOIN T_CMO_PLANILHA_LEITURA PL 
  ON ER.ID_XFC = PL.ID_XF

The keyword DISTINCT eliminates duplicate lines from the results of an instruction select.
If not specified distinct, all lines will be returned (including duplicates).

In your case, if you select all the id_exm_rea in t_cmo_Exame_Realizado without DISTINCT, 11949 lines will be returned.

  • I have already done with distinct. A larger amount is coming that should come. This select is to return records and use in an Insert. But I will walk some more. See I need 1009 records or less and no more.

  • It is important to note that the distinct depends on the fields in its select

  • I put another field on the Join ON and it seems to me that solved.

  • I would assemble an example indicating that also without a consistent Join, distinct does not work,but gave my time here at the firm.

  • Ismael, relax. Tomorrow, I know how it is. Anyway, I thank you all and I will continue.

2

Try this:

SELECT er.id_exm_rea
FROM t_cmo_Exame_Realizado er 
WHERE er.id_exm_rea IN (
    SELECT pl.id_xfc FROM t_cmo_planilha_leitura pl
)

1

You get more records than the amount present in the smallest table because of the following axiom (which I am deducing):

For each element present in the smaller table, there are N elements in the larger table, where N varies from zero to infinity.

That is, your data mass for your work has the minimum size dictated by the size of the smaller table, and the maximum dictated by the size of the larger table.

Any reduction in the mass of data you make so that your search has the smallest table size will likely result in data disposal.

Therefore, if you want to ensure that the search will not have more records than the smaller table, you should:

  • Consider only the smallest table in the search;
  • Discard data from the larger table, according to some criteria.

The clause distinct eliminates repetitions in the final search result. According to comments in Ishmael’s reply, you have already tried this, and yet your case has not been solved. Thus, I believe that its objective will not be achieved only with the amount of records of the smaller table. I think you need to review the requirement you’re meeting.

  • Renan, improving the Join, putting another field, solved at least apparently need to consist, but I think it solved.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.