Problem with LEFT JOIN using Pentaho Kettle

Asked

Viewed 541 times

5

There is a behavior with respect to JOIN that I cannot understand.

I have two tables in these formats

Table Doente

 ID_Doente Doente 
 1          Pedro
 2          Paulo
 3          Rui

Table query

 ID_Consulta  ID_Doente Tipo
 1            2        Não Medica
 2            2        Médica

And I’m doing a left Join like this:

SELECT *
FROM Consulta
LEFT JOIN Doente
ON Consulta.ID_Doente =Doente.ID_Doente;

I was under the impression that the result would be something like this

ID_Consulta  ID_Doente    Tipo         Doente
 1             2        Não Medica    Paulo
 2             2        Médica        Paulo

But yet I’m having this

ID_Consulta  ID_Doente    Tipo         Doente
 1             2        Não Medica    Paulo
 2             2        Médica        NULL

Does anyone know the reason?

EDIT: This is an example of what’s going on, I’m doing this using the Pentaho Kettle’s merge left Join feature

EDIT2: Question has been answered here https://stackoverflow.com/questions/22459143/pentaho-kettle-weird-left-join-results

  • 3

    What database are you using? I couldn’t play in Mysql nor in Postgresql nor in SQL Server nor in the Sqlite.

  • Yeah, maybe that’s the problem, I’m performing this left Join on the tables in the Pentaho Kettle.

  • I don’t see how it is possible to have this kind of resultset given the table structure and queries. Both seem ok. You tried to run the query directly in the database to see what happens?

  • It’s not possible because I’m using Kettle as an intermediary to perform joins on Mongodb, which, if I’m not mistaken, can’t stand joins. But anyway use Mongodb should not be the problem because Kettle puts the data in SQL tables after some transformations.

  • What happens if you make one FULL JOIN or CROSS JOIN? I don’t think so, but maybe the problem is in some of the previous stages of transformation that Pentaho does in its data before crossing them. (I don’t have much experience with Pentaho, so I’m sorry if I’m talking nonsense...)

  • I’m testing full Join right now, but I doubt that the problem is in the previous steps, because for example in this example it does Join in one of the cases with ID 40 but in the others for some reason it does not http://i.imgur.com/Vqbf2e.png

  • I made a short test in Sql Server and the result is as expected. Is it because you are not aliasing the tables ?

  • At least in Mysql this works well. Try to mount an sqlfiddle that plays the problem. http://sqlfiddle.com/#! 2/1e04d/1

  • @user3323032 I also want to remind that, unless there are actually consultations without a patient, there is no reason to use LEFT JOIN. The LEFT JOIN is specific to special cases where you want to list relationship between tables where not all indexes cross match.

  • It seems that the question has already been answered in SOE (link). The problem was that the questioner forgot to sort the data before making a merge Join.

Show 5 more comments

2 answers

2

I tested your SQL on Sqlserver and it worked normally:

inserir a descrição da imagem aqui

The problem must be in the engine of the Pentaho Kettle.

The following code is the same, but try to see if it works, because it must be a structural problem in Pentaho.

SELECT *
FROM Doente 
RIGHT JOIN Consulta
ON Consulta.ID_Doente =Doente.ID_Doente;

0

Hi, how are you? I believe you may be using a Join without adding the Sort component or unexpected results may come out.

See what quotes the Kettle documentation. Documentação Pentaho

When using a JOIN in Pentaho add a Step Sort with the Ids link so that the Pentaho can classify the results before joining, so it will be able to classify without major problems

Step Join com Sort

I hope I’ve helped.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.