SELECT DISTINCT in JSON column in postgresql

Asked

Viewed 532 times

0

I have a table that has 2 columns, id (int auto_increment) and Resp (json).

In this table I have duplicate values in the column Resp, something like that:

id | resp
--------------------------------------------
1  | {"nome": "Jonh Doe", "idade": 20} 
2  | {"nome": "Jonh Doe", "idade": 20}
3  | {"nome": "Maria Claire", "idade": 38} 
4  | {"nome": "James Connor", "idade": 50} 

I’d just need to make a famous and traditional SELECT DISTINCT resp FROM tabela, however on account of the column resp be the type JSON, I cannot do DISTINCT, Postgresql returns the following message:

ERROR:  could not identify an equality operator for type json

I have already researched some solutions and tried the following ones, but they did not work:

Attempt 1:

SELECT
  DISTINCT field
FROM (
  SELECT jsonb_object_keys(resp) AS field
  FROM public.tabela
    ) AS subquery
--------------
ERROR: function jsonb_object_keys(json) does not exist

Attempt 2:

SELECT
  DISTINCT field
FROM (
  SELECT jsonb_array_elements(resp) AS field
  FROM public.tabela
    ) AS subquery
--------------
ERROR: function jsonb_array_elements(json) does not exist

Attempt 3:

SELECT
  DISTINCT field
FROM (
  SELECT json_array_elements(resp) AS field
  FROM public.tabela
    ) AS subquery
--------------
ERROR: could not identify an equality operator for type json

1 answer

1


In the postgresql you can YES group by columns of the json, you only need to specify the columns that should serve for grouping. In this post (en) you can see how; your consultation would look like this:

select val->>'nome' as field, val->>'idade' as age, count(*) as total 
from tabelateste 
group by val->>'nome', val->>'idade'

Look here one fiddle working.

  • It really worked @rLinhares, but what if I don’t have the same items in JSON? There are JSON that have 5 items and others that have 20 items. Example: "name, age, sex, dating, rg, Cpf" and other JSON that have only "name, age"

  • Either way you will specify the fields.. if enough nome and idade, He’ll compare just for them.. if they do not exist, I believe they should be considered as different, that what happens in comparisons that do not exist (to test it is only to go there in the fiddle and delete the age of a "John Doe", for example);

  • @Did Watson work out? He managed to solve??

  • 1

    I did a test with some fields and until then it worked, so less I will try to do with all possible fields of JSON and return here with a response. I appreciate your willingness to assist me.

  • @D.Watson any feedback? the ideal is that the question does not go unanswered (accept). If you need some improvement, gives a hello..

  • 1

    Hello @rLinhares I apologize for not answering before. I needed to do as follows: I created a SELECT with a ROW_NUMBER() OVER() and I partitioned for about 15 obligatory columns, so I could identify the distinct cases of the repeated ones, knowing what they are (in this case, it would be the cases with ROW_NUMBER = 1) I inserted these in another auxiliary table, I dropped the main and created again and turned the data back to main. That way I was able to filter only the distinct ones. But for ROW_NUMBER() I needed to indicate column by column as you put in the answer.

Show 1 more comment

Browser other questions tagged

You are not signed in. Login or sign up in order to post.