Querying a public database


Viewed 289 times

-1

I need to perform queries in an online database, and turn the returned data into a data frame.

I used an existing example from the database website, but I have no idea how to turn its output into a data frame.

The example is similar to this sample:

import urllib.request  # a bare "import urllib" does not reliably expose urllib.request

url = 'http://dados.cvm.gov.br/api/action/datastore_search?resource_id=92741280-58fc-446b-b436-931faaca4fb4&limit=5&q=_id:01'
fileobj = urllib.request.urlopen(url)
read_file = fileobj.read()  # raw response body, as bytes
print(read_file)

And as a result, I got this:

b'{"help": "http://dados.cvm.gov.br/api/3/action/help_show?name=datastore_search", "Success": true, "result": {"resource_id": "92741280-58fc-446b-b436-931faaca4fb4", "Fields": [{"type": "int4", "id": "_id"}, {"type": "text", "id": "CNPJ_FUNDO"}, {"type": "timestamp", "id": "DT_COMPTC"}, {"type": "Numeric", "id": "VL_TOTAL"}, {"type": "Numeric", "id": "VL_QUOTA"}, {"type": "Numeric", "id": "VL_PATRIM_LIQ"}, {"type": "Numeric", "id": "CAPTC_DIA"}, {"type": "Numeric", "id": "RESG_DIA"}, {"type": "Numeric", "id": "NR_COTST"}, {"type": "int8", "id": "_full_count"}, {"type": "float4", "id": "rank"}], "q": "_id=01", "Records": [], "_links": {"start": "/api/action/datastore_search? q=_id%3D01&limit=5&resource_id=92741280-58fc-446b-b436-931faaca4fb4", "next": "/api/action/datastore_search? q=_id%3D01&offset=5&limit=5&resource_id=92741280-58fc-446b-b436-931faaca4fb4"}, "limit": 5}}'

How can I turn this result into a dataframe?

Website of the source: http://dados.cvm.gov.br/dataset/fi-doc-inf_diario/resource/92741280-58fc-446b-b436-931faaca4fb4#embed-f1e82110-9d99-4e9b-9789-6fee7c3efa03

If useful, from what I read on the source website, the data is made available using CKAN.

  • See if this answer helps you.

  • I got a result, but it was not the one I wanted.

  • Post the code you wrote and the result you got, and explain how it differs from what you want (or what is missing).

1 answer

0

If you think of a dataframe as a table, and of a Python dictionary as one too, you will notice that the data you have shown is a set of tables, or a table of tables. So unless you want to represent this data as a table with dictionaries inside it (which is not very useful), you need to decide which tables you would like to extract from this table of tables, and how to split and organize them. With that plan in mind, you can write a script that accesses the dictionary or list for each part of the response and builds a suitable dataframe from each one. pandas lets you create a dataframe from a Python dictionary easily using pandas.DataFrame(seu_dictionary).
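As a minimal sketch of that idea: decode the bytes as JSON, then build one dataframe from the list of records and another from the list of field descriptions. The function name `ckan_response_to_frames` and the two sample rows are made up for illustration. The lowercase keys `result`, `records` and `fields` are the ones CKAN's `datastore_search` normally returns; the output pasted in the question shows them capitalised, so adjust the key names if your response really differs.

```python
import json

import pandas as pd


def ckan_response_to_frames(raw_bytes):
    """Split a datastore_search response into a records table and a fields table.

    Hypothetical helper: assumes the lowercase CKAN keys "result",
    "records" and "fields".
    """
    payload = json.loads(raw_bytes.decode("utf-8"))
    result = payload["result"]
    # Each record is a dict, so a list of records maps directly to rows.
    records = pd.DataFrame(result.get("records", []))
    # The field descriptions form a second, separate table (id and type).
    fields = pd.DataFrame(result.get("fields", []))
    return records, fields


# Made-up response body with two invented rows, only to show the shape.
sample = (
    b'{"success": true, "result": {'
    b'"fields": [{"type": "int4", "id": "_id"},'
    b' {"type": "text", "id": "CNPJ_FUNDO"},'
    b' {"type": "numeric", "id": "VL_QUOTA"}],'
    b'"records": [{"_id": 1, "CNPJ_FUNDO": "00.000.000/0001-00", "VL_QUOTA": "1.5"},'
    b' {"_id": 2, "CNPJ_FUNDO": "00.000.000/0001-00", "VL_QUOTA": "1.6"}]}}'
)

records_df, fields_df = ckan_response_to_frames(sample)
print(records_df)
print(fields_df)
```

With the live API you would pass `read_file` from the question instead of `sample`. Note that the response pasted in the question has an empty records list, so the records dataframe would be empty until the query itself matches some rows.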
