How to create a DataFrame in pandas from a Series with dictionaries?

Using Python 3 and pandas, I have a Series in which each row holds a list with dictionaries inside. It was obtained from a file:

import pandas as pd

geral = pd.read_csv("mandados_12_abr_2018_RJ.csv",sep=';',encoding = 'latin_1')

geral.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5298 entries, 0 to 5297
Data columns (total 4 columns):
mandados     5298 non-null object
mensagem     0 non-null float64
paginador    5298 non-null object
sucesso      5298 non-null bool
dtypes: bool(1), float64(1), object(2)
memory usage: 129.4+ KB

mandados = geral['mandados']

mandados.reset_index().head()
index   mandados
0   0   [{'id': 409, 'numeroMandado': '2251-65.2012.8....
1   1   [{'id': 486, 'numeroMandado': '358208-13.2011....
2   2   [{'id': 100, 'numeroMandado': '2274-09.2012.8....
3   3   [{'id': 1676, 'numeroMandado': '26782-22.2012....
4   4   [{'id': 1973, 'numeroMandado': '1664656-97.201...

Example of the content of one row:

   [{'nomeParte': 'ANDRE LUIZ DE ALMEIDA', 'orgao': 'TJRJ', 
    'numeroMandado': '450429-49.2010.8.19.0001.0002', 'dataMandado': 
'2011-04-25', 'situacao': 'Aguardando Cumprimento', 'id': 4488922, 
'detalhes': ['Sexo: Masculino', 'Nome do Genitor: Jorge Carlos De 
Almeida', 'Nome da Genitora: Maria Alice Menezes', 'Nacionalidade: 
Brasileira', 'Data de nascimento: 15/11/1974', 'Carteira de identidade: 
099009730']}], 'paginador': {'mostrarProximaPagina': True, 'ultimaPagina': 
5280, 'mostrarPaginaAnterior': True, 'paginaAtual': 5278, 
'registrosPorPagina': 10, 'totalPaginas': 5300, 'primeiraPagina': 5276, 
'mostrarPaginador': True, 'totalRegistros': 52998}, 'mensagem': None}]

I want to create a DataFrame in which each dictionary from the Series becomes a row, with the following columns:

nomeParte, orgao, numeroMandado, dataMandado, situacao and detalhes

Is it possible to do this?

  • Which 'keys' from detalhes do you want? They are not constant between the dicts.

  • Thank you. Father's name (Nome do Genitor), mother's name (Nome da Genitora), date of birth and ID card.

  • Hello Reinaldo, I believe the problem starts at the root; I only realized what you really wanted when I saw this question. I will delete my answer here, and I think you could delete this question as well, because the doubt should go away once the problem is solved from the start, which is the best solution: https://answall.com/questions/291032/em-raspagens-grandes-howto avoid connectionerror/291040#291040

1 answer

Using the example you gave me:

lista = [{'nomeParte': 'ANDRE LUIZ DE ALMEIDA',
      'orgao': 'TJRJ', 
      'numeroMandado': '450429-49.2010.8.19.0001.0002',
      'dataMandado': '2011-04-25', 'situacao': 'Aguardando Cumprimento', 'id':4488922, 
      'detalhes': ['Sexo: Masculino',
                   'Nome do Genitor: Jorge Carlos De Almeida',
                   'Nome da Genitora: Maria Alice Menezes', 
                   'Nacionalidade: Brasileira',
                   'Data de nascimento: 15/11/1974',
                   'Carteira de identidade: 099009730']}]

To turn this into a DataFrame, just call pd.DataFrame(lista):

import pandas as pd
b = pd.DataFrame(lista)

This call turns your list of dictionaries into rows and columns: each dictionary becomes a row and each key becomes a column.
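To apply the same idea to the whole Series rather than a single list, something like the sketch below may work. It assumes that, because the column came from read_csv, each cell is the text of a list rather than an actual Python list (the question does not show the cell types, so this is a guess). The sample Series and the names parsed and registros are made up for illustration; with the real data you would use geral['mandados'] instead.

```python
import ast
import pandas as pd

# Hypothetical stand-in for geral['mandados']: each cell is the *string*
# form of a list of dicts, which is what read_csv would normally return.
mandados = pd.Series([
    "[{'id': 409, 'nomeParte': 'A', 'orgao': 'TJRJ'}]",
    "[{'id': 486, 'nomeParte': 'B', 'orgao': 'TJRJ'},"
    " {'id': 487, 'nomeParte': 'C', 'orgao': 'TJRJ'}]",
])

# Parse each string into a real Python list of dicts.
# ast.literal_eval only evaluates literals, so it is safer than eval.
parsed = mandados.apply(ast.literal_eval)

# Flatten: one dict per record, regardless of which row it came from.
registros = [d for lista in parsed for d in lista]

# Each dict becomes a row, each key a column.
df = pd.DataFrame(registros)
print(df)
```

If the cells are already lists, the ast.literal_eval step can be skipped. Rows whose text is not a valid Python literal would raise an error here and would need to be handled separately.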

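For the fields mentioned in the comments (father's name, mother's name, date of birth and ID card), one possible approach is to split each 'key: value' string in detalhes and keep only the wanted keys. This is a sketch using the record from the question; the helper detalhes_to_dict and the names wanted and resultado are hypothetical, and it assumes every useful item in detalhes has the form 'key: value'.

```python
import pandas as pd

# Record taken from the example in the question (other fields omitted).
df = pd.DataFrame([{
    'nomeParte': 'ANDRE LUIZ DE ALMEIDA',
    'detalhes': ['Sexo: Masculino',
                 'Nome do Genitor: Jorge Carlos De Almeida',
                 'Nome da Genitora: Maria Alice Menezes',
                 'Nacionalidade: Brasileira',
                 'Data de nascimento: 15/11/1974',
                 'Carteira de identidade: 099009730'],
}])

def detalhes_to_dict(itens):
    """Turn ['key: value', ...] into {'key': 'value', ...}.

    Items without a ': ' separator are skipped.
    """
    pares = (item.split(': ', 1) for item in itens)
    return {p[0]: p[1] for p in pares if len(p) == 2}

# One column per key found in detalhes, aligned with the original index.
detalhes = pd.DataFrame(df['detalhes'].apply(detalhes_to_dict).tolist(),
                        index=df.index)

# Keep only the wanted keys; reindex fills NaN when a key is missing,
# since the keys are not constant between the dicts.
wanted = ['Nome do Genitor', 'Nome da Genitora',
          'Data de nascimento', 'Carteira de identidade']
resultado = df.drop(columns='detalhes').join(detalhes.reindex(columns=wanted))
print(resultado)
```

The values stay as strings, so the ID card keeps its leading zero; the date could be converted afterwards with pd.to_datetime if needed.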
