Read a JSON file and print the data in tabular format


The file is "grades.json", and its structure is:

    {"students": [{"name": "Alan", "lastname": "Silva", "exam1": 50, "exam2": 80, "exam3": 91},
    {"name": "Paula", "lastname": "Souza", "exam1": 95, "exam2": 98, "exam3": 99}]
    }

Objective: read the file "grades.json" and show the data in tabular format, with an additional column to the right of the exam grades holding each student's average, and an additional row holding the class average for each exam.

I tried to use pandas, but all I managed to do was read the JSON:

    import pandas as pd
    import numpy as np

    dt = pd.read_json("grades.json")
    print(dt)

Any ideas?

1 answer


Your JSON is semi-structured: the records are nested under the "students" key, so its layout does not match any of the formats produced by DataFrame.to_json(), which are the ones pandas.read_json() expects. In this case the recommended approach is to build the DataFrame with pandas.json_normalize(), which flattens semi-structured data into a table.
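To make the point about semi-structured data concrete, here is a small sketch comparing the two readers on the same content. The JSON is inlined as a string (the name `raw` is only for illustration) so the snippet runs without the file:

```python
import io
import json

import pandas as pd

# Same content as grades.json, inlined for the example (assumption: the
# real file has exactly this shape).
raw = (
    '{"students": ['
    '{"name": "Alan", "lastname": "Silva", "exam1": 50, "exam2": 80, "exam3": 91},'
    '{"name": "Paula", "lastname": "Souza", "exam1": 95, "exam2": 98, "exam3": 99}'
    ']}'
)

# read_json keeps the nesting: one column named "students",
# and each cell holds a whole record as a dict.
nested = pd.read_json(io.StringIO(raw))
nested_columns = nested.columns.tolist()
first_cell = nested.loc[0, "students"]

# json_normalize flattens the records under "students" into real columns.
flat = pd.json_normalize(json.loads(raw), "students")
flat_columns = flat.columns.tolist()

print(nested_columns)
print(flat_columns)
```

With the nested frame there is no "exam1" column to average, which is why the flattening step is needed first.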

To calculate each student's average, use the function numpy.average(). To create the new column, use DataFrame.apply(), which applies a function along one axis of the DataFrame; here it is applied row by row (axis=1), taking the three exam columns as input.

    import json  # Needed to decode the JSON
    import pandas as pd
    import numpy as np

    # Open the file grades.json
    with open('grades.json') as grades:
        dados = json.load(grades)  # Decode the data

    # Normalize the data under the "students" key
    dt = pd.json_normalize(dados, 'students')
    # Create a new column with the average of the three exams
    dt['média'] = dt.apply(lambda x: np.average([x['exam1'], x['exam2'], x['exam3']]), axis=1)

    print(dt)

Result:

        name lastname  exam1  exam2  exam3      média
    0   Alan    Silva     50     80     91  73.666667
    1  Paula    Souza     95     98     99  97.333333
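The question also asks for a final row with the class average for each exam. One possible way to add it is sketched below. The data is inlined so the snippet runs on its own, the per-student average uses the vectorized `mean(axis=1)` instead of `apply`, and the names `score_cols`, `summary`, `table` and the "Class average" label are only illustrative choices:

```python
import pandas as pd

# Same content as grades.json, inlined for the example.
data = {
    "students": [
        {"name": "Alan", "lastname": "Silva", "exam1": 50, "exam2": 80, "exam3": 91},
        {"name": "Paula", "lastname": "Souza", "exam1": 95, "exam2": 98, "exam3": 99},
    ]
}

dt = pd.json_normalize(data, "students")

exams = ["exam1", "exam2", "exam3"]
# Per-student average across the exam columns (row-wise mean).
dt["média"] = dt[exams].mean(axis=1)

# Column-wise mean of every numeric column, packed into a one-row frame.
score_cols = exams + ["média"]
summary = pd.DataFrame(
    [{"name": "Class average", "lastname": "", **dt[score_cols].mean().to_dict()}]
)

# Append the summary row below the students.
table = pd.concat([dt, summary], ignore_index=True)

print(table.to_string(index=False))
```

Note that the numeric columns become floats once the averages are appended, so the exam grades print with a decimal part.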

Running on Repl.it: https://repl.it/repls/CoordinatedSparseProcedures

  • What do you mean by semi-structured?

  • It's stated in the first line of the answer: the data in your JSON is semi-structured, meaning its format is not compatible with the formats returned by the DataFrame.to_json() method.

  • @Augusto Vasques, can you help me with this one? https://answall.com/questions/443657/extraindo-as-palavras-de-um-texto-longo-e-criando-estat%C3
