Turn row into column

Question

Turn row into column

Asked 4 years, 9 months ago

Viewed 467 times

2

Good night, I’m having a problem of putting together more than 500 csv files, where I need a very important data that is in a row (A8), but it needs to turn column (like this in the second image), I’m trying to use the function iloc, but it’s not working, to join all the spreadsheets I am using this code, but before I need the dataframe to be like in the second photo, since I thank you for the help of all and for the suggestions:

path = r'C:/PJS_PY/HMR_DTS/DTS_BSE'

all_files = glob.glob(path + "/*.csv")

li = []

for filename in all_files:
    df = pd.read_csv(filename, sep=';', skiprows=range(0,8))
    li.append(df)

frame = pd.concat(li, axis=0, ignore_index=True)

I would suggest changing the title of the question slightly since, as it stands, it suggests that your problem is much more general than it really is. By the way, if someone comes here to know how to transpose a pandas dataframe, just do df.T

– Lucas

2020/11/09 at 13:20

1 answer

Browser other questions tagged python python-3.x pandas

You are not signed in. Login or sign up in order to post.

by Lucas • **3,858** points · Answer 1 · 2020-11-09T01:52:03+00:00

A possible solution is to open the file again and select only the information you want. With this information, you create a new column.

Since you didn’t present a replicable example, I created a dummy database that replicates your problem. So, suppose we have a csv file with that face:

k,19090
med,35000
gui, 23199
jkjd, 98082
jksjkjs, 890001
jksjkjs, 890001
gui, 23199
medidor, 18900005
name, idade, cor, sexo, income
Maria, 29, white, woman, 3000
João, 32, black, man, 2500

An implementation of this solution that I suggested would be:

import pandas as pd
import re

all_files= ["data.csv"]
li=[]

for filename in all_files:
        df = pd.read_csv(filename, sep=';', skiprows=range(0,8))
        f = open(filename, "r")
        for index, line in enumerate(f.readlines()):
            # pega especificamente a informação da linha 8:
            if index==7:
                # Como você quer só os números, 
                # aplique um regex para pegar só o código numérico
                numbers = re.findall(r"[\d+]", line)
                medidor_number = ''.join(numbers)
                medidor_var = [medidor_number]*len(df)
                df = pd.DataFrame({"Medidor": medidor_var}).join(df)
                #no seu caso, que tem vários arquivos,
                #faça o apensamento apenas aqui
                #li.append(df)
        f.close()

print(df)

Ouput:

    Medidor name, idade, cor, sexo, income
0  18900005  Maria, 29, white, woman, 3000
1  18900005     João, 32, black, man, 2500