Python - Transforming line ranges into columns

Asked

Viewed 3,941 times

0

I have a Dataframe with a column and 5,000 rows.

What I need to do is take 48-line intervals and transpose to 48 columns with the entire dataframe

How are you:

 1. 0.35
 2. 0.21
 3. 0.45
 4. ...
 5. 0.12
 6. 0.34 

How I need:

[0.35 0.21 0.45 ... 0.12 0.34]
  • And what have you tried to do? What was the result obtained?

1 answer

1

I made a code that can be changed to various amounts of rows to be transposed into columns. First, I created a DataFrame with 40 elements (1, 2, ..., 40) in a column and stored its dimensions:

import pandas as pd
dados = list(range(1, 41, 1))
dados= pd.DataFrame(dados)
nlinhas,ncols=dados.shape

Result of print(dados.head()):

inserir a descrição da imagem aqui

Then set how many columns the DataFrame with the data transposed should have. In my example, I chose 6, in yours would be 48. Based on the number of elements (number of rows) and number of columns, I create a list of the initial indices of each row in the new DataFrame.

# número de colunas no DataFrame a ser criado (transposto):
ncolunas = 6

lista = list(range(0, nlinhas,ncolunas))

The DataFrame.append() joins the data based on the column name. Therefore, in order for the data to be aligned, it must always have columns with the same name. A DataFrame empty is created to store transposed data and columns are named (C0, C1, ..., C5):

colnomes=["C"+str(i) for i in range(0, ncolunas)]
dados_transpostos = pd.DataFrame(columns=colnomes)

Finally a for loop iterates over the values of the list of indexes, transposes the columns of interest, checks if there is the expected amount of columns - if there are none, adds -, renames the columns and appends to the DataFrame created to store transposed data.

for i in lista:
    df=dados[i:i+ncolunas].transpose()
    if df.shape[1] != len(colnomes):
        addcol = len(colnomes)-df.shape[1]
        df = pd.concat([df, pd.DataFrame(columns=['B'+str(i) for i in range(0, addcol)])])
    df.columns = dados_transpostos.columns.values
    dados_transpostos= dados_transpostos.append(df, ignore_index = True)

print(dados_transpostos)

Upshot:

inserir a descrição da imagem aqui

  • Thank you, Marina. In my code, he was add everything in a single line, but with his example helped me a lot.

  • If I correctly imagine your code, at each iteration 48 lines were selected and transposed. When transposing, the "name" (index/index) of the line became 0. If every time you transpose, the index is the same, the new columns will only contain elements in row 0. In the code I proposed, the term ignore_index = True causes each line to be renamed when included in DataFrame. If this term is omitted, all rows are set to 0. Since columns have the same name, data is added to the created columns. If you did not name them, the data would be on a single line

Browser other questions tagged

You are not signed in. Login or sign up in order to post.