How to turn a list into an array

Viewed 1,413 times

1

I have a CSV file with two columns: one holds a list of 200 numbers (0.3, 0.4, 0.8, ...) and the other a name. The file has 50 rows like this. I would like to turn this column of number lists into a single array of shape 50x200.

I tried the following, but it did not work:

import numpy as np
import pandas as pd

dataset = pd.read_csv('exemplo.csv', sep=',')
n = []
p = []
# collect the names
for row in dataset["nome"]:
    print(row)
    p.append(row)
# collect the number lists
for row in dataset["numeros"]:
    print(row)
    n.append(row)
nome = np.array(p)
numeros = np.array(n)

2 answers

1

EDIT: The previous version of this answer used pandas' explode method. That is also a possible solution, but the one below, using tolist, is much simpler.

You can put these lists in a DataFrame and expand them with the tolist method. Here is a reproducible example:

import string
from itertools import product
from random import choice, sample

import pandas as pd

# dictionary holding 50 lists, each with the numbers 0 to 199
my_dict = {'lista_{}'.format(k): list(range(0, 200)) for k in range(50)}

# the "numeros" variable: a list of 50 lists
numeros = [my_dict['lista_{}'.format(m)] for m in range(50)]

alpha = string.ascii_lowercase

# the "nomes" variable: 50 names of 4 random letters each
nomes = [choice(alpha) + choice(alpha) + choice(alpha) + choice(alpha) for m in range(50)]

# build the DataFrame
df = pd.DataFrame({'Nomes': nomes, 'numeros': numeros})

# names for the 200 new columns (3 random letters each), sampled without
# replacement to avoid duplicate column names
new_columns = [''.join(t) for t in sample(list(product(alpha, repeat=3)), 200)]

# expand each list into 200 columns, then drop the original list column
df[new_columns] = pd.DataFrame(df.numeros.values.tolist())
df = df.drop(columns='numeros')

df

Output:

    Nomes   sij doz rwg ubn wbk bxp qzr wsz vmz ... iky crq sdh dbb oqq rnq tib rek ygj tao
0   ljmr    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
1   clay    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
2   rbue    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
3   zlsa    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
4   aetx    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
5   pgav    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
6   cxgb    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
7   wcpg    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
8   fadw    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
9   sqzo    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
10  hysc    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199
11  fqnp    0   1   2   3   4   5   6   7   8   ... 190 191 192 193 194 195 196 197 198 199

Confirming the desired shape:

df.iloc[:,1:].shape

Output:

(50, 200)
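
If a NumPy array is wanted at the end rather than a DataFrame, the numeric columns can be converted with to_numpy. A minimal sketch on a smaller frame (3 rows of 4 numbers; the names and column labels here are made up for illustration):

```python
import pandas as pd

# small stand-in for the 50x200 case; names and column labels are illustrative
nomes = ['aaaa', 'bbbb', 'cccc']
numeros = [list(range(4)) for _ in range(3)]

expanded = pd.DataFrame(numeros, columns=['c0', 'c1', 'c2', 'c3'])
expanded.insert(0, 'Nomes', nomes)

# every column after the first holds numbers
arr = expanded.iloc[:, 1:].to_numpy()
print(arr.shape)  # (3, 4)
```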

1


Given a DataFrame, for example:

import pandas as pd
import numpy as np

df = pd.DataFrame(data = {'listas':[[1,2],[3,4],[5,6]],'nomes':['nome 1','nome 2','nome 3']})

>>> df
   listas   nomes
0  [1, 2]  nome 1
1  [3, 4]  nome 2
2  [5, 6]  nome 3

We can get the "listas" column as an array as follows:

>>> df['listas'].apply(pd.Series).values
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int64)

What I did was split the column of lists into several columns, one per list element, with .apply(pd.Series), and then take the underlying array with .values.

Or do it directly: df['listas'].tolist() returns a list of lists, so just turn it into an array with np.array(df['listas'].tolist()).
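
As a minimal sketch of the direct route, the snippet below builds a small DataFrame like the one above and converts the column of lists into a 2-D array. The second half rests on an assumption that the question does not confirm: that after pd.read_csv the cells arrive as text such as "[0.3, 0.4]" rather than as real lists. In that case each cell would need to be parsed first, here with ast.literal_eval.

```python
import ast

import numpy as np
import pandas as pd

df = pd.DataFrame({'listas': [[1, 2], [3, 4], [5, 6]],
                   'nomes': ['nome 1', 'nome 2', 'nome 3']})

# list of lists -> 2-D array, one row per DataFrame row
arr = np.array(df['listas'].tolist())
print(arr.shape)  # (3, 2)

# Assumption: cells read from a CSV may be strings, not lists
df_text = pd.DataFrame({'listas': ['[0.3, 0.4]', '[0.5, 0.6]', '[0.7, 0.8]']})
parsed = df_text['listas'].apply(ast.literal_eval)  # string -> Python list
arr_text = np.array(parsed.tolist())
print(arr_text.shape)  # (3, 2)
```

If the text in the real file uses a different layout (for example, no brackets), the parsing step would have to change accordingly.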
