How does list(zip()) work, and can dictionaries be converted to DataFrames?

I am filling some lists with strings and numeric entries and using them to build dictionaries that I need to view in different combinations. This raised a doubt: does the command list(zip()) create dictionaries that I can assign to a variable and access?

Consequently, can I assign a variable to every combination of lists I want? Is it possible to convert these dictionaries into DataFrames?

A=[]
B=[]
C=[]
list(zip(A,B))
list(zip(B,C))
list(zip(A,C))
list(zip(A,B,C))
  • Only one detail: according to the documentation, zip does not return a dictionary but an iterator of tuples. So zip(A, B) returns an iterator that, at each iteration, yields a tuple containing one element of A and one of B. With list(zip(A, B)) you get a list of all these tuples. pandas uses these tuples to create the DataFrame and does not even need the list call; it can consume the zip directly.
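The point made in this comment can be checked with a small sketch. The values in A and B are placeholders chosen for illustration (the question's lists are empty), and the variable names pairs_ab and df_ab are hypothetical.

```python
import pandas as pd

# Placeholder values for illustration; the question's lists were empty.
A = ["x", "y", "z"]
B = [1, 2, 3]

# list(zip(...)) gives a list of tuples, not a dictionary,
# and it can be assigned to a variable like any other object.
pairs_ab = list(zip(A, B))
print(pairs_ab)            # [('x', 1), ('y', 2), ('z', 3)]
print(type(pairs_ab[0]))   # <class 'tuple'>

# pandas can consume the zip iterator directly, without list().
df_ab = pd.DataFrame(zip(A, B), columns=["A", "B"])
print(df_ab)
```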

1 answer

Yes, it is possible to convert dictionaries into DataFrames. In fact, a dictionary with strings as keys and lists as values is a ready-made input for a pandas DataFrame. With a dictionary in that format, just pass it to the pd.DataFrame constructor. Example:

import pandas as pd

data_dict={"Name":["Walter", "Saul", "Hank", "Sjyler"], "Age":[54,56,48,42], "Sex":["M","M","M","F"]}

pd.DataFrame(data_dict)

Output:

     Name  Age Sex
0  Walter   54   M
1    Saul   56   M
2    Hank   48   M
3  Sjyler   42   F

As you noticed, it is also possible to do this by combining the list constructor (list()) with the built-in zip function. However, in this case you need to specify the column names explicitly. Example:

names=["Walter", "Saul", "Hank", "Sjyler"]
age=[54,56,48,42]
sex=["M","M","M","F"]

pd.DataFrame(list(zip(names,age,sex)), columns=["Name", "Age", "Sex"])

Output:

     Name  Age Sex
0  Walter   54   M
1    Saul   56   M
2    Hank   48   M
3  Sjyler   42   F

If you do not specify the column names, the command still works, but the columns get the default integer labels: 0, 1, 2.
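As a quick check of what pandas assigns when the columns argument is omitted, here is a minimal sketch reusing the lists from the example above. The names df_default and default_labels are hypothetical.

```python
import pandas as pd

names = ["Walter", "Saul", "Hank", "Sjyler"]
age = [54, 56, 48, 42]
sex = ["M", "M", "M", "F"]

# No columns argument: pandas labels the columns with integers from 0.
df_default = pd.DataFrame(list(zip(names, age, sex)))
default_labels = list(df_default.columns)
print(default_labels)  # [0, 1, 2]

# The labels can still be set afterwards.
df_default.columns = ["Name", "Age", "Sex"]
print(df_default)
```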

Additional information about the zip function: it pairs up the elements of the lists position by position to create a series of tuples. It returns an iterator, a lazy object that, unlike a list or a list comprehension, does not build all of its results up front but produces them one at a time on demand. Because the output is not stored, such objects use less memory. In your specific case, you only saw the result of zip because you used the list constructor.
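A small sketch of this behaviour, using shortened placeholder lists: the zip object only produces tuples when something iterates over it, and once consumed it is empty. The variable names are hypothetical.

```python
names = ["Walter", "Saul", "Hank"]
age = [54, 56, 48]

zipped = zip(names, age)     # no tuples have been produced yet
first_pass = list(zipped)    # iterating consumes the zip object
second_pass = list(zipped)   # nothing is left on a second pass

print(first_pass)    # [('Walter', 54), ('Saul', 56), ('Hank', 48)]
print(second_pass)   # []

# zip stops as soon as the shortest input runs out.
truncated = list(zip(names, [1, 2]))
print(truncated)     # [('Walter', 1), ('Saul', 2)]
```

If the tuples are needed more than once, keeping the list (first_pass here) rather than the zip object avoids the empty second pass.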

All this to say that, although it is possible, building DataFrames from zip is perhaps not the most common approach. Using dictionaries seems more natural and simpler.
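When the data already lives in separate lists, the dictionary can be built from them directly. The sketch below shows two equivalent ways, one of which uses zip to pair column names with lists. The names column_names, df_from_dict, and df_from_zip are hypothetical.

```python
import pandas as pd

names = ["Walter", "Saul", "Hank", "Sjyler"]
age = [54, 56, 48, 42]
sex = ["M", "M", "M", "F"]

# Dictionary written out by hand: column name -> list of values.
df_from_dict = pd.DataFrame({"Name": names, "Age": age, "Sex": sex})

# Same dictionary built with zip: here zip pairs each column name
# with its whole list, and dict() turns the pairs into key/value items.
column_names = ["Name", "Age", "Sex"]
df_from_zip = pd.DataFrame(dict(zip(column_names, [names, age, sex])))

print(df_from_dict.equals(df_from_zip))  # True
```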

  • Let me ask one last question. When I use list(zip()) to combine the lists, am I restricted to displaying them in the order I chose when laying out the lists? And when using pd.DataFrame(list(zip(...))), can I only display the DataFrame? I ask because I noticed that the generated DataFrame does not show up in the variable console. Is it possible to assign it to a variable that I can manipulate? Since I enter the lists manually, the first method makes more sense, but I wanted to clear up this doubt.

  • In fact, in this case you generate the DataFrame in the same way, with the difference that the column names will be 0, 1, 2. I will edit to make clear that the columns argument is optional.
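On the question of keeping the result for later use: a DataFrame built from list(zip(...)) is an ordinary object, so assigning it to a name makes it available for manipulation. A minimal sketch, assuming the lists from the answer; the names df, reordered, and older are hypothetical.

```python
import pandas as pd

names = ["Walter", "Saul", "Hank", "Sjyler"]
age = [54, 56, 48, 42]
sex = ["M", "M", "M", "F"]

# Assign the DataFrame to a variable instead of only displaying it.
df = pd.DataFrame(list(zip(names, age, sex)), columns=["Name", "Age", "Sex"])

# The column order chosen in zip is not fixed; columns can be reordered.
reordered = df[["Sex", "Name", "Age"]]
print(reordered)

# Rows can be filtered like in any other DataFrame.
older = df[df["Age"] > 50]
print(older["Name"].tolist())  # ['Walter', 'Saul']
```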
