How do I create a variable with some columns of my data frame?


I have a data frame with 10 columns and want to extract 4 of them into a new variable. I tried building a list and using it, but the code was not clean. I know there are ways to do this, but I can't remember how, and what I am trying raises an error.

variável = dados["1", "2", "3", "4"]

Generically speaking, that is what I tried.

1 answer


A pandas dataframe lets you create a "sub-dataframe", a selection of the original dataframe's columns, by passing a list of the desired column names as the item inside the brackets.

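That is, if your dataframe is in the variable df and you want a separate variable with only the columns "nome" and "endereco", just do: variavel = df[["nome", "endereco"]]. Note the double brackets: the outer pair is the indexing operation and the inner pair is the list of column names.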
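As a minimal sketch of the point above, the list of column names can also live in its own variable, which may read more cleanly. The data and the names colunas and tuple_key_failed are made up for illustration. The last part shows why the single-bracket form from the question fails on a dataframe with ordinary (non-MultiIndex) columns: pandas reads the comma-separated names as one tuple key.

```python
import pandas as pd

# Hypothetical data; only the column names come from the answer above.
df = pd.DataFrame({
    "nome": ["Ana", "Bruno"],
    "endereco": ["Rua A, 1", "Rua B, 2"],
    "idade": [30, 41],
})

colunas = ["nome", "endereco"]  # plain Python list of column names
variavel = df[colunas]          # equivalent to df[["nome", "endereco"]]

# Single brackets with several names are treated as ONE tuple key,
# ("nome", "endereco"), which is not a column of this dataframe.
try:
    df["nome", "endereco"]
    tuple_key_failed = False
except KeyError:
    tuple_key_failed = True
```

Here variavel holds only the two requested columns, and tuple_key_failed ends up True.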

The object returned by this operation is itself a dataframe, with all the methods and functionality that a dataframe has - but depending on the situation, the data in the new dataframe may be just a view of the original dataframe or a stand-alone copy. When in doubt, if you are going to change the data in the new dataframe, it is better to make a copy with the .copy() method, to make sure that the original df will not be changed.
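A small sketch of the .copy() advice, assuming the same four-column layout used in the example in this answer: the selection is copied explicitly, then modified, and the original dataframe is left untouched.

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3, 4)] * 4, columns=["col1", "col2", "col3", "col4"])

# Explicit copy: recorte owns its data, independent of df.
recorte = df[["col2", "col3"]].copy()

# Change only the copy.
recorte["col2"] = recorte["col2"] * 10
```

After this, df["col2"] still contains only 2s, while recorte["col2"] contains 20s.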

Here is a complete example of column selection on a dataframe, run in the interactive interpreter:


In [1]: import pandas as pd

In [2]: df = pd.DataFrame([(1,2,3,4)] * 4, columns=["col1", "col2", "col3", "col4"])

In [3]: df
Out[3]: 
   col1  col2  col3  col4
0     1     2     3     4
1     1     2     3     4
2     1     2     3     4
3     1     2     3     4

In [4]: recorte = df[["col2", "col3"]]

In [5]: recorte
Out[5]: 
   col2  col3
0     2     3
1     2     3
2     2     3
3     2     3
