Create new DF based on a Pandas column

I am new to Python and pandas. I have a DF with 3 columns, as in the example below:

SRC    Data1   Data2
AAA     180     122
BBB     168     121
CCC     165     147
DDD     140     156
EEE     152     103
AAA     170     100
CCC     166     112
DDD     116     155
EEE     179     119

What I need is a new DF for each value in SRC, for example:

DF_A

SRC    Data1   Data2
AAA    180     122
AAA    170     100

DF_B

SRC    Data1   Data2
BBB     168     121

and so on for every value in SRC. What I did was create a DF with the unique values of SRC:

pd.DataFrame(DataFrameBase.SRC.unique())

but I don’t know if this is really gonna help me!

Thanks for your help!

1 answer

You can get the distinct values of the SRC field with unique, and then build the DataFrames you need, for example:

uniques = df.SRC.unique()
print(uniques)
dataFrames = []

for unique in uniques:
    data = df[df.SRC == unique]
    dataFrames.append(data)

print("Size: " + str(len(dataFrames)) + '\n')

for data in dataFrames:
    print(data)
    print('\n')

After getting the distinct values, I loop over each one, build the matching subset of rows, and append it to a list.
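As an alternative to the loop above, the same split could be sketched with groupby, which yields one sub-DataFrame per distinct SRC value. This is not what the answer uses; the name frames_by_src is made up for illustration, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "SRC": ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

# groupby yields (key, sub-DataFrame) pairs, one per distinct SRC value.
# frames_by_src is a hypothetical name, not part of the original answer.
frames_by_src = {src: group for src, group in df.groupby("SRC")}

# Each DataFrame can now be looked up by its SRC value
print(frames_by_src["AAA"])
```

Keeping the results in a dict keyed by SRC means each DataFrame can be fetched by name, instead of by its position in a list.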

But if you only need specific SRC values, AAA and BBB for example, you can do the following:

df_a = df[df.SRC == 'AAA']
df_b = df[df.SRC == 'BBB']

print(df_a)
print('\n')
print(df_b)
  • Dude, that's beautiful! One last thing: how would I save the new DFs instead of printing them?

  • If you used the array, each position of the array holds a DataFrame; if you used only df_a and df_b, both are DataFrames. A DataFrame can be saved as csv or xlsx with to_csv or to_excel, for example: df_a.to_csv('df_a.csv') or df_b.to_excel('df_b.xlsx'). This saves the file in the same folder where the app is, but you can also pass the full path, for example C:/.../file.cls

  • And how would I save the arrays? I tried .to_csv but it didn’t work. I know I’m asking a lot, but I’m just starting out with pandas.

  • No problem. For the array you will need to loop over it once more, but first it is better to create an index to name the files: index = 0. Then write the loop for data in dataFrames: inside the for you go through all the DataFrames, each one stored in data, so you can call data.to_csv('dataFrame_' + str(index) + '.csv'). After that line you need to increment the index for the next file: index += 1

  • dude THANK YOU VERY MUCH
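The saving steps described in the comments above can be sketched roughly as follows. This is a minimal sketch, not the commenter's exact code: enumerate stands in for the manual index counter, the temporary output folder is an assumption made so the example runs anywhere, and index=False (which leaves the row index out of the file) is an optional choice the comments do not mention.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "SRC": ["AAA", "BBB", "CCC", "DDD", "EEE", "AAA", "CCC", "DDD", "EEE"],
    "Data1": [180, 168, 165, 140, 152, 170, 166, 116, 179],
    "Data2": [122, 121, 147, 156, 103, 100, 112, 155, 119],
})

# One DataFrame per distinct SRC value, as in the answer
dataFrames = [df[df.SRC == value] for value in df.SRC.unique()]

# Hypothetical output folder (a temp directory); replace it with your own path
out_dir = Path(tempfile.mkdtemp())

written = []
# enumerate supplies the counter, replacing the manual index = 0 / index += 1
for index, data in enumerate(dataFrames):
    path = out_dir / ("dataFrame_" + str(index) + ".csv")
    data.to_csv(path, index=False)  # index=False omits the row index column
    written.append(path)

print([p.name for p in written])
```

Calling to_csv on the list itself fails because a Python list has no such method; only each DataFrame inside it does, which is why the loop is needed.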
