How to make a dynamic table in a Pandas dataframe?

Asked

Viewed 1,271 times

1

Good afternoon. I’m migrating some activities from excel to python (an environment I’m still a layman), and came across a problem in making a dynamic table. I tried using pivot_table but could not get the result I wanted.

Basically I’m reading a csv of 2 columns (which I get in another code) and I want to make a dynamic table in both columns and create a third, generating a new dataframe.

The csv file has the "VLAN" and "Interface" columns. I want to place in 1 column all the unique appearances of VLANS associated with its Interface (column 2), in the third column how many appearances were found. link csv: https://drive.google.com/open?id=1OqbqcU5lidRtvL73cptct-oj4CMYNrhp

Excerpt from csv (contains 700 lines):

Interface   VLAN 
xe-0/0/0:2  2690 
xe-0/1/2    2691
xe-0/0/0:2  2690
xe-0/0/0:2  2690
xe-0/1/2    2691

I can count the separate appearances of VLAN and Interface, but not together.

import pandas as pd
data = pd.read_csv("MCO.csv",sep=";")
df = pd.DataFrame(data)

print(df["VLAN"].value_counts())
print(df["Interface"].value_counts())
df2 = pd.pivot_table(df,index="VLAN")
print(df2)

error output at pivot_table line:

raise DataError("No numeric types to aggregate") pandas.core.base.DataError: No numeric types to aggregate

The final result I want to arrive would be this dataframe:

VLAN    Interface   Contagem_VLAN
2689    xe-0/0/0:2  173
2690    xe-0/0/0:2  287
2691    xe-0/1/2    315

Edit 1: I managed to solve it the way I did, but it doesn’t sound very smart. I created a support column with only "1" to be able to do the sum, I don’t know if this is really necessary.

import pandas as pd
import numpy as np

data = pd.read_csv("MCO.csv",sep=";")
df = pd.DataFrame(data)
df = df.astype(str)

coluna_1 = pd.Series(1,index=np.arange(len(df)))
df.insert(loc=2,column="coluna_1",value=coluna_1)
df2 = pd.pivot_table(df,index=["VLAN","Interface"], aggfunc="sum")
print(df2)

output:

                 coluna_1
VLAN Interface           
2689 xe-0/0/0:2       173
2690 xe-0/0/0:2       287
2691 xe-0/1/2         315

1 answer

1


You can make that count through a groupby:

df.groupby(['VLAN','Interface']).size().reset_index().rename(columns={0:'contagem'})
  • Wow, extremely simple. I’ve come so far to get something like this. Thank you very much drec4s. I’m still without reputation to vote, but that line solved my problem.

  • Glad it helped! You don’t have to vote, just mark as valid ;)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.