Merging two columns into a new column named 'Class' (pandas DataFrame)

df_downsampled[df_downsampled['attack_cat']=="DoS"]

This selects all rows of the dataframe 'df_downsampled' where the column 'attack_cat' has the value 'DoS'.

Dataset: https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/

import pandas as pd

colunas = ['srcip','sport','dstip','dsport','proto','state','dur','sbytes','dbytes','sttl','dttl',
           'sloss','dloss','service','Sload','Dload','Spkts','Dpkts','swin','dwin','stcpb','dtcpb',
           'smeansz','dmeansz','trans_depth','res_bdy_len','Sjit','Djit','Stime','Ltime','Sintpkt',
           'Dintpkt','tcprtt','synack','ackdat','is_sm_ips_ports','ct_state_ttl','ct_flw_http_mthd',
           'is_ftp_login','ct_ftp_cmd','ct_srv_src','ct_srv_dst','ct_dst_ltm','ct_src_ltm','ct_src_dport_ltm',
           'ct_dst_sport_ltm','ct_dst_src_ltm','attack_cat','Label']

UNSW1 = pd.read_csv('/home/users/p02543/ddos/UNSW-NB15_1.csv', dtype={"srcip": object}, names=colunas)
UNSW2 = pd.read_csv('/home/users/p02543/ddos/UNSW-NB15_2.csv', dtype={"srcip": object}, names=colunas)
UNSW3 = pd.read_csv('/home/users/p02543/ddos/UNSW-NB15_3.csv', dtype={"srcip": object}, names=colunas)
UNSW4 = pd.read_csv('/home/users/p02543/ddos/UNSW-NB15_4.csv', dtype={"srcip": object}, names=colunas)

UNSW = pd.concat([UNSW1, UNSW2, UNSW3, UNSW4])

previsores = UNSW.iloc[:, UNSW.columns.isin((
    'Sload','Dload','Spkts','Dpkts','swin','dwin','smeansz','dmeansz',
    'Sjit','Djit','Sintpkt','Dintpkt','tcprtt','synack','ackdat','ct_srv_src','ct_srv_dst','ct_dst_ltm',
    'ct_src_ltm','ct_src_dport_ltm','ct_dst_sport_ltm','ct_dst_src_ltm'))].values  # predictor attributes

There are two columns I want to "merge":

The first is called 'Label' and has the value 1 when the row is an attack, and 0 otherwise.

In the second, 'attack_cat', I am only interested in rows where the value is 'DoS' (in which case the value of the 'Label' column is 1).

Goal:

Create a new column named 'Class' that:

Takes the values from the 'Label' column ONLY where attack_cat is 'DoS' (and 'Label' is 1)

(there are other values in attack_cat that do not interest me)

Takes ALL values from the 'Label' column where it is 0 (no attack)

How can I do this?

  • Hello, it’s a bit confusing. An example of a dataframe you have and what you want as a result would help.

  • @Miguel: I edited the question!

1 answer


From the way the question is worded, I understand that you need a single column holding the Label values where attack_cat == "DoS" together with the Label values where Label == 0. For that, the solution would be something like:

df_downsampled['Classe'] = pd.concat([(df_downsampled.Label[df_downsampled.attack_cat[df_downsampled.Label == 1]]) , (df_downsampled.Label[df_downsampled.Label == 0])], ignore_index=True)

To exclude the values that are NaN, access the column as:

df_downsampled.Classe.dropna()
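
Since the comments on this answer report problems with the pd.concat approach (a duplicate-axis error, then unwanted rows), here is a minimal sketch of one alternative that builds the column from a boolean mask instead. It assumes the category is spelled exactly "DoS" and uses a small hypothetical frame in place of df_downsampled; the values are invented for illustration. Rows belonging to other attack categories receive NaN, which dropna() then removes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_downsampled (values invented for illustration).
df_downsampled = pd.DataFrame({
    "attack_cat": [np.nan, "DoS", "Exploits", np.nan, "DoS", "Fuzzers"],
    "Label":      [0,      1,     1,          0,      1,     1],
})

# True for DoS attacks and for normal traffic, False for every other attack.
keep = (df_downsampled["attack_cat"] == "DoS") | (df_downsampled["Label"] == 0)

# Series.where keeps Label where the mask is True and writes NaN elsewhere.
df_downsampled["Classe"] = df_downsampled["Label"].where(keep)

# Dropping NaN leaves only the rows of interest, with index and order preserved.
classe = df_downsampled["Classe"].dropna()
```

Because nothing is concatenated or re-indexed, this does not depend on the index being unique, and the surviving values stay in dataset order. Note that the column becomes floating point, since NaN cannot be stored in an integer column.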

For a new DataFrame filtered on attack_cat == "DoS" (plus the rows where Label is 0), you need:

new_df_downsampled = pd.concat([df_downsampled[df_downsampled['attack_cat']=="DoS"],df_downsampled[df_downsampled.Label==0]])

  • Label values when Label = 1 AND attack_cat = 'DoS', and Label values when Label = 0

  • All in the same column? With a specific sort? @Eds

  • All in the same column. I would like to keep the order in which the rows come from the dataset, because the new column will be used in a machine learning algorithm!

  • Try what I’ve edited now.

  • I will test it! Will this code preserve the dataset order?

  • I received the following error: "ValueError: cannot reindex from a duplicate axis"

  • Can you try now? @Eds The order will be: attack_cat, then the Labels in the order they are in the dataset.

  • It did not give an error, but the result is wrong: values are appearing where Label = 1 but attack_cat != 'DoS' (these do not interest me)

  • I wanted to filter like this: df_downsampled[df_downsampled['attack_cat']="Dos"] or df_downsampled[df_downsampled['Label']=0]] but I don’t know how to do the above OR!
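
The OR filter asked about in the last comment can be written with the element-wise | operator, using == rather than = for the comparisons and wrapping each condition in parentheses. Below is a minimal sketch on a small hypothetical frame (values invented for illustration). It assumes the category is spelled "DoS"; string comparison is case-sensitive, so "Dos" would match nothing if the data uses "DoS".

```python
import pandas as pd

# Hypothetical stand-in for df_downsampled (values invented for illustration).
df_downsampled = pd.DataFrame({
    "attack_cat": [None, "DoS", "Exploits", None, "DoS"],
    "Label":      [0,    1,     1,          0,    1],
})

# | is the element-wise OR; each comparison needs its own parentheses.
mask = (df_downsampled["attack_cat"] == "DoS") | (df_downsampled["Label"] == 0)

# Boolean indexing keeps the selected rows in their original order.
new_df_downsampled = df_downsampled[mask].copy()

# After filtering, Label is 1 only on DoS rows, so it can be copied as Class.
new_df_downsampled["Class"] = new_df_downsampled["Label"]
```

Python's own `or` keyword does not work here, because it tries to reduce each whole Series to a single truth value; `|` compares row by row.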
