Good afternoon, everyone! I have a small problem with an assignment for a college course. I'm using a ready-made dataset taken from a previously published article.
The dataset looks something like this:
3,24.3,389693,21,23,tcp,1540,-------,4,11339,16091,24780100,Switch1,Router,35.529786,35.529786,35.539909,0,328.240918,505490,1540,0.236321,0,35.519662,35.550032,1,50.02192,Normal
15,24.15,201196,23,24,tcp,1540,-------,16,6274,16092,24781700,Router,server1,20.176725,20.176725,20.186848,0,328.205808,505437,1540,0.236337,0,20.156478,20.186848,1,50.030211,Normal
24.15,15,61905,23,22,ack,55,-------,16,1930,16092,885060,Router,Switch2,7.049955,7.049955,7.059958,0,328.206042,18051.3,55,0.008441,0,7.039952,7.069962,1.030045,50.060221,UDP-Flood
24.9,9,443135,23,21,ack,55,-------,10,12670,16085,884675,Router,Switch1,39.62797,39.62797,39.637973,0,328.064183,18043.5,55,0.008437,0,39.617967,39.647976,1.030058,50.060098,Normal
24.8,8,157335,23,21,ack,55,-------,9,4901,16088,884840,Router,Switch1,16.039806,16.039806,16.04981,0,328.113525,18046.2,55,0.008438,0,16.029803,16.059813,1.030054,50.061864,Normal
24.1,1,219350,21,1,ack,55,-------,2,6837,16091,885005,Switch1,clien-1,21.885768,21.885768,21.895771,0,328.297902,18056.4,55,0.00844,0,21.865762,21.895771,1.030016,50.043427,Normal
24.13,13,480053,24,23,ack,55,-------,14,13609,16103,885665,server1,Router,42.45032,42.45032,42.460323,0,328.460278,18065.3,55,0.008446,0,42.45032,42.48033,1.030032,50.055747,Normal
It's a dataset on DDoS attacks that the authors made available. I will use it to apply supervised classifiers such as Naive Bayes, Random Forest, and Multilayer Perceptron (artificial intelligence).
The language I'm using is Python (required), and I'm using NumPy to load the dataset. The call looks like this:
np.set_printoptions(formatter={'float': lambda x: "{0:0.10f}".format(x)})
X = np.loadtxt("datasetTrabalho.data", delimiter=",")
But every time I try to run it, I get an error like this:
File "trabalho.py", line 190, in <module>
main()
File "trabalho.py", line 98, in main
X = np.loadtxt("testeTrabalho.data", delimiter=",") # pega o dataset
File "/home/arthur/.local/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1101, in loadtxt
for x in read_data(_loadtxt_chunksize):
File "/home/arthur/.local/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1028, in read_data
items = [conv(val) for (conv, val) in zip(converters, vals)]
File "/home/arthur/.local/lib/python3.5/site-packages/numpy/lib/npyio.py", line 1028, in <listcomp>
items = [conv(val) for (conv, val) in zip(converters, vals)]
File "/home/arthur/.local/lib/python3.5/site-packages/numpy/lib/npyio.py", line 746, in floatconv
return float(x)
ValueError: could not convert string to float: 'tcp'
I need help converting the string values in this dataset to integer values so that I can use the classifiers required for the assignment. I would also be interested if anyone knows another library that solves this problem. I will be grateful for the help.
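For reference, the kind of conversion being asked about could be sketched roughly as follows, using only the standard library. This is not a confirmed solution from the thread: the `SAMPLE` rows are shortened versions of the rows above, and the helper names `is_number` and `encode_rows` are made up for illustration. Each column that contains any non-numeric value gets its own string-to-integer mapping, and all other columns are parsed as floats.

```python
import csv
import io

# Shortened rows based on the dataset above (columns dropped for brevity;
# this abbreviated sample is an assumption, not the real file).
SAMPLE = """3,24.3,389693,21,23,tcp,1540,-------,4,Switch1,Router,35.529786,Normal
24.15,15,61905,23,22,ack,55,-------,16,Router,Switch2,7.049955,UDP-Flood
24.9,9,443135,23,21,ack,55,-------,10,Router,Switch1,39.62797,Normal
"""


def is_number(text):
    """Return True if the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def encode_rows(rows):
    """Encode string columns as integer codes (hypothetical helper).

    Returns (encoded, mappings): `encoded` is a list of rows of floats,
    and `mappings` maps each non-numeric column index to the
    {string: code} dictionary used for that column.
    """
    n_cols = len(rows[0])
    encoded = [[0.0] * n_cols for _ in rows]
    mappings = {}
    for col in range(n_cols):
        values = [row[col] for row in rows]
        if all(is_number(v) for v in values):
            # Purely numeric column: parse as float.
            for i, v in enumerate(values):
                encoded[i][col] = float(v)
        else:
            # Column with strings: assign codes in order of first appearance.
            codes = {}
            for i, v in enumerate(values):
                codes.setdefault(v, len(codes))
                encoded[i][col] = float(codes[v])
            mappings[col] = codes
    return encoded, mappings


rows = list(csv.reader(io.StringIO(SAMPLE)))
encoded, mappings = encode_rows(rows)
print(mappings[5])   # mapping for the protocol column in this sample
print(encoded[1])    # second row, now entirely numeric
```

The resulting list of lists is entirely numeric, so it can be passed to `np.array(encoded)`. Two caveats: the codes are assigned per column, so the same string (for example `Router`) can receive different codes in different columns, and integer codes impose an arbitrary ordering that some classifiers may treat as meaningful. Libraries such as pandas and scikit-learn also offer encoders for this purpose.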
Solved, thank you very much!!!
– Arthur Abitante
@Arthurabitante that's great!! :)
– Leonardo Bohac