I wrote this function to calculate the Shannon entropy of a string array. The vector is filled with IP addresses. I would like to optimize the function, because when the input vector is very large the calculation takes too long.
function [entropy] = ShannonEntropy(IPs)
%% ENTROPY
% Calculates the Shannon entropy of a string vector
%% Copy vector IPs
searchIP = IPs;
%% Determine the size of the vector
% (named n to avoid shadowing the built-in size function)
n = length(IPs);
%% Initialize vector
% P will be used to calculate the probabilities
P = zeros(n, 1);
%% Find occurrences
for i = 1:n
    for j = 1:n
        if (searchIP(i) == IPs(j) && IPs(j) ~= "-1")
            P(i) = P(i) + 1;
            IPs(j) = "-1";   % mark this entry as already counted
        end
    end
end
%% Calculate probabilities
P = P/n;
%% Replace 0 with 1
% i.e. 0*log(0) = NaN, 1*log(1) = 0
P(P == 0) = 1;
%% Initialize vector
partialEntropy = zeros(n, 1);
%% Calculate entropy
for i = 1:n
    partialEntropy(i) = P(i)*log(P(i));
end
%% Entropy sum
entropy = -sum(partialEntropy);
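The nested loops above compare every entry with every other entry, so the running time grows quadratically with the length of the vector. One possible way to avoid that, as a minimal sketch only, is to let `unique` group the addresses and `accumarray` count each group. The function name `ShannonEntropyFast` is hypothetical, and the sketch assumes `IPs` is a non-empty string array (or cell array of character vectors); it keeps the natural logarithm used above.

```matlab
function H = ShannonEntropyFast(IPs)
% Hypothetical vectorized variant of ShannonEntropy (name is an assumption).
% Assumes IPs is a non-empty string array or cell array of char vectors.
    IPs = IPs(:);                    % treat the input as a column vector
    n = numel(IPs);                  % total number of entries
    [~, ~, idx] = unique(IPs);       % idx(k) is the group number of IPs(k)
    counts = accumarray(idx(:), 1);  % occurrences of each distinct address
    p = counts / n;                  % probabilities, all strictly positive
    H = -sum(p .* log(p));           % Shannon entropy (natural logarithm)
end
```

For example, `ShannonEntropyFast(["10.0.0.1"; "10.0.0.2"; "10.0.0.1"; "10.0.0.2"])` gives `log(2)`, about 0.6931. Since `unique` sorts the input, the cost should be roughly O(n log n) rather than O(n^2). No `"-1"` sentinel is needed, so an input that really contains the string `"-1"` is counted like any other value, which differs from the loop version.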
Hi, questions should be asked in English (please edit your question). Also, do you want to calculate the entropy of each column?
– ederwander
That's right, I want to calculate the entropy of each column.
– Pedro Canassa
Without a usage sample for each IP, we cannot apply the Hartley formula I(U) = logᵦ r, which will determine the probability of each result.
– Augusto Vasques
To calculate only the entropy, you need nothing more than the data in each column. If you also need to apply Hartley, then yes, you will need to know the sampling rate.
– ederwander