How can I optimize a Shannon entropy calculation function?

I wrote this function to calculate the Shannon entropy of a string array. The vector is filled with IP addresses. I would like to optimize the function, because when the input vector is very large the calculation takes too long.

function [entropy] = ShannonEntropy(IPs)
%% ENTROPY
 % Calculates the Shannon entropy (natural log) of a string vector

%% Copy the input vector
 % searchIP keeps the original values; IPs is overwritten with "-1" markers
searchIP = IPs;

%% Determine the length of the vector
 % n is used instead of "size" to avoid shadowing the built-in size function
n = length(IPs);

%% Initialize vector
 % P will hold the occurrence counts, then the probabilities
P = zeros(n, 1);

%% Find occurrences
 % P(i) receives the count at the first occurrence of each distinct value;
 % counted entries are marked "-1" so they are not counted again
for i = 1:n
 for j = 1:n
  if (searchIP(i) == IPs(j) && IPs(j) ~= "-1")
   P(i) = P(i) + 1;
   IPs(j) = "-1";
  end
 end
end

%% Calculate probabilities
P = P/n;

%% Replace 0 with 1
 % i.e. 0*log(0) = NaN, while 1*log(1) = 0
P(P == 0) = 1;

%% Initialize vector
partialEntropy = zeros(n, 1);

%% Calculate entropy
for i = 1:n
 partialEntropy(i) = P(i)*log(P(i));
end

%% Sum the partial entropies
entropy = -sum(partialEntropy);
end
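
The nested loops above compare every pair of elements, so the run time grows roughly with the square of the vector length. One possible way to avoid that is to count the distinct values with unique and accumarray, which sorts the data once instead. The following is only a sketch under assumptions: the function name shannonEntropyFast is hypothetical, the input is assumed to be a string vector (or something string() can convert), and entries equal to "-1" are counted as ordinary values rather than being ignored as in the loop version.

```matlab
function H = shannonEntropyFast(IPs)
% Shannon entropy (natural log) of a string vector, computed from value counts.
% shannonEntropyFast is a hypothetical name, not part of the original post.

    IPs = string(IPs(:));        % force a column string array
    n = numel(IPs);
    if n == 0
        H = 0;                   % assumption: empty input has zero entropy
        return
    end

    % idx(k) is the index of the distinct value that IPs(k) belongs to
    [~, ~, idx] = unique(IPs);

    % Number of occurrences of each distinct value
    counts = accumarray(idx, 1);

    % Relative frequencies; every entry is > 0, so log(p) is finite
    p = counts / n;

    H = -sum(p .* log(p));
end
```

For example, shannonEntropyFast(["10.0.0.1"; "10.0.0.1"; "10.0.0.2"; "10.0.0.2"]) returns log(2), about 0.6931, since both addresses have probability 1/2. For a string matrix M with one series per column (M is a hypothetical name), arrayfun(@(c) shannonEntropyFast(M(:, c)), 1:size(M, 2)) should give one entropy per column.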
  • Hi, questions should be asked in English (please edit your question). Do you want to calculate the entropy of each column?

  • Yes, I want to calculate the entropy of each column.

  • Without sampling each IP, we cannot apply the Hartley formula I(U) = logᵦ r, which will determine the probability of each result.

  • To calculate only the entropy, you need nothing more than the data of each column. If you also need to apply Hartley, then yes, you will need to know the sampling rate.
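
For reference, the two measures mentioned in these comments are related as follows. This is a sketch assuming r distinct values with relative frequencies p_i and logarithm base b; the symbols H and p_i are introduced here for illustration and do not appear in the comments.

```latex
H = -\sum_{i=1}^{r} p_i \log_b p_i ,
\qquad
p_i = \frac{1}{r} \ \text{for all } i
\;\Longrightarrow\;
H = \log_b r = I(U)
```

In other words, the Hartley value is the Shannon entropy of the uniform case, and in general H is at most logᵦ r. The function in the question uses log, so b = e there.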

No answers
