Is there a more efficient way to create an array from another array dynamically, filtering the contents of the first one?

Viewed 1,932 times

15

I have an array of values that can include several np.nan entries:

import numpy as np
a = np.array ( [1, 2, np.nan, 4] )

And I want to iterate over its items to create a new array without the np.nan values.

The way I know to create arrays dynamically is to create an array of zeros (np.zeros()) and fill it with the content of interest afterwards.

With my approach, I have to iterate over the array a twice: once to count how many np.nan values it contains, so I can subtract that number from the size of array b, and a second time to populate array b:

# Count how many NaNs there are
count = 0
for e in a:
    if np.isnan(e):
        count += 1

# Create the empty array with the right size
size = a.shape[0]
b = np.zeros((size - count,))

# Fill the array with the relevant content
ind = 0
for e in a:
    if not np.isnan(e):
        b[ind] = e
        ind += 1

I imagine it is also possible to do this by converting a to a list (since it is one-dimensional), filtering that list into a list b, and then converting b to an array.
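For reference, the list-based route described above might look roughly like the sketch below. The names values and kept are only illustrative and do not appear in the original code.

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

# Convert the one-dimensional array to a plain Python list
values = a.tolist()

# Keep only the items that are not NaN (illustrative name: kept)
kept = [e for e in values if not np.isnan(e)]

# Convert the filtered list back to an array
b = np.array(kept)
```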

But is there a more efficient way to do this using only arrays?

3 answers

16


You can filter the values using an expression in the index:

import numpy as np
a = np.array ( [1, 2, np.nan, 4] )

# Filter out NaN
filtrado = a[~np.isnan(a)]

The expression np.isnan(a) returns an array of booleans indicating, for each position of the array a, whether or not the value there is NaN. The ~ operator negates this array. Boolean indexing then selects only the elements for which ~np.isnan(a) is True.
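To make the intermediate steps visible, here is a small sketch that splits the one-liner into its parts. The names mask and keep are introduced only for illustration.

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

# True where the element is NaN
mask = np.isnan(a)      # [False, False,  True, False]

# Negate the mask: True where the element should be kept
keep = ~mask            # [ True,  True, False,  True]

# Boolean indexing returns a new array with only the kept elements
filtrado = a[keep]      # [1., 2., 4.]
```

Note that a itself is not modified; the indexing produces a new array.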

  • 1

    An alternative would be filter(lambda n: ~np.isnan(n), a), but I believe that the boolean indexing suggested by @rodrigorgs is more efficient. :)

2

I believe the standard solution to your problem is to use the filter function, whose syntax is:

filter(função_booleana, valor_interavel)

For each value in valor_interavel, filter calls função_booleana with that value and drops the value from the result if the function returns false. Since np.isnan returns true for the values you want to discard, you need to negate it:

filter(lambda x: not np.isnan(x), seu_array)

Best of all, the solution is compact and clear. Note that you do not need to import any module to get the filter function, since it is implemented by the Python interpreter.
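A minimal runnable sketch of this approach follows, assuming Python 3, where filter returns a lazy iterator rather than a list. The names nao_nan and resultado are illustrative only.

```python
import numpy as np

seu_array = np.array([1, 2, np.nan, 4])

# filter keeps the items for which the function returns a true value,
# so the predicate here is "is not NaN"
nao_nan = filter(lambda x: not np.isnan(x), seu_array)

# In Python 3 the result is an iterator; materialize it and
# convert back to an array
resultado = np.array(list(nao_nan))
```

Because this loops over the elements in Python, it is likely to be slower than boolean indexing on large arrays.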

0

You can use Python's set, which removes repeated items from a list.

>>> a = [1,2,3]
>>> b = a + [4,5,6,3,2,1]
>>> print b
[1, 2, 3, 4, 5, 6, 3, 2, 1]
>>> print set(b)
set([1, 2, 3, 4, 5, 6])

I think that solves your problem!

  • In reality, that does not solve it. I want to remove items from the list A from the contents of the list B, not just make a single list. The answer actually answers the question. Thank you, anyway. ;)
