Select a numpy column inside a for loop

2

I'll be brief. I'd like to know how, inside the for loop, to set the value in the second column of the matrix to 1 whenever the value in the first column of the same row is less than zero.

from scipy import stats
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

#Exercise 1:

x = np.array([7.3,8.2,6.0,7.7,8.0,6.1,5.6,5.3,5.9,5.8,5.8,7.1,5.1,8.0,7.6,8.3,4.9,6.5])
y = np.array([7.5,6.2,5.7,4.4,4.7,5.8,5.0,6.0,6.5,5.8,4.5,5.1,5.5,6.0,5.8,5.8,5.7,7.5])
m = 100000
matrix = np.zeros((m,2))
matrix

for i in range(0,m):
    matrix[i,0] = np.mean(np.random.normal(x)) - np.mean(np.random.normal(y))
    
    if matrix[i,0]<0:
        matrix[0,i] = 1
    
dados = matrix[0: , :1]
sns.distplot(dados)
plt.title("Histograma da Distribuição Amostral da diferença das Médias")
plt.xlabel("Respostas")
plt.ylabel("Frequência")
plt.show()

2 answers

0

To select an element of a matrix stored as a numpy.array, write:

matrix[n_linha, n_coluna]

where n_linha is an int giving the row number and n_coluna is an int giving the column number.

Note: in Python, indexing starts at 0.

Examples:

  • matrix[0,0] is the element in the first row and first column;
  • matrix[19,1] is the element in the 20th row and second column.

So, to select the element in the second column of row i, write:

matrix[i,1]

or, equivalently:

matrix[i][1]
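
As a quick check of the indexing rules above, here is a small sketch. The 3x2 array `demo` is made up for illustration and is not part of the question's data.

```python
import numpy as np

# Hypothetical 3x2 array, used only to illustrate indexing
demo = np.array([[10.0, 11.0],
                 [20.0, 21.0],
                 [30.0, 31.0]])

first = demo[0, 0]       # first row, first column (indexing starts at 0)
last = demo[2, 1]        # third row, second column
chained = demo[2][1]     # same element, selecting the row first and then the column

print(first, last, chained)
```

Both forms return the same element. The single-bracket form `demo[2, 1]` is the more common numpy style.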

Then, to put the number 1 in the second column whenever the element in the first column is less than zero, write:

for i in range(0,m):

    if matrix[i,0]<0:
        matrix[i,1] = 1

0

Good morning! I propose the solution below, which removes the need for a second for loop and for the if. A for loop works, but numpy's own functions for handling matrices and vectors are usually more efficient.

Solution using np.where: called with only a condition, np.where returns the indices of the rows where matrix[:,0]<0 holds. Those indices are then used to select the matching rows and set column 1 to 1 in a single assignment.

matrix[np.where(matrix[:,0]<0), 1] = 1
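
To see what np.where returns here, the sketch below runs it on a small array. The array `demo` and the names `mask`, `rows`, `flagged` and `flagged_mask` are made up for illustration. It also shows that indexing with the boolean condition directly gives the same result.

```python
import numpy as np

# Hypothetical 4x2 array: column 0 holds the values, column 1 starts at zero
demo = np.array([[-0.5, 0.0],
                 [ 0.3, 0.0],
                 [-1.2, 0.0],
                 [ 0.8, 0.0]])

mask = demo[:, 0] < 0        # boolean array, True where column 0 is negative
rows = np.where(mask)[0]     # np.where returns a tuple; [0] is the array of row indices

flagged = demo.copy()
flagged[rows, 1] = 1         # set column 1 to 1 on the selected rows

flagged_mask = demo.copy()
flagged_mask[mask, 1] = 1    # same assignment, using the boolean mask directly

print(rows)
print(flagged)
```

The boolean-mask form skips the call to np.where. Which one to use is mostly a matter of taste.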

New version of your code:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

x = np.array([7.3,8.2,6.0,7.7,8.0,6.1,5.6,5.3,5.9,5.8,5.8,7.1,5.1,8.0,7.6,8.3,4.9,6.5])
y = np.array([7.5,6.2,5.7,4.4,4.7,5.8,5.0,6.0,6.5,5.8,4.5,5.1,5.5,6.0,5.8,5.8,5.7,7.5])

m = 100000
matrix = np.zeros((m,2))
matrix

##PROPOSED SOLUTION - PART 1:
#I removed the second for loop and replaced it with the commented line further down
for i in range(0,m):
    matrix[i,0] = np.mean(np.random.normal(x)) - np.mean(np.random.normal(y))

##PROPOSED SOLUTION - PART 2:
#This line answers your question without needing for or if
matrix[np.where(matrix[:,0] < 0), 1] = 1

dados = matrix[0: , :1]
sns.distplot(dados)
plt.title("Histograma da Distribuição Amostral da diferença das Médias")
plt.xlabel("Respostas")
plt.ylabel("Frequência")
plt.show()
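
Going one step further, the remaining for loop can probably be vectorized as well. The sketch below is not part of the original answer. It assumes that drawing all samples at once with `size=(m, x.size)` is acceptable, and it uses `np.random.default_rng` with an arbitrary seed instead of the global `np.random.normal`, so the exact numbers will differ from the loop version. The names `rng`, `sim_x`, `sim_y` and `prop_negative` are introduced here for illustration.

```python
import numpy as np

x = np.array([7.3,8.2,6.0,7.7,8.0,6.1,5.6,5.3,5.9,5.8,5.8,7.1,5.1,8.0,7.6,8.3,4.9,6.5])
y = np.array([7.5,6.2,5.7,4.4,4.7,5.8,5.0,6.0,6.5,5.8,4.5,5.1,5.5,6.0,5.8,5.8,5.7,7.5])
m = 100000

rng = np.random.default_rng(0)  # assumption: arbitrary seed, only for reproducibility

# One row per replicate; entry [i, j] is drawn from Normal(loc=x[j], scale=1),
# matching the default scale used by np.random.normal(x) in the loop version
sim_x = rng.normal(loc=x, scale=1.0, size=(m, x.size))
sim_y = rng.normal(loc=y, scale=1.0, size=(m, y.size))

matrix = np.zeros((m, 2))
matrix[:, 0] = sim_x.mean(axis=1) - sim_y.mean(axis=1)  # difference of means per replicate
matrix[matrix[:, 0] < 0, 1] = 1                         # flag the negative differences

prop_negative = matrix[:, 1].mean()  # share of replicates with a negative difference
print(prop_negative)
```

This trades memory for speed: it holds two arrays of shape (m, 18) at once, which is fine for m = 100000 but worth keeping in mind for much larger m.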
