loop in correlation matrix in R

Question

Asked 9 years, 4 months ago

Viewed 878 times

5

I have been trying to learn about loops and functions in R, so I set up the following situation. I have a correlation matrix:

dados<-matrix(rnorm(100),5,5)
colnames(dados)<-c('A','B','C','D','E')
rownames(dados)<-c('A','B','C','D','E')
dados
cor<-cor(dados)

I want to use a loop and if conditions to get only the combinations of variables with values > 0.5 in the cor object. However, I can't find a way to iterate over the rows and columns of my matrix.

I’ve been trying the following code:

for (i in 1:nrow(cor)){
  for (j in 1:ncol(cor)){
    # command to compare pair by pair
    if (cor[i,j]>0.5){
      # return a new matrix with the variables and the value > 0.5
    }
  }
}

Can someone help me fill in these commands?

2 answers

0

An easy way is to use the melt function from the reshape2 package, as explained in that question (in English).

Code:

install.packages("reshape2")             # Only if you have not installed the package yet
library(reshape2)

set.seed(0101)
dados <- matrix(rnorm(100),5,5)
colnames(dados) <- c('A','B','C','D','E')
rownames(dados) <- c('A','B','C','D','E')
corMatrix <- cor(dados)                    # Try to use variable names that are not
                                           # also function names

CM <- corMatrix                            # Copy the matrix
CM[lower.tri(CM, diag = TRUE)] <- NA       # Remove the duplicated correlations and the diagonal
resultado <- subset(                       # Keep only the rows whose correlation is above
    melt(CM, na.rm = TRUE),                # the threshold you want (0.5 here)
    value > 0.5)
rownames(resultado) <- NULL                # (optional) reset the row names

Output:

> resultado
  Var1 Var2     value
1    C    D 0.5215197
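
If installing reshape2 is not an option, the same wide-to-long reshaping can be sketched in base R with as.table() and as.data.frame(). This is only an illustration, not part of the original answer: the 3x3 matrix cm and its values are made up, and the names long and pairs_above are hypothetical.

```r
# Small hand-written symmetric matrix standing in for cor(dados) (made-up values)
cm <- matrix(c(1.0, 0.7, 0.2,
               0.7, 1.0, 0.6,
               0.2, 0.6, 1.0),
             nrow = 3,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Blank out the lower triangle and the diagonal so each pair appears once
cm[lower.tri(cm, diag = TRUE)] <- NA

# as.table() + as.data.frame() gives one row per cell: Var1, Var2, Freq
long <- as.data.frame(as.table(cm), stringsAsFactors = FALSE)
names(long)[3] <- "value"

# Keep only the pairs above the threshold, skipping the NA cells
pairs_above <- subset(long, !is.na(value) & value > 0.5)
rownames(pairs_above) <- NULL
pairs_above
```

With this toy matrix, the pairs A/B and B/C are the only ones that pass the 0.5 threshold.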

One tip about your code: do not use the name of a function as a variable name, as you did with the matrix cor.

by Carlos Cinelli • **16,826** points · Answer 1 · 2016-03-06T22:52:01+00:00

Assuming you want to use loops (for practice or for some other reason, because in this case you don't actually need them), you can save the results in a list.

Recreating your data (with set.seed() for reproducibility):

set.seed(10)
dados <- matrix(rnorm(100),5,5)
colnames(dados) <- c('A','B','C','D','E')
rownames(dados) <- c('A','B','C','D','E')
cor <- cor(dados)

Looping over the matrix and saving the results in a list:

# list to store the results
resultados <- list()

for (i in 1:nrow(cor)){
  for (j in 1:ncol(cor)){
    if (cor[i,j]>0.5){
      # store the row at the first level and the column at the second level
      resultados[[rownames(cor)[i]]][[colnames(cor)[j]]] <- cor[i,j]
    }
  }
}

resultados
$A
        A         C 
1.0000000 0.7764006 

$B
       B        E 
1.000000 0.912793 

$C
        A         C 
0.7764006 1.0000000 

$D
D 
1 

$E
       B        E 
0.912793 1.000000

With the list in hand, you can arrange the data however you want. For example, the simplest way to turn it into a vector is with unlist().

unlist(resultados)

     A.A       A.C       B.B       B.E       C.A       C.C       D.D       E.B       E.E 
1.0000000 0.7764006 1.0000000 0.9127930 0.7764006 1.0000000 1.0000000 0.9127930 1.0000000
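
As another possible arrangement, the nested list could be flattened into a data frame with one row per pair. The sketch below is an assumption-laden illustration rather than the answer's own method: the list is written by hand with rounded, made-up values so the example runs on its own, and the name tabela and its column names are hypothetical.

```r
# Hand-written stand-in for the nested `resultados` list (made-up values)
resultados <- list(
  A = c(A = 1.0, C = 0.78),
  C = c(A = 0.78, C = 1.0)
)

# One data-frame row per (row, column, value) triple
tabela <- data.frame(
  row    = rep(names(resultados), lengths(resultados)),
  column = unlist(lapply(resultados, names), use.names = FALSE),
  value  = unlist(resultados, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop the trivial self-correlations on the diagonal
tabela <- tabela[tabela$row != tabela$column, ]
rownames(tabela) <- NULL
tabela
```

Each off-diagonal pair still appears twice here (A/C and C/A), because the loop visits both halves of the symmetric matrix.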

But remember that you do not need loops in this case. For example, one way to get the same result as above is:

indices <- which(cor > 0.5, arr.ind = TRUE)
res <- setNames(cor[indices], paste(colnames(cor)[indices[,2]], rownames(cor)[indices[,1]], sep = "."))
res
     A.A       A.C       B.B       B.E       C.A       C.C       D.D       E.B       E.E 
1.0000000 0.7764006 1.0000000 0.9127930 0.7764006 1.0000000 1.0000000 0.9127930 1.0000000
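
The result above still contains the diagonal (always 1) and both copies of each pair. If only the distinct pairs are wanted, one option is to combine the threshold with upper.tri(). A minimal sketch, assuming a made-up 3x3 matrix cm in place of the real correlation matrix (the names cm, idx and res are illustrative only):

```r
# Small hand-written symmetric matrix standing in for cor(dados) (made-up values)
cm <- matrix(c(1.0, 0.7, 0.2,
               0.7, 1.0, 0.6,
               0.2, 0.6, 1.0),
             nrow = 3,
             dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# upper.tri() is TRUE only strictly above the diagonal, so the diagonal
# and the mirrored lower half are excluded from the match
idx <- which(cm > 0.5 & upper.tri(cm), arr.ind = TRUE)

# Name each value "row.column" using the matched indices
res <- setNames(cm[idx],
                paste(rownames(cm)[idx[, 1]], colnames(cm)[idx[, 2]], sep = "."))
res
```

For this toy matrix, res holds two named values, A.B and B.C.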