Generation of samples in R

I have a dataset with 200 observations. I drew a sample of size 100 without replacement using the following commands:

library(car)
require(car)
(amostra1= some(dados,n=100,replace=F))
write.xlsx(amostra1,"C:/Users/../Desktop/amostra1.xlsx")

I am also interested in the observations that were not sampled.
The question is: which command(s) in R should I use to get the observations that were not sampled?

  • Calling both library(car) and require(car) is redundant. Keep the first, remove the second.

1 answer

Given the way you are drawing amostra1, the natural way to get the remaining data is either with %in% and which, or with match.
First I’ll create a vector dados.

library(car)

set.seed(7437)    # Makes the results reproducible
dados <- rnorm(200)

Now select the remaining observations.

amostra1 <- some(dados, n = 100, replace = FALSE)

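# Option 1: positions in dados whose values do not appear in the sample
i1 <- which(!dados %in% amostra1)
dados[i1]

# Option 2: positions of the sampled values in dados, dropped with a negative index
i2 <- match(amostra1, dados)
dados[-i2]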

identical(dados[i1], dados[-i2])
#[1] TRUE
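
One caveat with the value-based approach: %in% and match compare values, not positions, so they can give the wrong answer when the data contain repeated values (practically impossible with rnorm, but common in real data). The sketch below illustrates this for %in% with a made-up vector x and hand-picked positions idx; neither comes from the original post. match has a related limitation, since it returns only the first matching position.

```r
# Hypothetical vector with a repeated value (5 appears twice)
x <- c(5, 7, 5, 9)

# Suppose positions 1 and 2 were the ones sampled
idx <- c(1, 2)
amostra <- x[idx]                  # 5 7

# Matching by value also drops the unsampled 5 at position 3
by_value <- x[!x %in% amostra]     # 9

# Negating the sampled positions keeps it
by_index <- x[-idx]                # 5 9

identical(by_value, by_index)      # FALSE
```

When duplicates are possible, sampling the indices (as in the next example) is the safer choice.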

If, instead of sampling dados directly, you sample the indices of the vector dados, simply negate that index vector to obtain the others.

j <- some(seq_along(dados), n = 100, replace = FALSE)

amostra1 <- dados[j]
outros1 <- dados[-j]
  • Thanks for the answer! Your suggestions work well for simulated data, but how would they be adapted to a dataset read in as follows? data <- read.xlsx("training.xlsx", sheetName = "Plan1"); names(data); sample1 = with(data, sample(100, replace = F)); sample1; write.xlsx(sample1, "C:/Users/Juliana/Desktop/sample1.xlsx")

  • @Maicon In this case you have a data.frame. So you have to adapt the code for a two-dimensional object: dados[j, ] and dados[-j, ]. And the same for the other examples, with the indexes i1 and i2.
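
To make the data.frame case concrete, here is a minimal sketch. It assumes a made-up data frame named dados with 200 rows, standing in for the spreadsheet read with read.xlsx, and it samples the row numbers with base R's sample() rather than car::some(). The column names id and valor are hypothetical.

```r
set.seed(7437)  # makes the results reproducible

# Hypothetical stand-in for the data read from the spreadsheet
dados <- data.frame(id = 1:200, valor = rnorm(200))

# Sample 100 row indices without replacement
j <- sample(nrow(dados), size = 100, replace = FALSE)

amostra1 <- dados[j, ]    # sampled rows
outros1  <- dados[-j, ]   # rows that were not sampled

# Sanity checks: the two pieces are disjoint and together cover every row
nrow(amostra1) + nrow(outros1) == nrow(dados)
length(intersect(amostra1$id, outros1$id)) == 0
```

Either piece can then be written to a file the same way the question does for amostra1.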
