In principle your code is correct and should subset the data. Something else may have gone wrong, but that can only be checked against your specific case.
Here it is on a sample data frame:
set.seed(1)
df <- data.frame(valor= rnorm(100), categoria = rep(c("AB", "AC"), 50), stringsAsFactors=FALSE)
dr <- subset(df, df[2]=="AC")
Note that dr contains only the rows whose second column is "AC":
unique(dr[2])
categoria
2 AC
head(dr)
valor categoria
2 0.1836433 AC
4 1.5952808 AC
6 -0.8204684 AC
8 0.7383247 AC
10 -0.3053884 AC
12 0.3898432 AC
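As a small sketch of the same filter, subset() also lets you refer to the column by name, which tends to read more clearly than the positional df[2]. The names dr_named and dr_pos are only illustrative; the data frame is the same sample built above.

```r
set.seed(1)
df <- data.frame(valor = rnorm(100),
                 categoria = rep(c("AB", "AC"), 50),
                 stringsAsFactors = FALSE)

# Inside subset() the column can be referenced directly by name.
dr_named <- subset(df, categoria == "AC")

# Positional version, as in the original call.
dr_pos <- subset(df, df[2] == "AC")

# Both versions should select the same rows.
all(rownames(dr_named) == rownames(dr_pos))

# Quick sanity checks on the result.
nrow(dr_named)                    # 50, half of the 100 rows
all(dr_named$categoria == "AC")   # TRUE
```

If a check like the last one fails on your real data, the cause is usually in the data (stray whitespace, different capitalization, missing values) rather than in the subset() call.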
There are several other ways to filter a data frame. One is to use R's [ operator. For example:
dr <- df[df[2]=="AC", ]
or
dr <- df[df$categoria=="AC", ]
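One difference worth knowing, in case your real data has missing values: with [, an NA in the condition produces a row of NAs in the result, whereas subset() drops those rows. The sketch below uses a tiny made-up data frame (df_na, not from the question) just to show the behavior.

```r
# Hypothetical data with one missing category.
df_na <- data.frame(valor = c(1.5, 2.0, 3.2, 4.1),
                    categoria = c("AB", "AC", NA, "AC"),
                    stringsAsFactors = FALSE)

# With `[`, each NA in the logical condition yields an all-NA row.
with_na <- df_na[df_na$categoria == "AC", ]
nrow(with_na)   # 3: two matches plus one NA row

# Wrapping the condition in which() discards the NAs.
no_na <- df_na[which(df_na$categoria == "AC"), ]
nrow(no_na)     # 2

# subset() treats NA as FALSE, so it also returns only the matches.
nrow(subset(df_na, categoria == "AC"))   # 2
```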
There are also packages dedicated to data manipulation. An excellent one is dplyr, because it is quite fast and has an intuitive syntax (for example, the filtering command is called "filter").
With dplyr it would look like this:
library(dplyr)
dr <- df%>%filter(categoria=="AC")
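To give an idea of why the syntax is convenient, here is a short sketch that combines conditions, assuming dplyr is installed. The result names dr_pos and dr_any are only illustrative.

```r
library(dplyr)

set.seed(1)
df <- data.frame(valor = rnorm(100),
                 categoria = rep(c("AB", "AC"), 50),
                 stringsAsFactors = FALSE)

# Conditions separated by commas are combined with AND.
dr_pos <- df %>% filter(categoria == "AC", valor > 0)

# %in% matches a column against several values at once.
dr_any <- df %>% filter(categoria %in% c("AB", "AC"))

nrow(dr_any)   # 100, since every row is either "AB" or "AC"
```

Like subset(), filter() drops rows where the condition evaluates to NA.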
If you are going to work with data sets a lot, it is worth taking a look.
In principle your code is fine. What error are you getting? Please also post a sample of your data.
– Carlos Cinelli