I can select the k greatest results from a table in R. For example, if k equals 5, I get the following result:
library(dplyr)
library(ggplot2)
top_n(mpg, 5, wt=displ)
# A tibble: 5 × 11
manufacturer model displ year cyl trans drv cty
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int>
1 chevrolet corvette 6.2 2008 8 manual(m6) r 16
2 chevrolet corvette 6.2 2008 8 auto(s6) r 15
3 chevrolet corvette 7.0 2008 8 manual(m6) r 15
4 chevrolet k1500 tahoe 4wd 6.5 1999 8 auto(l4) 4 14
5 jeep grand cherokee 4wd 6.1 2008 8 auto(l5) 4 11
# ... with 3 more variables: hwy <int>, fl <chr>, class <chr>
However, the results are not sorted by the column displ. I would like the rows to be in descending order, as follows:
top_n(mpg, 5, wt=displ)[order(top_n(mpg, 5, wt=displ)$displ, decreasing=TRUE), ]
# A tibble: 5 × 11
manufacturer model displ year cyl trans drv cty
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int>
1 chevrolet corvette 7.0 2008 8 manual(m6) r 15
2 chevrolet k1500 tahoe 4wd 6.5 1999 8 auto(l4) 4 14
3 chevrolet corvette 6.2 2008 8 manual(m6) r 16
4 chevrolet corvette 6.2 2008 8 auto(s6) r 15
5 jeep grand cherokee 4wd 6.1 2008 8 auto(l5) 4 11
# ... with 3 more variables: hwy <int>, fl <chr>, class <chr>
The code works, but I find it ugly. How could I simplify it to get the same result? Note that I call top_n(mpg, 5, wt=displ) twice, which I imagine could slow my code down if the table is very large. Is there a more elegant way to get this same result?
About the "ugly" part: wouldn't using the dplyr syntax with %>% help? – Tomás Barcellos
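A minimal sketch of what the comment seems to suggest, assuming the same mpg data from ggplot2 and the same k = 5: piping the result of top_n() into arrange() means the selection is computed only once and then sorted. The name top5 is just an illustrative variable, not something from the question.

```r
library(dplyr)
library(ggplot2)  # only needed here for the mpg dataset

# Pick the 5 rows with the largest displ (computed a single time),
# then sort those rows by displ in descending order.
top5 <- mpg %>%
  top_n(5, wt = displ) %>%
  arrange(desc(displ))

top5
```

Because top_n() appears only once in the pipeline, the repeated call from the original one-liner goes away. In newer dplyr versions (1.0.0 or later, which is an assumption about the installed version), slice_max(mpg, displ, n = 5) may be worth trying as well, since top_n() has been superseded there.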