How to return the most prevalent category associated with a group?

Question

Asked 6 years, 10 months ago

Viewed 122 times

I have a data set in which the variable a is the grouping variable and b is a categorical variable. My goal is, within each group of a, to return the category of b that appears most often.

Consider the dput:

dataset=structure(list(a = c(500, 500, 500, 400, 400, 400, 300, 300, 
300), b = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("a", 
"b"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))

Desired result:

In addition, it would be useful to return the counts and percentages of this predominance. Something like:

a    b    count    percent
500  a    2        .66 #66%
400  b    2        .66 #66% 
300  a    2        .66 #66%

1 answer

by Rafael Cunha • **4,954** points · Answer 1 · 2018-10-17T16:20:16+00:00

Using the package dplyr:

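library(dplyr)

dataset %>%
  group_by(a, b) %>%                        # one group per (a, b) pair
  summarise(count = n()) %>%                # rows per pair; the result stays grouped by a
  mutate(percent = count / sum(count)) %>%  # share of each category within its group of a
  filter(count == max(count))               # keep the most frequent category; ties are all kept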
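
For comparison, here is a minimal sketch of the same idea in base R, for cases where dplyr is not available. It is not part of the answer above. It rebuilds `dataset` from the question's dput, and the names `counts`, `top` and `result` are chosen only for illustration. Unlike the `filter()` step, `which.max()` returns only the first maximum, so a tie within a group yields a single row.

```r
# Rebuild the example data from the question's dput
dataset <- data.frame(
  a = c(500, 500, 500, 400, 400, 400, 300, 300, 300),
  b = factor(c("a", "b", "a", "b", "a", "b", "a", "b", "a"))
)

# Cross-tabulate: one row per group of a, one column per category of b
counts <- table(dataset$a, dataset$b)

# Column index of the most frequent category in each row
# (which.max() keeps only the first maximum, so ties collapse to one category)
top <- apply(counts, 1, which.max)

# Count of the winning category in each group
top_count <- as.numeric(counts[cbind(seq_len(nrow(counts)), top)])

result <- data.frame(
  a       = as.numeric(rownames(counts)),
  b       = colnames(counts)[top],
  count   = top_count,
  percent = top_count / as.numeric(rowSums(counts)),
  row.names = NULL
)
result
```

The rows come out sorted by a in ascending order because `table()` sorts its levels. If every tied category should be reported, the dplyr version above is the simpler choice.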