Tuesday 24 January 2017

r - How do I compare a particular group mean to each separate group?



I have a large dataset with many columns that I want to group the data by. Using dplyr and mutate, I am trying to create a new column that holds the mean for each individual group. I then want to see the difference between each of these means and the mean of one single category.



The mtcars dataset works as an example. How would I group the mtcars data by "cyl" and "gear" and then take the mean of "mpg" for each group? I then want to see the difference between every group's mean "mpg" and the mean "mpg" of the cars with "gear" == 5, with "cyl" left to vary.




I apologize if I'm asking the same question as others have, but I have not been able to find this specific question.



library(dplyr)

df <- mtcars
df2 <- df %>% group_by(cyl, gear) %>% mutate(mean_mpg = mean(mpg))

Answer



df2 <- df %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  mutate(comparison_mpg = mean_mpg[which(gear == 5)],
         mpg_diff = mean_mpg - comparison_mpg)


Result



# A tibble: 8 x 5
# Groups:   cyl [3]
    cyl  gear mean_mpg comparison_mpg mpg_diff
1    4.    3.     21.5           28.2  -6.70
2    4.    4.     26.9           28.2  -1.27
3    4.    5.     28.2           28.2   0.
4    6.    3.     19.8           19.7   0.0500
5    6.    4.     19.8           19.7   0.0500
6    6.    5.     19.7           19.7   0.
7    8.    3.     15.0           15.4  -0.350
8    8.    5.     15.4           15.4   0.
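
The answer works because summarise() drops the last grouping variable (gear), so the mutate() that follows runs within each cyl group, as the "Groups: cyl [3]" line in the result shows. If you would rather not depend on that implicit grouping, a join-based variant is one possible alternative. This is only a sketch, not part of the original answer; the names group_means, reference and result are made up for illustration.

```r
library(dplyr)

# Mean mpg for every cyl/gear combination, with all grouping removed
group_means <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  ungroup()

# Reference means: the gear == 5 rows, one per cyl
reference <- group_means %>%
  filter(gear == 5) %>%
  select(cyl, comparison_mpg = mean_mpg)

# Attach the reference mean to every row by cyl and take the difference
result <- group_means %>%
  left_join(reference, by = "cyl") %>%
  mutate(mpg_diff = mean_mpg - comparison_mpg)

print(result)
```

With the join, a cyl value that has no gear == 5 cars would get NA in comparison_mpg and mpg_diff, rather than depending on a subset that matches nothing.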

