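I'm trying to average two columns (Income and Expenditure) grouped by the names in column A, as shown below (to find the mean annual value of each column). I think I have a syntax error but I'm not sure where I went wrong; I've tried a few variations without success.
Here is a snippet of my table: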
GroupName Year Age Size Income Expenditure
yellow 2008 35 2.7 46704 42394
red 2008 29 2.6 23404 25270
yellow 2010 40 2.3 16747 21145
red 2012 34 2.8 31308 29855
blue 2008 31 3.0 49106 46561
green 2008 35 2.6 61674 52776
Here is my code:
NewGroupfactsDS <- NewGroupfactsDS %>%
group_by(GroupName) %>% summarize(AvgExpenditure = mean(Expenditure), summarize(AvgIncome = mean(Income))
Thanks in advance for any help :)
Answer 0 (score: 5)
Here are two approaches: the first uses across, and the second corrects the error in the question's code.
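library(dplyr)

# Approach 1: apply mean() to both columns with across()
NewGroupfactsDS <- NewGroupfactsDS %>%
  group_by(GroupName) %>%
  summarize(across(c(Expenditure, Income), mean))

# Approach 2: the question's code, with both means in a single summarize() call
NewGroupfactsDS <- NewGroupfactsDS %>%
  group_by(GroupName) %>%
  summarize(AvgExpenditure = mean(Expenditure),
            AvgIncome = mean(Income))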
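With across, the result columns keep their original names (Expenditure and Income). If you want the Avg prefix from the question, the .names argument of across can build it. Below is a minimal sketch, assuming dplyr 1.0.0 or later; the data frame is rebuilt from the rows in the question, and GroupAverages is just an illustrative name so the original data is not overwritten.

```r
library(dplyr)

# Sample rows copied from the question's table (Age and Size omitted for brevity)
NewGroupfactsDS <- data.frame(
  GroupName   = c("yellow", "red", "yellow", "red", "blue", "green"),
  Year        = c(2008, 2008, 2010, 2012, 2008, 2008),
  Income      = c(46704, 23404, 16747, 31308, 49106, 61674),
  Expenditure = c(42394, 25270, 21145, 29855, 46561, 52776)
)

# GroupAverages is a hypothetical name; "{.col}" is replaced by each column name,
# giving AvgExpenditure and AvgIncome
GroupAverages <- NewGroupfactsDS %>%
  group_by(GroupName) %>%
  summarize(across(c(Expenditure, Income), mean, .names = "Avg{.col}"))

print(GroupAverages)
```

Assigning to a new name also means you can rerun the pipeline without having lost the raw rows.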
Answer 1 (score: 4)
Just remove the second summarize. Also consider Rui Barradas's suggestion of across.
NewGroupfactsDS <- NewGroupfactsDS %>%
group_by(GroupName) %>%
summarize(AvgExpenditure = mean(Expenditure), AvgIncome = mean(Income))
Output:
GroupName AvgExpenditure AvgIncome
<chr> <dbl> <dbl>
1 blue 46561 49106
2 green 52776 61674
3 red 27562. 27356
4 yellow 31770. 31726.
Answer 2 (score: 0)
A base R answer:
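# Select Income and Expenditure by name (in the sample table, columns 4:5 are Size and Income)
aggregate(DF[, c("Income", "Expenditure")], list(GroupName = DF$GroupName), mean)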
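In the sample table, Income and Expenditure are the fifth and sixth columns, so naming the columns is safer than relying on their position. As a sketch of one way to do that in base R, the formula interface of aggregate can be used; DF here is rebuilt from the question's rows, and BaseAverages is only an illustrative name.

```r
# Sample rows copied from the question's table
DF <- data.frame(
  GroupName   = c("yellow", "red", "yellow", "red", "blue", "green"),
  Year        = c(2008, 2008, 2010, 2012, 2008, 2008),
  Age         = c(35, 29, 40, 34, 31, 35),
  Size        = c(2.7, 2.6, 2.3, 2.8, 3.0, 2.6),
  Income      = c(46704, 23404, 16747, 31308, 49106, 61674),
  Expenditure = c(42394, 25270, 21145, 29855, 46561, 52776)
)

# cbind() on the left-hand side names the columns to average;
# GroupName on the right-hand side is the grouping variable
BaseAverages <- aggregate(cbind(Income, Expenditure) ~ GroupName, data = DF, FUN = mean)

print(BaseAverages)
```

The result keeps the column names Income and Expenditure, so rename them afterwards if you want the Avg prefix.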