Name of member Allowance Type Expenditure Type Date Amount, £
Adam Afriyie Office running costs (IEP/AOE) Incidentals 07/03/2009 111.09
Adam Afriyie Office running costs (IEP/AOE) Incidentals 11/05/2009 111.09
Adam Afriyie Office running costs (IEP/AOE) Incidentals 11/05/2009 51.75
Adam Holloway Office running costs (IEP/AOE) Incidentals 10/01/2009 35
Adam Holloway Office running costs (IEP/AOE) Incidentals 10/01/2009 413.23
Adam Holloway Office running costs (IEP/AOE) Incidentals 10/01/2009 9.55
Adam Holloway Office running costs (IEP/AOE IT equipment 07/03/2009 890.01
Adam Holloway Communications Expenditure Publications 12/04/2009 1774
Adam Holloway Office running costs (IEP/AOE) Incidentals 12/08/2009 1.1
Adam Holloway Office running costs (IEP/AOE Incidentals 12/08/2009 64.31
Adam Holloway Office running costs (IEP/AOE) Incidentals 12/08/2009 64.31
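I'm new to R and new to programming. This is a subset of MPs' expenses over a particular period. I want to subtotal the expenses for each MP, and I used this code from another post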
> aggregate(cbind(bsent, breturn, tsent, treturn, csales) ~ yname, data = foo,
+ FUN = sum)
and edited it for my own situation.
My code:
expenses2 <- aggregate(cbind(Amount..Â.) ~ Name.of.member, data = expenses, FUN = sum)
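While this code does perform some kind of aggregation, the numbers don't match. For example, adding up Adam Afriyie's expenses by hand gives £273.93, but this code returns 12697. I have no idea what that number represents. Can someone help and tell me what I'm doing wrong?
Thanks in advance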
Answer 0: (score: 2)
Using only your name column and the last column, the amount:
df <- data.frame(name = c(rep("Adam Afriyie", 3), rep("Adam Holloway", 8)),
amount = c(111.09, 111.09, 51.75, 35,
413.23, 9.55, 890.01, 1774, 1.1, 64.31, 64.31)
)
Version 1
aggregate(df$amount, by = list(name = df$name), FUN = "sum")
Version 2
aggregate(amount ~ name, data = df, FUN = "sum")
Output:
1 Adam Afriyie 273.93
2 Adam Holloway 3251.51
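Answer 1: (score: 2)
I pulled that text into an editor, made valid header names, put back the tabs that had evidently been replaced by spaces, and read it into R to get this object: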
MPexp <- structure(list(Name_of_member = c("Adam Afriyie", "Adam Afriyie",
"Adam Afriyie", "Adam Holloway", "Adam Holloway", "Adam Holloway",
"Adam Holloway", "Adam Holloway", "Adam Holloway", "Adam Holloway",
"Adam Holloway"), Allowance_Type = c("Office running costs (IEP/AOE)",
"Office running costs (IEP/AOE)", "Office running costs (IEP/AOE)",
" Office running costs (IEP/AOE)", " Office running costs (IEP/AOE)",
" Office running costs (IEP/AOE)", " Office running costs (IEP/AOE",
" Communications Expenditure", " Office running costs (IEP/AOE)",
" Office running costs (IEP/AOE", " Office running costs (IEP/AOE)"
), Expenditure_Tyoe = c("Incidentals", "Incidentals", "Incidentals",
"Incidentals", "Incidentals", "Incidentals", "IT equipment",
"Publications", "Incidentals", "Incidentals", "Incidentals"),
Date = c("07/03/09", "11/05/09", "11/05/09", "10/01/09",
"10/01/09", "10/01/09", "07/03/09", "12/04/09", "12/08/09",
"12/08/09", "12/08/09"), Amount = c(111.09, 111.09, 51.75,
35, 413.23, 9.55, 890.01, 1774, 1.1, 64.31, 64.31)), .Names = c("Name_of_member",
"Allowance_Type", "Expenditure_Tyoe", "Date", "Amount"),
class = "data.frame", row.names = c(NA,
-11L))
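For anyone who wants to repeat the import step rather than paste the structure() call, here is a minimal sketch of what it might look like. The tab-separated text, the shortened three-row sample, and the names txt and MPexp_small are assumptions made for illustration, not the original file.

```r
# Hypothetical tab-separated excerpt with valid header names (three rows only)
txt <- "Name_of_member\tAllowance_Type\tExpenditure_Type\tDate\tAmount
Adam Afriyie\tOffice running costs (IEP/AOE)\tIncidentals\t07/03/2009\t111.09
Adam Afriyie\tOffice running costs (IEP/AOE)\tIncidentals\t11/05/2009\t111.09
Adam Afriyie\tOffice running costs (IEP/AOE)\tIncidentals\t11/05/2009\t51.75"

# read.delim() assumes a header row and tab separators;
# stringsAsFactors = FALSE keeps the text columns as character
MPexp_small <- read.delim(text = txt, stringsAsFactors = FALSE)

# Check that Amount was read as numeric before aggregating
str(MPexp_small)
```

If str() reports Amount as anything other than num, the column needs cleaning before it is summed.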
With that object, the aggregation should now produce the expected result:
> aggregate(MPexp$Amount, MPexp["Name_of_member"], sum)
Name_of_member x
1 Adam Afriyie 273.93
2 Adam Holloway 3251.51
Reading your question again, I realized you were using aggregate.formula, so this also works with that data:
> aggregate(Amount ~ Name_of_member, data=MPexp, FUN=sum)
Name_of_member Amount
1 Adam Afriyie 273.93
2 Adam Holloway 3251.51
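As for where a number like 12697 might come from: one plausible cause, and this is only an assumption since the original file isn't shown, is that the amount column was read in as a factor rather than as numeric (for example because of a stray £ sign or thousands separator). cbind() turns a factor into its integer level codes, so aggregate() would then be summing codes instead of pounds. A small sketch using a made-up data frame df_bad:

```r
# Hypothetical reconstruction: the amounts end up stored as a factor
df_bad <- data.frame(
  name = c(rep("Adam Afriyie", 3), rep("Adam Holloway", 8)),
  amount = factor(c("111.09", "111.09", "51.75", "35", "413.23", "9.55",
                    "890.01", "1774", "1.1", "64.31", "64.31"))
)

# cbind() on a factor yields its integer level codes,
# so this sums the codes, not the amounts
codes_sum <- aggregate(cbind(amount) ~ name, data = df_bad, FUN = sum)

# Convert via character to recover the real numbers, then aggregate
df_bad$amount_num <- as.numeric(as.character(df_bad$amount))
fixed <- aggregate(amount_num ~ name, data = df_bad, FUN = sum)

print(codes_sum)
print(fixed)
```

Running str(expenses) on the real data would show whether the amount column is a factor. If it is, any non-numeric characters have to be stripped before the as.numeric(as.character(...)) conversion.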
Answer 2: (score: 1)
Using plyr:
library(plyr)
#Using data from mropa's answer
> ddply(df, .(name), summarise, sum = sum(amount))
name sum
1 Adam Afriyie 273.93
2 Adam Holloway 3251.51