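I am trying to add up column 4 (child), column 5 (adult) and column 6 (elderly) and return the value for each country by year, ignoring column 3 (sex). Reading through various forums, I have not been able to work out how to combine these.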
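I was able to sum the three columns row-wise and store the result in a separate column, but I still need to combine the male and female rows for each country.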
Answer 0 (score: 1)
If you want to add the sum to the data frame, here are several options:
# with base R (1)
transform(dat, tuber.sum = ave(tuberculosiscases, country, year, FUN = sum))
# with base R (2)
dat$tuber.sum <- ave(dat$tuberculosiscases, dat$country, dat$year, FUN = sum)
# with the data.table package
library(data.table)
setDT(dat)[, tuber.sum:=sum(tuberculosiscases), by= .(country, year)]
# with the plyr package
library(plyr)
dat <- ddply(dat, .(country, year), transform, tuber.sum=sum(tuberculosiscases))
# with the dplyr package
library(dplyr)
dat <- dat %>%
  group_by(country, year) %>%
  mutate(tuber.sum = sum(tuberculosiscases))
All of these give:
> dat
       country year    sex child adult elderly tuberculosiscases tuber.sum
1: Afghanistan 1995   male    -1    -1      -1                -3        -3
2: Afghanistan 1996 female    -1    -1      -1                -3        -6
3: Afghanistan 1996   male    -1    -1      -1                -3        -6
4: Afghanistan 1997 female     5    96       1               102       128
5: Afghanistan 1997   male     0    26       0                26       128
6: Afghanistan 1998 female    45  1142      20              1207      1207
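The options above keep every row and repeat the group total on each of them. If one row per country and year is wanted instead, something like the following base R sketch might work. It assumes the column names shown in the output above; the data frame is rebuilt from those six rows only so the example is self-contained, and `totals` is a made-up name.

```r
# Sample rows copied from the printed output above (illustration only)
dat <- data.frame(
  country = "Afghanistan",
  year    = c(1995, 1996, 1996, 1997, 1997, 1998),
  sex     = c("male", "female", "male", "female", "male", "female"),
  child   = c(-1, -1, -1, 5, 0, 45),
  adult   = c(-1, -1, -1, 96, 26, 1142),
  elderly = c(-1, -1, -1, 1, 0, 20)
)

# Row-wise sum of the three age groups
dat$tuberculosiscases <- rowSums(dat[, c("child", "adult", "elderly")])

# Collapse to one row per country/year, summing over sex
totals <- aggregate(tuberculosiscases ~ country + year, data = dat, FUN = sum)
```

Here `totals` has four rows, one per year, and the `sex` column is dropped because it is not part of the formula.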
Answer 1 (score: 0)
If I understand your question correctly, and assuming the initial data.frame is named my_df, I would use aggregate:
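# naming the list elements gives the grouping columns readable names
# (otherwise they come out as Group.1 and Group.2)
aggdata <- aggregate(my_df[, c("child", "adult", "elderly")],
                     by = list(country = my_df$country, year = my_df$year),
                     FUN = sum, na.rm = TRUE)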
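Note that this call returns separate sums for child, adult and elderly. If a single combined total per country and year is wanted, one possible follow-up step is sketched below. The contents of `my_df` here are made up purely for illustration, and `total` is a hypothetical column name.

```r
# Made-up example data; only the column names follow the question
my_df <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2000, 2000, 2000),
  sex     = c("female", "male", "female", "male"),
  child   = c(1, 2, 3, NA),
  adult   = c(10, 20, 30, 40),
  elderly = c(5, 5, 5, 5)
)

# Sum each age group per country/year; na.rm = TRUE is passed on to sum()
aggdata <- aggregate(my_df[, c("child", "adult", "elderly")],
                     by = list(country = my_df$country, year = my_df$year),
                     FUN = sum, na.rm = TRUE)

# Add the three per-group sums into one total per country/year
aggdata$total <- rowSums(aggdata[, c("child", "adult", "elderly")])
```

Because `na.rm = TRUE` is applied inside each group sum, the missing child value for country B is skipped rather than turning the total into NA.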