Summing with multiple conditions in R

Time: 2016-02-14 19:34:00

Tags: r sum conditional

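I am trying to add up column 4 (child), column 5 (adult) and column 6 (elderly) and return the value for each country by year, regardless of column 3 (sex). After reading through various forums, I have not been able to put these together: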

http://domain:port/index

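I was able to sum the 3 columns row-wise and create a separate column using the following, but I still need to combine the male and female rows for each country: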

resources/static/index/index.html
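The row-wise step described above might look roughly like the sketch below. It assumes the data frame is called dat and that the summed column is named tuberculosiscases, as in the answers; the toy rows are copied from the sample output in the first answer, and the asker's own code may have differed.

```r
# Toy rows copied from the sample output shown in the first answer
dat <- data.frame(
  country = rep("Afghanistan", 6),
  year    = c(1995, 1996, 1996, 1997, 1997, 1998),
  sex     = c("male", "female", "male", "female", "male", "female"),
  child   = c(-1, -1, -1, 5, 0, 45),
  adult   = c(-1, -1, -1, 96, 26, 1142),
  elderly = c(-1, -1, -1, 1, 0, 20)
)

# Row-wise total of the three age-group columns (one value per row,
# so male and female rows are still separate at this point)
dat$tuberculosiscases <- rowSums(dat[, c("child", "adult", "elderly")])
```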

2 answers:

Answer 0: (score: 1)

If you want to add the sums to your data frame, here are a few options:

# with base R (1)
transform(dat, tuber.sum = ave(tuberculosiscases, country, year, FUN = sum))

# with base R (2)
dat$tuber.sum <- ave(dat$tuberculosiscases, dat$country, dat$year, FUN = sum)

# with the data.table package
library(data.table)
setDT(dat)[, tuber.sum:=sum(tuberculosiscases), by= .(country, year)]

# with the plyr package
library(plyr)
dat <- ddply(dat, .(country, year), transform, tuber.sum=sum(tuberculosiscases))

# with the dplyr package
library(dplyr)
dat <- dat %>% 
  group_by(country, year) %>% 
  mutate(tuber.sum=sum(tuberculosiscases))

All of which give:

> dat
       country year    sex child adult elderly tuberculosiscases tuber.sum
1: Afghanistan 1995   male    -1    -1      -1                -3        -3
2: Afghanistan 1996 female    -1    -1      -1                -3        -6
3: Afghanistan 1996   male    -1    -1      -1                -3        -6
4: Afghanistan 1997 female     5    96       1               102       128
5: Afghanistan 1997   male     0    26       0                26       128
6: Afghanistan 1998 female    45  1142      20              1207      1207
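The group total repeats on every row of a group because ave() returns a vector the same length as its input. A minimal sketch using only the tuberculosiscases and year values from the output above (the variable names cases, year and group_total are illustrative; country is omitted since every row here is Afghanistan):

```r
# Values copied from the tuberculosiscases and year columns above
cases <- c(-3, -3, -3, 102, 26, 1207)
year  <- c(1995, 1996, 1996, 1997, 1997, 1998)

# ave() applies FUN within each group and writes the result back
# to every position belonging to that group
group_total <- ave(cases, year, FUN = sum)
group_total
# -3 -6 -6 128 128 1207
```

The male and female rows are therefore kept, each carrying the combined total for its country and year.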

Answer 1: (score: 0)

If I understand your question correctly, and assuming your initial data.frame is named my_df, I would use aggregate:

 aggdata <- aggregate(my_df[, c("child", "adult", "elderly")],
                      by = list(my_df$country, my_df$year), FUN = sum, na.rm = TRUE)
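With an unnamed by list, aggregate() labels the grouping columns Group.1 and Group.2. The sketch below is one possible variation: it names the list elements so the result keeps country and year as column names, then adds a combined total. The toy my_df is built from the sample rows shown in the other answer, and the total column name is an assumption.

```r
# Toy data built from the sample rows shown in the other answer
my_df <- data.frame(
  country = rep("Afghanistan", 6),
  year    = c(1995, 1996, 1996, 1997, 1997, 1998),
  sex     = c("male", "female", "male", "female", "male", "female"),
  child   = c(-1, -1, -1, 5, 0, 45),
  adult   = c(-1, -1, -1, 96, 26, 1142),
  elderly = c(-1, -1, -1, 1, 0, 20)
)

# Naming the elements of `by` gives readable grouping columns
aggdata <- aggregate(my_df[, c("child", "adult", "elderly")],
                     by = list(country = my_df$country, year = my_df$year),
                     FUN = sum, na.rm = TRUE)

# One row per country and year; add the three age groups together
aggdata$total <- rowSums(aggdata[, c("child", "adult", "elderly")])
```

Unlike the ave() approach, this collapses the male and female rows into a single row per country and year.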