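Does anyone know how to sum across the rows of selected columns under a specific condition?
For example, I have five columns, with rows ordered by year from 2000 to 2008. I only need to sum the rows where Year < 2006 and add a new total column (containing NAs for the other years, since they are not involved).
I don't think group_by will work, because I don't need to sum by group.
My data is: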
A <- c(1,2,3,4,5,6,7,8,9,10)
B <- c(1,2,3,4,5,6,7,8,9,10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.frame(A, B, Year)
I would like to obtain something like this:
TotalColumn Year
2 2000
4 2001
6 2000
8 2001
10 2003
12 2004
14 2005
NA 2006
NA 2007
NA 2008
Answer 0 (score: 2)
ifelse may be a good option:
A <- c(1,2,3,4,5,6,7,8,9,10)
B <- c(1,2,3,4,5,6,7,8,9,10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- as.data.frame(cbind(rep(NA, each = length(A)), Year))
colnames(dta) <- c("TotalColumn", "Year")
dta$TotalColumn <- ifelse(dta$Year < 2006, A + B, NA)
dta
TotalColumn Year
1 2 2000
2 4 2001
3 6 2000
4 8 2001
5 10 2003
6 12 2004
7 14 2005
8 NA 2006
9 NA 2007
10 NA 2008
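Since the question mentions five columns, writing out A + B + ... by hand becomes tedious. Below is a minimal sketch of the same ifelse idea using rowSums over a vector of column names. The variable `cols` is a name introduced here for illustration, and only A and B exist in the sample data.

```r
A <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
B <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.frame(A, B, Year)

# Hypothetical helper variable: the columns to add up (extend as needed)
cols <- c("A", "B")

# Row-wise sum of the selected columns where Year < 2006, NA elsewhere
dta$TotalColumn <- ifelse(dta$Year < 2006, rowSums(dta[, cols]), NA)

dta[, c("TotalColumn", "Year")]
```

Note that rowSums returns NA for a row if any of the selected columns is NA, unless na.rm = TRUE is passed.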
Answer 1 (score: 2)
Using data.table (updated per Frank's comment):
library(data.table)
A <- c(1,2,3,4,5,6,7,8,9,10)
B <- c(1,2,3,4,5,6,7,8,9,10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.table(A, B, Year)
dta[Year < 2006, TotalColumn := A+B][, .(TotalColumn, Year)]
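Alternatively, following Frank's suggestion, you can edit dta itself (dropping A and B by reference) by replacing the last line with:
dta[Year < 2006, TotalColumn := A+B][, c("A", "B") := NULL]
Result: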
TotalColumn Year
1: 2 2000
2: 4 2001
3: 6 2000
4: 8 2001
5: 10 2003
6: 12 2004
7: 14 2005
8: NA 2006
9: NA 2007
10: NA 2008
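With more than two columns to add up, the same conditional assignment can probably be written with .SDcols instead of spelling out A+B. This is only a sketch under that assumption. `cols` is an illustrative name not used in the answer above.

```r
library(data.table)

A <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
B <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.table(A, B, Year)

# Hypothetical helper variable: the columns to add up
cols <- c("A", "B")

# Assign by reference only on rows with Year < 2006;
# rows outside the filter are left as NA in the new column
dta[Year < 2006, TotalColumn := rowSums(.SD), .SDcols = cols]

dta[, .(TotalColumn, Year)]
```

Because := is applied only to the filtered rows, no explicit ifelse is needed to produce the NA values.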
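Answer 2 (score: 0)
Try the by_row function (originally in purrr, since moved to the purrrlyr package):
library(dplyr)  # provides %>%, filter, select and full_join
A <- c(1,2,3,4,5,6,7,8,9,10)
B <- c(1,2,3,4,5,6,7,8,9,10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.frame(A,B, Year)
# Sum A and B row by row, keeping only the years before 2006
Total_col <- dta %>%
filter(Year < 2006) %>%
select(A,B) %>%
purrrlyr::by_row(sum, .collate = "cols", .to = "Total_Col")
# Reattach the matching Year values
yr_total_Col <- dta %>% filter(Year < 2006) %>% select(Year)
Total_col <- cbind(Total_col,yr_total_Col)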
This should give you the following:
dta.x <- full_join(dta,Total_col) %>% select(Year,Total_Col)
# Year Total_Col
# 1 2000 2
# 2 2001 4
# 3 2000 6
# 4 2001 8
# 5 2003 10
# 6 2004 12
# 7 2005 14
# 8 2006 NA
# 9 2007 NA
# 10 2008 NA
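The same table can likely be produced with dplyr alone, without the filter/join round trip. The following is a sketch of that alternative, not the approach used in the answer above. `res` is an illustrative name.

```r
library(dplyr)

A <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
B <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Year <- c(2000, 2001, 2000, 2001, 2003, 2004, 2005, 2006, 2007, 2008)
dta <- data.frame(A, B, Year)

# if_else is type-strict, so the "else" value is a numeric NA
res <- dta %>%
  mutate(Total_Col = if_else(Year < 2006, A + B, NA_real_)) %>%
  select(Year, Total_Col)

res
```

This keeps all ten rows in their original order, so nothing has to be joined back afterwards.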