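I have a data frame like the one below and need to create a sum column. For each row, the sum should start from the month given in Action On.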
Apr May Jun Jul Aug Sep Oct Nov Action On
4.0 2.0 3.0 2.0 1.5 1.0 0.5 4 July
3.0 4.0 1.0 7.0 2.0 3.0 1.0 2 August
3.0 3.0 1.0 0.5 0.5 1.0 1.0 6.0 September
1.0 1.0 0.5 0.3 0.3 0.5 0.5 2.0 October
0.5 1.0 0.4 0.1 0.1 0.3 0.3 3.0 July
0.4 3.0 0.0 0.2 0.2 0.1 0.1 9.0 September
1.3 5.0 0.3 0.4 0.4 0.2 0.2 7.0 November
2.2 7.0 0.6 1.0 0.6 0.4 0.4 1.2 July
Please suggest the best code for this. I created a column that converts the month to a number and used a for loop as follows:
for(rowidx in 1: nrow(conshead)) {
startcol=conshead[rowidx,"b"]
conshead[rowidx,"sum"]=sum(conshead[rowidx,startcol:8], na.rm = TRUE)
}
I still get this error:
Error in startcol:8 : NA/NaN argument
Please share better code.
Answer 0 (score: 0)
Is this what you are after?
> m1 <- data.frame(Jul = c(1,4,6),
+ Aug = c(3,5,9),
+ ActionOn = c("July", "August", "July"))
>
> m1
Jul Aug ActionOn
1 1 3 July
2 4 5 August
3 6 9 July
>
> m1$sumofinterest <- colSums(m1[,match(substr(m1$ActionOn, 1, 3), colnames(m1))])
> m1
Jul Aug ActionOn sumofinterest
1 1 3 July 11
2 4 5 August 17
3 6 9 July 11
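The same match(substr(...)) lookup can also drive a per-row sum, which seems to be what the question asks for: add up each row from its action month through Nov. The sketch below is only one way to do it in base R. It assumes the last column is named ActionOn (no space), and month_cols and start are illustrative names that do not appear in the question. The error in the question is what R raises when startcol is NA, most likely because the month-to-number column held NA for at least one row, so the sketch checks for that first.

```r
# Data copied from the question; the header ActionOn (no space) is an assumption.
conshead <- read.table(text = "Apr May Jun Jul Aug Sep Oct Nov ActionOn
4.0 2.0 3.0 2.0 1.5 1.0 0.5 4 July
3.0 4.0 1.0 7.0 2.0 3.0 1.0 2 August
3.0 3.0 1.0 0.5 0.5 1.0 1.0 6.0 September
1.0 1.0 0.5 0.3 0.3 0.5 0.5 2.0 October
0.5 1.0 0.4 0.1 0.1 0.3 0.3 3.0 July
0.4 3.0 0.0 0.2 0.2 0.1 0.1 9.0 September
1.3 5.0 0.3 0.4 0.4 0.2 0.2 7.0 November
2.2 7.0 0.6 1.0 0.6 0.4 0.4 1.2 July",
  header = TRUE, stringsAsFactors = FALSE)

month_cols <- c("Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov")

# Position of each row's action month among the month columns (July -> "Jul" -> 4).
start <- match(substr(conshead$ActionOn, 1, 3), month_cols)

# A failed match gives NA, and NA:8 is what raises "NA/NaN argument".
stopifnot(!anyNA(start))

# Sum each row from its start column through the last month column.
conshead$sum <- vapply(seq_len(nrow(conshead)), function(i) {
  cols <- month_cols[start[i]:length(month_cols)]
  sum(unlist(conshead[i, cols]), na.rm = TRUE)
}, numeric(1))

print(conshead$sum)
# 9.0 8.0 8.0 2.5 3.8 9.2 7.0 3.6
```

Looking up columns by name rather than by a stored number keeps the result correct even if the column order changes.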
Answer 1 (score: 0)
You can also try dplyr:
library(tidyverse)
# reading your data
df <- read_table("Apr May Jun Jul Aug Sep Oct Nov Action On
4.0 2.0 3.0 2.0 1.5 1.0 0.5 4 July
3.0 4.0 1.0 7.0 2.0 3.0 1.0 2 August
3.0 3.0 1.0 0.5 0.5 1.0 1.0 6.0 September
1.0 1.0 0.5 0.3 0.3 0.5 0.5 2.0 October
0.5 1.0 0.4 0.1 0.1 0.3 0.3 3.0 July
0.4 3.0 0.0 0.2 0.2 0.1 0.1 9.0 September
1.3 5.0 0.3 0.4 0.4 0.2 0.2 7.0 November
2.2 7.0 0.6 1.0 0.6 0.4 0.4 1.2 July")
The code is:
df %>%
select_if(is.numeric) %>%
mutate(SUM=colSums(.)) %>%
bind_cols(df %>% select_if(is.character))
Output:
# A tibble: 8 x 10
# Apr May Jun Jul Aug Sep Oct Nov SUM `Action On`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
# 1 4.00 2. 3.00 2.00 1.50 1.00 0.500 4.00 15.4 July
# 2 3.00 4. 1.00 7.00 2.00 3.00 1.00 2.00 26.0 August
# 3 3.00 3. 1.00 0.500 0.500 1.00 1.00 6.00 6.80 September
# 4 1.00 1. 0.500 0.300 0.300 0.500 0.500 2.00 11.5 October
# 5 0.500 1. 0.400 0.100 0.100 0.300 0.300 3.00 5.60 July
# 6 0.400 3. 0. 0.200 0.200 0.100 0.100 9.00 6.50 September
# 7 1.30 5. 0.300 0.400 0.400 0.200 0.200 7.00 4.00 November
# 8 2.20 7. 0.600 1.00 0.600 0.400 0.400 1.20 34.2 July
Grouped by month, it would be:
df %>%
select_if(is.numeric) %>%
mutate(SUM=colSums(.)) %>%
bind_cols(df %>% select_if(is.character)) %>%
group_by(`Action On`) %>%
summarise(SUM_per_month=sum(SUM))
Output:
# A tibble: 5 x 2
# `Action On` SUM_per_month
# <chr> <dbl>
# 1 August 26.0
# 2 July 55.2
# 3 November 4.00
# 4 October 11.5
# 5 September 13.3
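One thing to note about the code above: colSums() returns one total per column, so SUM holds the Apr to Nov column totals. They only fit into the data frame because it happens to have 8 numeric columns and 8 rows. If the goal is the per-row total from the action month onward, a vectorized alternative is to mask out the earlier months and use rowSums(). This is a sketch in base R rather than dplyr, again assuming the column is named ActionOn. The names month_cols, start, mask and by_month are illustrative.

```r
# Data copied from the question; the header ActionOn (no space) is an assumption.
df <- read.table(text = "Apr May Jun Jul Aug Sep Oct Nov ActionOn
4.0 2.0 3.0 2.0 1.5 1.0 0.5 4 July
3.0 4.0 1.0 7.0 2.0 3.0 1.0 2 August
3.0 3.0 1.0 0.5 0.5 1.0 1.0 6.0 September
1.0 1.0 0.5 0.3 0.3 0.5 0.5 2.0 October
0.5 1.0 0.4 0.1 0.1 0.3 0.3 3.0 July
0.4 3.0 0.0 0.2 0.2 0.1 0.1 9.0 September
1.3 5.0 0.3 0.4 0.4 0.2 0.2 7.0 November
2.2 7.0 0.6 1.0 0.6 0.4 0.4 1.2 July",
  header = TRUE, stringsAsFactors = FALSE)

month_cols <- setdiff(names(df), "ActionOn")
vals <- as.matrix(df[month_cols])

# Column index of each row's action month; stop early if any month fails to match.
start <- match(substr(df$ActionOn, 1, 3), month_cols)
stopifnot(!anyNA(start))

# mask[i, j] is TRUE when column j is at or after row i's action month.
mask <- outer(start, seq_along(month_cols), "<=")

# Multiplying by the mask zeroes the earlier months before summing each row.
df$SUM <- rowSums(vals * mask, na.rm = TRUE)

# Total per action month, the base R counterpart of group_by() + summarise().
by_month <- aggregate(SUM ~ ActionOn, data = df, FUN = sum)
print(by_month)
```

With this data the July rows contribute 9.0, 3.8 and 3.6, so the July total is 16.4.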