I have a data frame like the one below.
df <- data.frame(mnth = c("jan", "feb", "feb", "mar", "mar",
"mar", "apr", "apr", "apr", "apr",
"may", "may", "may", "may", "may"),
n = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5),
value = c(5, 1, 3, 2, 8, 0, 6, 0, 2, 7, 2, 1, 4, 2, 6))
For each value of the 'n' column, I want to add up the corresponding numbers in the 'value' column.
In this case, the answer should be:
16, 12, 6, 9, 6
16 = 5 + 1 + 2 + 6 + 2 # all rows where 'n' = 1
12 = 3 + 8 + 0 + 1 # all rows where 'n' = 2
6 = 0 + 2 + 4 # all rows where 'n' = 3
9 = 7 + 2 # all rows where 'n' = 4
6 # all rows where 'n' = 5
How do I write a for loop in R to add up these numbers?
Answer 0: (score: 0)
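I wouldn't use a for loop for this, because it can be done in a single line (perhaps slightly cryptic, but still a single line):
df$N <- c(16, 12, 6, 9, 6)[df$mnth]
Before running it, you need to convert mnth to a factor with its levels in calendar order, so that the factor's integer codes line up with the positions in the vector:
df$mnth <- factor(df$mnth, levels = c("jan", "feb", "mar", "apr", "may"))
Result:
> df
mnth n value N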
1 jan 1 5 16
2 feb 1 1 12
3 feb 2 3 12
4 mar 1 2 6
5 mar 2 8 6
6 mar 3 0 6
7 apr 1 6 9
8 apr 2 0 9
9 apr 3 2 9
10 apr 4 7 9
11 may 1 2 6
12 may 2 1 6
13 may 3 4 6
14 may 4 2 6
15 may 5 6 6
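As a side note on why the one-liner works: indexing a vector with a factor uses the factor's underlying integer codes, not its labels. A minimal sketch with a shortened, made-up month vector (the names mnth_demo, totals, codes and picked are only for illustration):

```r
# Hypothetical shortened example; all names here are illustration-only
mnth_demo <- factor(c("jan", "feb", "feb", "mar"),
                    levels = c("jan", "feb", "mar"))
totals <- c(16, 12, 6)          # position 1 = jan, 2 = feb, 3 = mar

codes <- as.integer(mnth_demo)  # integer codes follow the level order
picked <- totals[mnth_demo]     # indexes by those codes, same as totals[codes]
```

Without the explicit levels, the default level order is alphabetical, so the codes would no longer match the positions in totals.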
An equivalent with a for loop could be:
for (i in seq_len(nrow(df))) {
  # look up the hard-coded total for the month in row i
  df$N[i] <- switch(as.character(df$mnth[i]),
                    "apr" = 9,
                    "feb" = 12,
                    "jan" = 16,
                    "mar" = 6,
                    "may" = 6)
}
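The loop above hard-codes the totals. If the goal is to compute the per-'n' sums from the data with a for loop, as the question asks, one possible sketch is below. It uses a copy of the question's data under the assumed name dat, and sums is likewise an illustration-only name:

```r
# Copy of the question's 'n' and 'value' columns (dat is a hypothetical name)
dat <- data.frame(
  n     = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5),
  value = c(5, 1, 3, 2, 8, 0, 6, 0, 2, 7, 2, 1, 4, 2, 6)
)

groups <- sort(unique(dat$n))        # distinct values of 'n'
sums <- numeric(length(groups))      # one running total per group, starting at 0
names(sums) <- groups                # name each total after its 'n' value

for (i in seq_len(nrow(dat))) {
  key <- as.character(dat$n[i])      # look up the total by name, not by position
  sums[key] <- sums[key] + dat$value[i]
}

sums
```

This should reproduce the totals listed in the question, although a vectorised grouped sum is usually shorter and faster than an explicit loop.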
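Answer 1: (score: 0)
I agree with Sab about using data.table. I think your expected output may contain a typo, so I have included a few different options in the example below: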
library(data.table)
df <- data.frame(mnth = c("jan", "feb", "feb", "mar", "mar",
"mar", "apr", "apr", "apr", "apr",
"may", "may", "may", "may", "may"),
n = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5),
value = c(5, 1, 3, 2, 8, 0, 6, 0, 2, 7, 2, 1, 4, 2, 6))
setDT(df) # converts data.frame to data.table
df[,.(sum_n = sum(n), # adds up the 'n' column
sum_value = sum(value), # adds up the 'value' column
count_row = .N), by=mnth] # counts the number of rows for each value of 'mnth'
This gives you the following result:
mnth sum_n sum_value count_row
1: jan 1 5 1
2: feb 3 4 2
3: mar 6 10 3
4: apr 10 15 4
5: may 15 15 5
Edit
After the poster's clarification, this is the working code:
df[,.(sum_value = sum(value)), by = .(n)]
This gives the following result:
> df[,.(sum_value = sum(value)), by = .(n)]
n sum_value
1: 1 16
2: 2 12
3: 3 6
4: 4 9
5: 5 6
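For comparison, the same grouped sum can also be sketched in base R, without data.table. This assumes a plain data frame copy of the question's data; the names dat, by_n and agg are hypothetical:

```r
# Copy of the question's 'n' and 'value' columns (dat is a hypothetical name)
dat <- data.frame(
  n     = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5),
  value = c(5, 1, 3, 2, 8, 0, 6, 0, 2, 7, 2, 1, 4, 2, 6)
)

# tapply returns a named vector: one sum of 'value' per distinct 'n'
by_n <- tapply(dat$value, dat$n, sum)

# aggregate returns a data frame with columns 'n' and 'value'
agg <- aggregate(value ~ n, data = dat, FUN = sum)
```

tapply is handy when a named vector is enough; aggregate keeps the result as a data frame, which is closer to the data.table output above.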
Answer 2: (score: 0)
Here is a solution using data.table and merge; it is quite simple:
library(data.table)
dt <- as.data.table(df)
# lookup table of hard-coded totals, one per month
dt2 <- data.table(mnth = c('jan', 'feb', 'mar', 'apr', 'may'),
                  N = c(16, 12, 6, 9, 6))
> merge(dt, dt2, by = 'mnth', all = TRUE)
mnth n value N
1: apr 1 6 9
2: apr 2 0 9
3: apr 3 2 9
4: apr 4 7 9
5: feb 1 1 12
6: feb 2 3 12
7: jan 1 5 16
8: mar 1 2 6
9: mar 2 8 6
10: mar 3 0 6
11: may 1 2 6
12: may 2 1 6
13: may 3 4 6
14: may 4 2 6
15: may 5 6 6
If you only want the observation counts and the column sums, you can use the by argument of data.table:
> dt[, .(nsum = sum(n), valsum = sum(value), obs = .N), by = mnth]
mnth nsum valsum obs
1: jan 1 5 1
2: feb 3 4 2
3: mar 6 10 3
4: apr 10 15 4
5: may 15 15 5
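If the per-'n' totals from the question are wanted as a column on every row, the lookup table does not have to be typed in by hand. A rough sketch, assuming the data.table package is installed; the names dt_demo, totals, merged and n_total are made up for this example:

```r
library(data.table)

# Copy of the question's data (dt_demo is a hypothetical name)
dt_demo <- data.table(
  mnth  = c("jan", "feb", "feb", "mar", "mar",
            "mar", "apr", "apr", "apr", "apr",
            "may", "may", "may", "may", "may"),
  n     = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5),
  value = c(5, 1, 3, 2, 8, 0, 6, 0, 2, 7, 2, 1, 4, 2, 6)
)

# Option 1: compute the totals from the data, then merge them back by 'n'
totals <- dt_demo[, .(N = sum(value)), by = n]
merged <- merge(dt_demo, totals, by = "n")

# Option 2: add the per-group sum in place with :=, no merge needed
dt_demo[, n_total := sum(value), by = n]
```

Option 2 modifies dt_demo by reference, so it avoids building and joining a second table.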