For the example data frame:
df1 <- structure(list(place = c("a", "a", "b", "b", "b", "b", "c", "c",
"c", "d", "d"), animal = c("cat", "bear", "cat", "bear", "pig",
"goat", "cat", "bear", "goat", "goat", "bear"), number = c(5,
6, 7, 4, 5, 6, 8, 5, 3, 7, 4)), .Names = c("place", "animal",
"number"), row.names = c(NA, -11L), spec = structure(list(cols = structure(list(
place = structure(list(), class = c("collector_character",
"collector")), animal = structure(list(), class = c("collector_character",
"collector")), number = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("place", "animal", "number")),
default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"), class = c("tbl_df",
"tbl", "data.frame"))
I want to create a variable 'sum' that totals the 'number' column for each 'place' (regardless of animal) and add it to the data frame.
The following command:
df1$sum <- aggregate(df1$number, by=list(Category=df1$place), FUN=sum)
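...attempts the sum but fails, because aggregate() returns only one row per place (4 rows) rather than one value per row of the data (11 rows), hence this error: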
Error in `$<-.data.frame`(`*tmp*`, sum, value = list(Category = c("a", :
replacement has 4 rows, data has 11
Any ideas on how to add this extra column to my data frame?
Answer 0 (score: 1)
Since you have a tibble, here is a dplyr solution first. A base R version follows.
Using dplyr:
library(dplyr)

# Sum 'number' within each place and repeat the total on every row of that place
df1 %>%
  group_by(place) %>%
  mutate(sum_num = sum(number))
# A tibble: 11 x 4
# Groups:   place [4]
   place animal number sum_num
   <chr> <chr>   <dbl>   <dbl>
 1 a     cat         5      11
 2 a     bear        6      11
 3 b     cat         7      22
 4 b     bear        4      22
 5 b     pig         5      22
 6 b     goat        6      22
 7 c     cat         8      16
 8 c     bear        5      16
 9 c     goat        3      16
10 d     goat        7      11
11 d     bear        4      11
Using base R:
df1$sum_num <- ave(df1$number, df1$place, FUN = sum)
# A tibble: 11 x 4
   place animal number sum_num
   <chr> <chr>   <dbl>   <dbl>
 1 a     cat         5      11
 2 a     bear        6      11
 3 b     cat         7      22
 4 b     bear        4      22
 5 b     pig         5      22
 6 b     goat        6      22
 7 c     cat         8      16
 8 c     bear        5      16
 9 c     goat        3      16
10 d     goat        7      11
11 d     bear        4      11
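If you would rather keep the aggregate() call from the question, one possible route is to join its 4-row result back onto the 11 original rows with merge(). The sketch below is only an illustration under a few assumptions: it rebuilds the data as a plain data.frame rather than a tibble, and the names totals and df2 are made up for this example (sum_num simply mirrors the column name used above).

```r
# Rebuild the question's data as a plain data.frame (assumption: the tibble
# attributes from the dput() output are not needed for this illustration).
df1 <- data.frame(
  place  = c("a", "a", "b", "b", "b", "b", "c", "c", "c", "d", "d"),
  animal = c("cat", "bear", "cat", "bear", "pig", "goat",
             "cat", "bear", "goat", "goat", "bear"),
  number = c(5, 6, 7, 4, 5, 6, 8, 5, 3, 7, 4),
  stringsAsFactors = FALSE
)

# aggregate() gives one row per place (4 rows), which is why it cannot be
# assigned directly to a column of the 11-row data frame.
totals <- aggregate(number ~ place, data = df1, FUN = sum)
names(totals)[names(totals) == "number"] <- "sum_num"

# Join the per-place totals back onto every original row.
# merge() sorts by the key by default, so the row order may differ from df1.
df2 <- merge(df1, totals, by = "place")
```

Compared with ave(), this takes an extra step and does not guarantee the original row order, so ave() or the dplyr version is probably the simpler choice when you only need the one column.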