I have the following data, dfs_alltasks:
by_hour task
1 0 Apple Receiving
2 0 Apple Receiving
3 0 Orange Receiving
4 0 Banana Receiving
5 0 Banana Receiving
6 0 Orange Receiving
7 1 Orange Receiving
8 1 Banana Receiving
9 1 Banana Receiving
10 1 Banana Receiving
11 1 Banana Receiving
12 1 Banana Receiving
13 1 Orange Receiving
14 2 Banana Receiving
15 3 Banana Receiving
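I would like to group by the "by_hour" column and, at the same time, summarise and return the number of times each task occurs within a group. I should get something like this: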
by_hour task count
1 0 Apple Receiving 2
2 0 Orange Receiving 2
3 0 Banana Receiving 2
4 1 Orange Receiving 2
5 1 Banana Receiving 5
6 2 Banana Receiving 1
7 3 Banana Receiving 1
I tried: dfs_alltasks %>% group_by(by_hour) %>% summarise_all(no_rows = length(task))
But I get the error "Error in list2(...) : object 'task' not found".
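Answer 0 (score: 3)
You don't need group_by() here: count() groups by the columns you pass it and counts the rows in each group.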
library(tidyverse)
df_example <-
structure(list(
by_hour = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
1, 2, 3),
task = c(
"Apple Remaining",
"Apple Remaining",
"Orange Remaining",
"Banana Remaining",
"Banana Remaining",
"Orange Remaining",
"Orange Remaining",
"Banana Remaining",
"Banana Remaining",
"Banana Remaining",
"Banana Remaining",
"Banana Remaining",
"Orange Remaining",
"Banana Remaining",
"Banana Remaining"
)
),
class = "data.frame",
row.names = c(NA, -15L))
df_example %>%
count(by_hour,task)
#> by_hour task n
#> 1 0 Apple Remaining 2
#> 2 0 Banana Remaining 2
#> 3 0 Orange Remaining 2
#> 4 1 Banana Remaining 5
#> 5 1 Orange Remaining 2
#> 6 2 Banana Remaining 1
#> 7 3 Banana Remaining 1
Created on 2020-06-06 by the reprex package (v0.3.0)
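The question asks for a column called count rather than n. As a minimal sketch, assuming a dplyr version in which count() accepts the name argument, the output column can be named directly. The data frame name tasks is made up for this illustration and uses the task labels from the question.

```r
library(dplyr)

# The 15 rows from the question (the name `tasks` is hypothetical)
tasks <- data.frame(
  by_hour = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 3),
  task = c("Apple Receiving", "Apple Receiving", "Orange Receiving",
           "Banana Receiving", "Banana Receiving", "Orange Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving",
           "Banana Receiving", "Banana Receiving", "Banana Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving")
)

# count() groups by both columns; `name` sets the name of the count column
result <- tasks %>%
  count(by_hour, task, name = "count")

print(result)
```

Within each hour the rows come back sorted by task, so the order can differ from the order of first appearance shown in the question.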
Answer 1 (score: 1)
Try this:
library(tibble)
library(dplyr)
data <- tibble::tribble(
~by_hour, ~task,
0 , "Apple Receiving",
0 , "Apple Receiving",
0 , "Orange Receiving",
0 , "Banana Receiving",
0 , "Banana Receiving",
0 , "Orange Receiving",
1 , "Orange Receiving",
1 , "Banana Receiving",
1 , "Banana Receiving",
1 , "Banana Receiving",
1 , "Banana Receiving",
1 , "Banana Receiving",
1 , "Orange Receiving",
2 , "Banana Receiving",
3 , "Banana Receiving")
data %>% group_by(by_hour,task) %>% summarize(count=n()) %>% ungroup()
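If dplyr 1.0.0 or later is available, the trailing ungroup() can probably be folded into summarise() through its .groups argument. The sketch below makes that assumption; the data frame name tasks is hypothetical and holds the same rows as above.

```r
library(dplyr)

# Same 15 rows as in the question (the name `tasks` is hypothetical)
tasks <- data.frame(
  by_hour = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 3),
  task = c("Apple Receiving", "Apple Receiving", "Orange Receiving",
           "Banana Receiving", "Banana Receiving", "Orange Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving",
           "Banana Receiving", "Banana Receiving", "Banana Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving")
)

# .groups = "drop" returns an ungrouped result, so no ungroup() is needed
result <- tasks %>%
  group_by(by_hour, task) %>%
  summarise(count = n(), .groups = "drop")

print(result)
```

Setting .groups explicitly should also silence the message that summarise() prints about how the output is regrouped.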
Answer 2 (score: 1)
Please consider using dput() to share your data:
df <- structure(list(by_hour = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
1, 2, 3), task = c("Apple Remaining", "Apple Remaining", "Orange Remaining",
"Banana Remaining", "Banana Remaining", "Orange Remaining", "Orange Remaining",
"Banana Remaining", "Banana Remaining", "Banana Remaining", "Banana Remaining",
"Banana Remaining", "Orange Remaining", "Banana Remaining", "Banana Remaining"
)), class = "data.frame", row.names = c(NA, -15L))
You can use the dplyr package and group_by() on both variables.
library(dplyr)
df %>%
group_by(by_hour, task) %>%
count %>%
ungroup
Result
by_hour task n
<dbl> <chr> <int>
1 0 Apple Remaining 2
2 0 Banana Remaining 2
3 0 Orange Remaining 2
4 1 Banana Remaining 5
5 1 Orange Remaining 2
6 2 Banana Remaining 1
7 3 Banana Remaining 1
Answer 3 (score: 1)
We can also use data.table:
library(data.table)
setDT(df)[, .(n = .N), .(by_hour, task)]
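For completeness, here is a self-contained sketch of the data.table approach that names the column count as in the question. The object name dt is made up for this example. With by, data.table keeps groups in order of first appearance, so the rows should come out in the order the question shows.

```r
library(data.table)

# The 15 rows from the question as a data.table (the name `dt` is hypothetical)
dt <- data.table(
  by_hour = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 3),
  task = c("Apple Receiving", "Apple Receiving", "Orange Receiving",
           "Banana Receiving", "Banana Receiving", "Orange Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving",
           "Banana Receiving", "Banana Receiving", "Banana Receiving",
           "Orange Receiving", "Banana Receiving", "Banana Receiving")
)

# .N is the number of rows in each (by_hour, task) group
result <- dt[, .(count = .N), by = .(by_hour, task)]

print(result)
```

Using keyby instead of by would sort the result by by_hour and task, matching the ordering of the dplyr answers.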