I have a data frame (df) with three columns, as shown below.
Structure:
id id1 age
A1 a1 32
A1 a2 45
A1 a3 45
A1 a4 12
A2 b1 15
A2 b5 34
A2 b64 17
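A minimal sketch for rebuilding this data so the code in the thread can be run. It assumes id and id1 are character columns and age is numeric; the question does not state the column types.

```r
# Rebuild the example data (column types are an assumption)
df <- data.frame(
  id  = c("A1", "A1", "A1", "A1", "A2", "A2", "A2"),
  id1 = c("a1", "a2", "a3", "a4", "b1", "b5", "b64"),
  age = c(32, 45, 45, 12, 15, 34, 17),
  stringsAsFactors = FALSE
)
```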
Expected output:
id count count1
A1 4 1
A2 3 2
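Logic: count is the number of rows for each id, and count1 is the number of those rows with age below 21.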
Current code:
library(dplyr)
df_summarized <- df %>%
  group_by(id) %>%
  summarise(count = n(), count1 = count(age < 21))
Problem:
Error: no applicable method for 'group_by_' applied to an object of class "logical"
Answer 0 (score: 4)
We need sum here rather than count:
df %>%
  group_by(id) %>%
  summarise(count = n(), count1 = sum(age < 21))
# A tibble: 2 × 3
# id count count1
# <chr> <int> <int>
#1 A1 4 1
#2 A2 3 2
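count works on a data.frame or tbl_df, not on a logical vector inside summarise; sum of the logical vector age < 21 counts the rows where the condition is TRUE.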
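As a small sketch of that difference, assuming the df from the question (rebuilt here so the block runs on its own): dplyr's count() takes the whole table and tallies rows per group, while sum() on a logical vector adds up its TRUE values. The names per_id and under_21 are only illustrative.

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
df <- data.frame(
  id  = c("A1", "A1", "A1", "A1", "A2", "A2", "A2"),
  id1 = c("a1", "a2", "a3", "a4", "b1", "b5", "b64"),
  age = c(32, 45, 45, 12, 15, 34, 17),
  stringsAsFactors = FALSE
)

# count() is given the data frame itself and returns one row per id,
# with the tally in a column named n
per_id <- count(df, id)

# sum() on a logical vector counts its TRUE values (here, over all rows)
under_21 <- sum(df$age < 21)
```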
Or use data.table:
library(data.table)
setDT(df)[, .(count = .N, count1 = sum(age < 21)), id]
Or with base R:
cbind(count = rowSums(table(df[-2])), count1 = as.vector(rowsum(+(df$age < 21), df$id)))
# count count1
#A1 4 1
#A2 3 2
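Or with aggregate and sum:
do.call(data.frame, aggregate(age ~ id, df, FUN =
  function(x) c(count = length(x), count1 = sum(x < 21))))
Note: All of the above approaches give proper columns in the result. This deserves a specific mention for aggregate, whose output column is a matrix; it is converted into proper columns by wrapping the call in do.call(data.frame, ...).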
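The point about the matrix column can be checked directly. The sketch below assumes the df from the question (rebuilt so the block is self-contained); res and flat are illustrative names.

```r
# Data from the question
df <- data.frame(
  id  = c("A1", "A1", "A1", "A1", "A2", "A2", "A2"),
  id1 = c("a1", "a2", "a3", "a4", "b1", "b5", "b64"),
  age = c(32, 45, 45, 12, 15, 34, 17),
  stringsAsFactors = FALSE
)

# FUN returns a length-2 vector, so aggregate() stores the results
# as a single matrix column named 'age'
res <- aggregate(age ~ id, df,
                 FUN = function(x) c(count = length(x), count1 = sum(x < 21)))
ncol(res)           # two columns: id and the matrix column age
is.matrix(res$age)  # the second column is a 2-column matrix

# do.call(data.frame, ...) splits the matrix into ordinary columns
flat <- do.call(data.frame, res)
names(flat)
```

Without the do.call step, code that expects three separate columns (for example, selecting a count column by name with `$`) would not find them, since only id and age exist at the top level.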
Answer 1 (score: 4)
Using base R, we can use aggregate to find the number of rows per group (id) and the number of rows with a value below 21:
aggregate(age ~ id, df, function(x) c(count = length(x),
                                     count1 = length(x[x < 21])))
# id age.count age.count1
#1 A1 4 1
#2 A2 3 2
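The two answers count the matching rows with different idioms, length(x[x < 21]) and sum(x < 21). They agree on the question's data, but they may differ when age contains missing values. A brief sketch with a made-up vector x (hypothetical, not part of the question's data):

```r
# Hypothetical vector with one missing value
x <- c(10, NA, 30)

# Subsetting with a logical NA keeps an NA element, so it is counted
n_subset <- length(x[x < 21])

# sum() propagates the NA unless na.rm = TRUE is given
n_sum <- sum(x < 21)
n_sum_narm <- sum(x < 21, na.rm = TRUE)
```

If missing ages are possible, sum(x < 21, na.rm = TRUE) is probably the safer choice, since it counts only the rows known to satisfy the condition.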