Count the number of times a value occurs in a vector [with a condition, in R]

Time: 2017-07-28 14:13:55

Tags: r count

I have the following dataset and want to count how many times a condition occurs in a vector:

structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L), Stimuli = c(1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 
0L, 1L)), .Names = c("ID", "Stimuli"), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -12L), spec = structure(list(
    cols = structure(list(ID = structure(list(), class = 
c("collector_integer", 
    "collector")), Stimuli = structure(list(),
class = c("collector_integer", 
    "collector"))), .Names = c("ID", "Stimuli")), default = structure(list(),
class = c("collector_guess", 
    "collector"))), .Names = c("cols", "default"), class = "col_spec"))

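The count should be made separately for each ID, and only rows where Stimuli is 1 should be counted. The result should then be stored in an additional column, like this: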

ID  Stimuli Count
1      1    1
1      0    0
1      0    0
1      1    2
2      1    1
2      1    2
2      0    0
2      1    3
3      0    0
3      1    1
3      0    0
3      1    2

I know that as.data.frame(table(df)) gives the frequencies, but in this case I want to keep every row and count only within each ID's sequence.

3 answers:

Answer 0: (score: 4)

After grouping by 'ID', we can take the cumulative sum (cumsum) of 'Stimuli' and use ifelse to keep that running count only where 'Stimuli' is 1 (the data is assumed to be stored in df1):

library(dplyr)
df1 %>%
    group_by(ID) %>%
    mutate(Count = ifelse(Stimuli == 1, cumsum(Stimuli), 0))
# A tibble: 12 x 3
# Groups:   ID [3]
#       ID Stimuli Count
#    <int>   <int> <dbl>
#  1     1       1     1
#  2     1       0     0
#  3     1       0     0
#  4     1       1     2
#  5     2       1     1
#  6     2       1     2
#  7     2       0     0
#  8     2       1     3
#  9     3       0     0
# 10     3       1     1
# 11     3       0     0
# 12     3       1     2

Another option is data.table:

library(data.table)
setDT(df1)[Stimuli == 1, Count := seq_len(.N), by = ID][is.na(Count), Count := 0][]

Or use ave from base R.
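The code for the base R variant did not survive in this copy of the answer. The following is a minimal sketch of how ave could be used here, not the answerer's original code. It assumes the data is stored in a data frame named df1, rebuilt below from the question's values, and that Stimuli contains only 0 and 1.

```r
# Rebuild the question's data (df1 is an assumed name)
df1 <- data.frame(
  ID      = rep(1:3, each = 4),
  Stimuli = c(1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L)
)

# ave() applies cumsum within each ID and returns a vector aligned with the rows;
# multiplying by Stimuli resets the count to 0 wherever Stimuli is 0
df1$Count <- ave(df1$Stimuli, df1$ID, FUN = cumsum) * df1$Stimuli
df1
```

Because ave returns its result in the original row order, no split and re-bind step is needed.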

Answer 1: (score: 3)

You can use the data.table package:

 library(data.table)
 setDT(df)[, Count := cumsum(Stimuli)*Stimuli, by=ID]


#     ID Stimuli Count 
#  1:  1       1     1 
#  2:  1       0     0 
#  3:  1       0     0 
#  4:  1       1     2 
#  5:  2       1     1 
#  6:  2       1     2 
#  7:  2       0     0 
#  8:  2       1     3 
#  9:  3       0     0 
# 10:  3       1     1 
# 11:  3       0     0 
# 12:  3       1     2
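Why the multiplication works can be seen on a single group. The sketch below uses a hypothetical vector named stimuli, holding the values for ID 2 from the question. It relies on Stimuli containing only 0 and 1, so that it can act both as the value being summed and as a mask.

```r
# Values of Stimuli for one ID (hypothetical standalone vector)
stimuli <- c(1L, 1L, 0L, 1L)

running <- cumsum(stimuli)   # running total: 1 2 2 3
count   <- running * stimuli # zero out positions where stimuli is 0: 1 2 0 3
count
```

If Stimuli could take values other than 0 and 1, this shortcut would no longer give a count, and an explicit condition such as the ifelse form would be the safer choice.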

Answer 2: (score: 2)

Base R only, and a bit more complicated. I will call the data dat.

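dat1 <- dat
dat1$Count <- 0                  # default count for rows where Stimuli is 0
sp <- split(dat1, dat1$ID)       # one data frame per ID
res <- do.call(rbind, lapply(sp, function(x){
    inx <- x$Stimuli != 0                      # rows to be counted
    x$Count[inx] <- cumsum(x$Stimuli[inx])     # running count within this ID
    x
}))
res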