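For each group in the "Participants" column, I want to know at which row the value 1 first appears in the "Signal" column. The row count should be relative to the group, not to the whole data frame.
Here is an example data frame: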
> dfInput <- data.frame(Participants=c( 'A','A','A','B','B','B','B','C','C'), Signal=c(0, 1, 1, 0, 0, 0, 1, 1,0))
> dfInput
Participants Signal
1 A 0
2 A 1
3 A 1
4 B 0
5 B 0
6 B 0
7 B 1
8 C 1
9 C 0
This is the output I am looking for:
> dfOutput <-data.frame(Participants=c( 'A','B','C'), RowNumberofFirst1=c(2, 4, 1))
> dfOutput
Participants RowNumberofFirst1
1 A 2
2 B 4
3 C 1
The question is similar to this one: Find first occurence of value in group using dplyr mutate. However, I was not able to adapt it to produce my output df.
Answer 0 (score: 3)
I think this is what you are looking for:
library(dplyr)
dfInput %>%
group_by(Participants) %>%
summarise(RowNumberofFirst1 = which(Signal == 1)[1])
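To see why `which(Signal == 1)[1]` gives the within-group position, it may help to look at a single group's vector in isolation. The following is a minimal base R sketch; `signal_b` and `signal_none` are illustrative names, and the group without any 1 is a hypothetical case that does not occur in the question's data.

```r
# Signal values for participant B from the question
signal_b <- c(0, 0, 0, 1)

# which() returns the positions of all TRUE entries; [1] keeps the first
first_b <- which(signal_b == 1)[1]

# Hypothetical group that never contains a 1:
# which() returns integer(0), and integer(0)[1] is NA
signal_none <- c(0, 0)
first_none <- which(signal_none == 1)[1]

# match() is an equivalent way to get the first position (NA if absent)
first_b_match <- match(1, signal_b)

print(first_b)
print(first_none)
print(first_b_match)
```

Because the index is taken on the group's own vector, the result counts rows from the start of the group rather than from the start of the data frame.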
Answer 1 (score: 2)
Using aggregate:
aggregate(Signal~Participants, dfInput, function(i)which(i == 1)[1])
# Participants Signal
#1 A 2
#2 B 4
#3 C 1
Answer 2 (score: 1)
dfInput <- data.frame(Participants=c( 'A','A','A','B','B','B','B','C','C'),
Signal=c(0, 1, 1, 0, 0, 0, 1, 1,0))
library(dplyr)
dfInput %>%
group_by(Participants) %>% # for each Participant
summarise(NumFirst1 = min(row_number()[Signal == 1])) # get the minimum number of row where signal equals 1
# # A tibble: 3 x 2
# Participants NumFirst1
# <fct> <int>
# 1 A 2
# 2 B 4
# 3 C 1
If you want to return the identified rows (i.e. all column values), you can use this:
set.seed(5)
dfInput <- data.frame(Participants=c( 'A','A','A','B','B','B','B','C','C'),
Signal=c(0, 1, 1, 0, 0, 0, 1, 1,0),
A = sample(c("C","D","F"),9, replace = T),
B = sample(c("N","M","K"),9, replace = T))
library(dplyr)
dfInput %>%
group_by(Participants) %>%
filter(row_number() == min(row_number()[Signal == 1])) %>%
ungroup()
# # A tibble: 3 x 4
# Participants Signal A B
# <fct> <dbl> <fct> <fct>
# 1 A 1 F N
# 2 B 1 D N
# 3 C 1 F M
So in this case, filter returns, for each participant, the row whose row number equals the smallest row number at which Signal is 1.
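The same idea of keeping the whole first matching row per participant can also be sketched in base R, without dplyr. This is only an illustrative alternative, not part of the answer above; the `Extra` column is a hypothetical stand-in for any additional columns you may have.

```r
dfInput <- data.frame(
  Participants = c('A','A','A','B','B','B','B','C','C'),
  Signal       = c(0, 1, 1, 0, 0, 0, 1, 1, 0),
  Extra        = letters[1:9]   # hypothetical extra column
)

# Keep only the rows where Signal is 1 ...
hits <- dfInput[dfInput$Signal == 1, ]

# ... then keep the first such row for each participant
first_rows <- hits[!duplicated(hits$Participants), ]
print(first_rows)
```

Participants that never have a 1 simply do not appear in `first_rows`.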
Answer 3 (score: 0)
Using tidyverse:
dfInput%>%
group_by(Participants)%>%
mutate(max=cumsum(Signal),
RowNumberofFirst1=row_number())%>%
filter(max==1)%>%
top_n(-1,RowNumberofFirst1)%>%
select(Participants,RowNumberofFirst1)
# A tibble: 3 x 2
# Groups: Participants [3]
# Participants RowNumberofFirst1
# <fct> <int>
# 1 A 2
# 2 B 4
# 3 C 1
Answer 4 (score: 0)
Here is a base R solution:
dfInput <- data.frame(Participants=c( 'A','A','A','B','B','B','B','C','C'), Signal=c(0, 1, 1, 0, 0, 0, 1, 1,0))
tapply(dfInput$Signal, dfInput$Participants, FUN=function(x) min(which(x==1)))
# > tapply(dfInput$Signal, dfInput$Participants, FUN=function(x) min(which(x==1)))
# A B C
# 2 4 1
If you want a data frame, you can do this:
first1 <- tapply(dfInput$Signal, dfInput$Participants, FUN=function(x) min(which(x==1)))
data.frame(Participants=names(first1), f=first1)
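One caveat with `min(which(x==1))`: for a group that contains no 1, `which()` returns an empty vector and `min()` then returns `Inf` with a warning. The sketch below shows one possible way around that by using `match()`, which returns `NA` instead. Participant `D` is a hypothetical addition to the question's data, included only to show the no-match case.

```r
dfInput <- data.frame(
  Participants = c('A','A','A','B','B','B','B','C','C','D','D'),
  Signal       = c(0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0)   # D never has a 1 (hypothetical)
)

# match(1, x) gives the position of the first 1 within each group, or NA
first1 <- tapply(dfInput$Signal, dfInput$Participants,
                 FUN = function(x) match(1, x))

# Convert the named result into a data frame
result <- data.frame(Participants = names(first1),
                     RowNumberofFirst1 = as.vector(first1))
print(result)
```

Whether `NA` or `Inf` is the more useful marker for "no 1 found" depends on what you do with the result afterwards.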
Here is a data.table variant:
library("data.table")
setDT(dfInput)
dfInput[, which(Signal==1)[1], "Participants"]