Question

I'm stuck on this:

I have a data frame with the following variables:

type (values: "P", "T", "I")
id (subject ID)
rt (reaction time)

It looks like this:

id    type    rt
1     T       333
1     P       912
1     P       467
1     I       773
1     I       123
...
2     P       125
2     I       843
2     T       121
2     P       982
...

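The order of type is random for each subject, but every subject has the same number of trials of each type. What I want is to select the first two rt values where type=="P" for each participant, and then average by occurrence, so that I get the mean RT across all participants for the first occurrence of P, and the mean RT for the second occurrence of P.

So, say there are 20 participants: I want to extract 20 RTs for the first occurrence and 20 RTs for the second occurrence.

I tried tapply, aggregate, for loops and simple subsetting, but these either average "too early" or fail because the order is random for each subject.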
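
As an illustration of what averaging "too early" might look like (this is an assumption about the kind of call that was tried, not the asker's actual code), grouping by id and type alone collapses all P trials of a subject into a single mean, so the first and second occurrences can no longer be told apart. The sketch below rebuilds the sample rows shown above; `by_subject` is a hypothetical name.

```r
# Sample rows from the question, rebuilt as a plain data frame.
df <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  type = c("T", "P", "P", "I", "I", "P", "I", "T", "P"),
  rt   = c(333L, 912L, 467L, 773L, 123L, 125L, 843L, 121L, 982L),
  stringsAsFactors = FALSE
)

# Mean rt per subject and type: one number per cell,
# with no notion of "first" or "second" occurrence.
by_subject <- tapply(df$rt, list(df$id, df$type), mean)

by_subject["1", "P"]  # 689.5, the mean of 912 and 467
by_subject["2", "P"]  # 553.5, the mean of 125 and 982
```

What is missing here is an occurrence index within each subject, which has to be created before any averaging takes place.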

Answer 1

Try

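 # install.packages("dplyr")   # dplyr is on CRAN; install once if needed
 library(dplyr)
   df %>%
      group_by(id) %>% 
      filter(type=="P") %>%          # keep only the P trials
      slice(1:2) %>%                 # first two P trials per subject
      mutate(N=row_number()) %>%     # occurrence index within subject
      group_by(N) %>% 
      summarise(rt=mean(rt))         # mean rt per occurrence across subjects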
     #Source: local data frame [2 x 2]

   # N    rt
   #1 1 518.5
   #2 2 724.5

Or using data.table

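 library(data.table)
  # setDT() converts df to a data.table by reference.
  # N=1:2 matches the length of rt[1:2] even when a subject has more than two P trials.
  setDT(df)[type=="P", list(rt=rt[1:2], N=1:2), by=id][, 
                                      list(Meanrt=mean(rt)), by=N] 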
  #   N Meanrt
  #1: 1  518.5
  #2: 2  724.5

Or using aggregate from base R

  df1 <- subset(df, type=="P")
  df1$indx <- with(df1, ave(rt, id, FUN=seq_along))
  aggregate(rt~indx, df1[df1$indx %in% 1:2,], FUN=mean)
  #  indx    rt
  #1    1 518.5
  #2    2 724.5
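
If the individual values per participant are also needed (the question mentions extracting one RT per participant for each occurrence), a base R sketch along the following lines may help. It rebuilds the sample data so that it runs on its own; `p` and `first_two` are hypothetical names used only for this illustration, and it assumes every subject has at least two P trials.

```r
# Sample data, rebuilt as a plain data frame.
df <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  type = c("T", "P", "P", "I", "I", "P", "I", "T", "P"),
  rt   = c(333L, 912L, 467L, 773L, 123L, 125L, 843L, 121L, 982L),
  stringsAsFactors = FALSE
)

# Keep only the P trials, in their original order.
p <- df[df$type == "P", ]

# One column per subject, one row per occurrence (1st P, 2nd P).
first_two <- sapply(split(p$rt, p$id), function(x) x[1:2])
first_two
#        1   2
# [1,] 912 125
# [2,] 467 982

# Averaging each row across subjects gives the per-occurrence means.
rowMeans(first_two)
# [1] 518.5 724.5
```

Each row of `first_two` holds the values that go into one of the two means, so with 20 participants it would have 20 columns.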

Data

 df <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), type = c("T", 
 "P", "P", "I", "I", "P", "I", "T", "P"), rt = c(333L, 912L, 467L, 
 773L, 123L, 125L, 843L, 121L, 982L)), .Names = c("id", "type", 
 "rt"), class = "data.frame", row.names = c(NA, -9L))

Answer 2

I hope I understood this correctly. Using dplyr:

df %>% 
group_by(id, type) %>% 
mutate(occ=1:n()) %>% 
group_by(type, occ) %>% 
summarise(av=mean(rt)) %>%
filter(type=="P")

Source: local data frame [2 x 3]
Groups: type

  type occ    av
1    P   1 518.5
2    P   2 724.5

Selecting values for each specific occurrence from a data frame in R
