Consider the following reaction-time data (simplified for illustration):
>dt
subject trialnum blockcode values.trialtype latency correct
1 1 1 practice cueswitch 3020 1
2 1 1 test cuerep 4284 1
3 1 21 test cueswitch 2094 1
4 1 34 test cuerep 3443 1
5 1 50 test taskswitch 3313 1
6 2 1 practice cueswitch 3020 1
7 2 1 test cuerep 1109 1
8 2 21 test cueswitch 3470 1
9 2 34 test cuerep 2753 1
10 2 50 test taskswitch 3321 1
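I have been using data.table to compute reaction-time variables for subsets of consecutive trials (specified by trialnum, which ranges from 1 to 170 in the full dataset):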
dt1=dt[blockcode=="test" & correct==1, list(
RT1=.SD[trialnum>=1 & trialnum<=30 & values.trialtype=="cuerep", mean(latency)],
RT2=.SD[trialnum>=31 & trialnum<=60 & values.trialtype=="cuerep", mean(latency)]
), by="subject"]
Output:
subject RT1 RT2
1: 1 4284 3443
2: 2 1109 2753
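However, creating a variable for each subset becomes tedious once there are more than two or three subsets. How can I specify these subsets more efficiently?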
Answer 0 (score: 2)
Use findInterval or cut on the trialnum column.
An example:
# set the key to use binary search
setkey(dt, blockcode,correct,values.trialtype)
# the subset you want
# (the trial type is spelled 'cuerep' in the data shown above)
dt1 <- dt[.('test',1,'cuerep')]
# use cut to define subsets
dt2 <- dt1[,list(latency = mean(latency)),
by=list(subject, trialset = cut(trialnum,seq(0,180,by=30)))]
dt2
# subject trialset latency
# 1: 1 (0,30] 4284
# 2: 1 (30,60] 3443
# 3: 2 (0,30] 1109
# 4: 2 (30,60] 2753
# If you want separate columns, it is as simple as using `dcast`
library(reshape2)
dcast(dt2,subject~trialset, value.var = 'latency')
# subject (0,30] (30,60]
# 1 1 4284 3443
# 2 2 1109 2753
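The answer names findInterval but only demonstrates cut, so here is a minimal sketch of the findInterval variant. It rebuilds the toy data from the question and assumes a reasonably recent data.table, which provides its own dcast method, so reshape2 is not needed. The names breaks, res and wide are illustrative and do not come from the original post. Note that findInterval uses left-closed intervals, so the break points are the first trial of each window (1, 31, 61, ...) rather than the upper bounds passed to cut.

```r
library(data.table)

# Toy data copied from the question
dt <- data.table(
  subject          = rep(1:2, each = 5),
  trialnum         = rep(c(1L, 1L, 21L, 34L, 50L), times = 2),
  blockcode        = rep(c("practice", "test", "test", "test", "test"), times = 2),
  values.trialtype = rep(c("cueswitch", "cuerep", "cueswitch", "cuerep", "taskswitch"),
                         times = 2),
  latency          = c(3020, 4284, 2094, 3443, 3313, 3020, 1109, 3470, 2753, 3321),
  correct          = 1L
)

# First trial of each 30-trial window: 1, 31, 61, ..., 151
breaks <- seq(1, 151, by = 30)

# findInterval() returns the index of the window each trial falls into,
# so trials 1-30 get label RT1, trials 31-60 get RT2, and so on
res <- dt[blockcode == "test" & correct == 1 & values.trialtype == "cuerep",
          .(latency = mean(latency)),
          by = .(subject,
                 trialset = paste0("RT", findInterval(trialnum, breaks)))]

# Spread the windows into one column each
wide <- dcast(res, subject ~ trialset, value.var = "latency")
wide
```

On the toy data this should reproduce the RT1 and RT2 columns from the question. Adding more windows only requires extending breaks; with ten or more windows the labels would sort alphabetically (RT10 before RT2), so zero-padding the index may be preferable in that case.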