I have a question about how to calculate a mean for each subj.
My data frame looks like this:
   subj entropy n_gambles trial response   rt
1     0    high         2     0   sample 4205
2     0    high         2     0   sample  676
3     0    high         2     0     skip    0
4     0    high         2     1   sample  883
5     0    high         2     1   sample  697
6     0    high         2     1     skip    0
7     0    high         2     2   sample 1493
8     0    high         2     2   sample  507
9     0    high         2     2     skip    0
10    0    high         2     3   sample 1016
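I would like to find the mean number of samples for each subj.
I have got the data into the form below, but I don't know what code comes next.
Note: the proportion of samples is different for each subj.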
     subj trial n_gambles entropy response n_sample
2497    0     0         2    high   sample        2
2498    1     0         2    high   sample        0
2499    2     0         2    high   sample        0
2500    3     0         2    high   sample        0
2501    4     0         2    high   sample       27
2502    5     0         2    high   sample        0
2503    6     0         2    high   sample        0
2504    7     0         2    high   sample        0
2505    8     0         2    high   sample       19
2506    9     0         2    high   sample        0
2507   10     0         2    high   sample        0
Here is my code so far.
# Clear the workspace
rm(list = ls())
# Import the 'subj.csv' data file into a data frame
data_subj <- read.csv('subj.csv')
head(data_subj)
# Import the 'response.csv' data file into a data frame
data_response <- read.csv('response.csv')
head(data_response)
# Merge the subject and response data on 'subj'
data <- merge(data_subj, data_response, by = 'subj')
head(data)
# Count the rows for every combination of subj, trial, n_gambles, entropy and response
data <- as.data.frame(table(data$subj, data$trial, data$n_gambles, data$entropy, data$response))
colnames(data) <- c('subj', 'trial', 'n_gambles', 'entropy', 'response', 'n_sample')
# Keep only the rows for the "sample" response
data <- data[data$response == "sample", ]
head(data)
Can anyone help me?
I would like the output to look like this:
subj trial n_gambles entropy response n_sample mean_sample/trials
   0     0         2    high   sample        2
   1     0         2    high   sample        0
   2     0         2    high   sample        0
   3     0         2    high   sample        0
Answer 0 (score: 0)
This is similar to the answer to an earlier question:
library(plyr)
ddply(df,.(subj),summarize,mymean=(length(which(response=="sample")))/6)
  subj   mymean
1    0 1.166667
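The divisor 6 in the call above is hard-coded. If the number of trials differs between subjects, one option is to divide by each subject's own trial count instead. The following is only a minimal base-R sketch under stated assumptions: the toy data frame `df` is copied from the first ten rows shown in the question, and the column names `n_trials` and `mean_sample` are made up for illustration. It assumes "mean" here means the number of "sample" responses divided by the number of distinct trials for that subj.

```r
# Toy data copied from the first ten rows shown in the question (assumption:
# the real data frame has the same subj, trial and response columns).
df <- data.frame(
  subj     = rep(0, 10),
  trial    = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 3),
  response = c("sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample")
)

# Number of "sample" responses per subject.
n_sample <- tapply(df$response == "sample", df$subj, sum)

# Number of distinct trials per subject.
n_trials <- tapply(df$trial, df$subj, function(x) length(unique(x)))

# One row per subject; mean_sample is samples divided by trials.
# The column names n_trials and mean_sample are illustrative only.
result <- data.frame(
  subj        = names(n_sample),
  n_sample    = as.vector(n_sample),
  n_trials    = as.vector(n_trials),
  mean_sample = as.vector(n_sample / n_trials)
)
print(result)
```

With the ten rows above, subj 0 has 7 "sample" responses over 4 distinct trials, so `mean_sample` is 1.75. To attach this value to every row of the counts table, `merge(data, result[, c("subj", "mean_sample")], by = "subj")` should work, provided both `subj` columns hold the same labels.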