Question

I have a question about how to calculate a mean for each subj.

My data frame looks like this:

   subj entropy n_gambles trial response   rt
1     0    high         2     0   sample 4205
2     0    high         2     0   sample  676
3     0    high         2     0     skip    0
4     0    high         2     1   sample  883
5     0    high         2     1   sample  697
6     0    high         2     1     skip    0
7     0    high         2     2   sample 1493
8     0    high         2     2   sample  507
9     0    high         2     2     skip    0
10    0    high         2     3   sample 1016

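I want to find the mean number of samples for each subj.

I have got it as far as the table below, but I don't know what code should come next.

Note: the proportion of sampling is different for each subj.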

     subj trial n_gambles entropy response n_sample
2497    0     0         2    high   sample        2
2498    1     0         2    high   sample        0
2499    2     0         2    high   sample        0
2500    3     0         2    high   sample        0
2501    4     0         2    high   sample       27
2502    5     0         2    high   sample        0
2503    6     0         2    high   sample        0
2504    7     0         2    high   sample        0
2505    8     0         2    high   sample       19
2506    9     0         2    high   sample        0
2507   10     0         2    high   sample        0

Here is my code so far.

rm(list = ls())

# Import the 'subj.csv' data file into a data frame
data_subj <- read.csv('subj.csv')
head(data_subj)

# Import the 'response.csv' data file into a data frame
data_response <- read.csv('response.csv')
head(data_response)

# Merge the subject and response data on 'subj'
data <- merge(data_subj, data_response, by = 'subj')
head(data)

# Count rows for every combination of subj, trial, n_gambles, entropy and response
data <- as.data.frame(table(data$subj, data$trial, data$n_gambles, data$entropy, data$response))
colnames(data) <- c('subj', 'trial', 'n_gambles', 'entropy', 'response', 'n_sample')

# Keep only the "sample" responses
data <- data[data$response == "sample", ]
head(data)

Can anyone help me?

I would like the output to look like this:

subj trial n_gambles entropy response n_sample  mean_sample/trials
  0     0         2    high   sample        2             
  1     0         2    high   sample        0
  2     0         2    high   sample        0
  3     0         2    high   sample        0

Answer 1

This is similar to the answer to an earlier question:

library(plyr)
ddply(df, .(subj), summarize, mymean = length(which(response == "sample")) / 6)
  subj   mymean
1    0 1.166667
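
The plyr call above divides by a fixed 6. If "mean_sample/trials" is meant to be the average number of "sample" responses per trial for each subj (an assumption, since the question does not define it), a base R sketch that derives the number of trials from the data could look like the following. The toy data frame `df` copies only the first ten rows shown in the question, and the names `is_sample`, `per_trial`, `per_subj`, `mean_sample` and `result` are made up for this illustration.

```r
# Toy data mirroring the relevant columns from the question
# (values copied from the first ten rows shown there).
df <- data.frame(
  subj     = rep(0, 10),
  trial    = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 3),
  response = c("sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample", "sample", "skip",
               "sample"),
  stringsAsFactors = FALSE
)

# Flag each row that is a "sample" response (1) or not (0).
df$is_sample <- as.integer(df$response == "sample")

# Step 1: number of samples in each subj/trial combination.
# Trials that contain only "skip" rows are kept with a count of 0.
per_trial <- aggregate(is_sample ~ subj + trial, data = df, FUN = sum)
names(per_trial)[names(per_trial) == "is_sample"] <- "n_sample"

# Step 2: mean number of samples per trial, for each subj.
per_subj <- aggregate(n_sample ~ subj, data = per_trial, FUN = mean)
names(per_subj)[names(per_subj) == "n_sample"] <- "mean_sample"

# Step 3: attach the per-subject mean to the per-trial table,
# which gives one row per subj/trial plus the subject-level mean.
result <- merge(per_trial, per_subj, by = "subj")
print(result)
```

With these ten rows, subj 0 has 7 samples over 4 trials, so `mean_sample` is 1.75 (trial 3 is cut off in the excerpt, so the figure for the full data would differ). To carry `n_gambles` and `entropy` through as well, they could be added to the right-hand side of the formula in step 1.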
