Why does quickpsy in R take longer to fit when the grouping variable is of class character rather than a factor?

Time: 2015-08-20 03:54:11

Tags: r dplyr

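The quickpsy package has a function of the same name that fits psychometric functions to a data set you supply and can also perform bootstrapping. I noticed that when the grouping variable is of class character, the function takes longer to run than when the grouping variable is a factor. Why is that?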

Here is an example to illustrate. First, I create a data frame in which the grouping variable, 'condition', is a character variable:

charData <- data.frame(
  condition = rep(c('pre','post'), each=6),
  comparison = rep(c(16, 18, 19, 21, 22, 24), 2),
  nGreater = c(2, 8, 16, 26, 34, 38, 0, 6, 12, 24, 32, 36),
  nTrials = rep(40, 12),
  stringsAsFactors = FALSE)

Check the column classes:

str(charData)
#'data.frame':  12 obs. of  4 variables:
# $ condition : chr  "pre" "pre" "pre" "pre" ...
# $ comparison: num  16 18 19 21 22 24 16 18 19 21 ...
# $ nGreater  : num  2 8 16 26 34 38 0 6 12 24 ...
# $ nTrials   : num  40 40 40 40 40 40 40 40 40 40 ...

On my machine the quickpsy function takes about 7 seconds to run, and it displays a progress bar:

charpsy <- quickpsy(charData, comparison, nGreater, nTrials, .(condition), B=100)
# |==============================================================================|100% ~0 s remaining

Then I create exactly the same data frame, but with the grouping variable, 'condition', as a factor:

factorData <- data.frame(
  condition = rep(c('pre','post'), each=6),
  comparison = rep(c(16, 18, 19, 21, 22, 24), 2),
  nGreater = c(2, 8, 16, 26, 34, 38, 0, 6, 12, 24, 32, 36),
  nTrials = rep(40, 12),
  stringsAsFactors = TRUE)

Check the column classes:

str(factorData)
#'data.frame':  12 obs. of  4 variables:
# $ condition : Factor w/ 2 levels "post","pre": 2 2 2 2 2 2 1 1 1 1 ...
# $ comparison: num  16 18 19 21 22 24 16 18 19 21 ...
# $ nGreater  : num  2 8 16 26 34 38 0 6 12 24 ...
# $ nTrials   : num  40 40 40 40 40 40 40 40 40 40 ...

Now quickpsy runs much faster and does not even display a progress bar:

factorpsy <- quickpsy(factorData, comparison, nGreater, nTrials, .(condition), B=100)
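To put numbers on the difference rather than rely on the progress bar, the two runs could be timed with base R's system.time. The following is only a minimal sketch: timeFit and convertedData are hypothetical names introduced here for illustration, the quickpsy call is copied verbatim from the example above and may need adjusting for other quickpsy versions, and the timing step is skipped if the package is not installed. It also shows converting the grouping column to a factor before fitting, which is an assumed workaround and does not explain the cause.

```r
# Data frame from the question, with 'condition' stored as character
charData <- data.frame(
  condition = rep(c("pre", "post"), each = 6),
  comparison = rep(c(16, 18, 19, 21, 22, 24), 2),
  nGreater = c(2, 8, 16, 26, 34, 38, 0, 6, 12, 24, 32, 36),
  nTrials = rep(40, 12),
  stringsAsFactors = FALSE)

# Copy the data and convert only the grouping column to a factor
convertedData <- charData
convertedData$condition <- factor(convertedData$condition)

# Hypothetical helper (not part of quickpsy): elapsed seconds for one fit
timeFit <- function(fit, data) {
  system.time(fit(data))[["elapsed"]]
}

# Time both versions only if quickpsy is installed; try() reports an error
# instead of stopping if the call does not match the installed version
if (requireNamespace("quickpsy", quietly = TRUE)) {
  library(quickpsy)
  fit <- function(d) quickpsy(d, comparison, nGreater, nTrials, .(condition), B = 100)
  timings <- try(c(character = timeFit(fit, charData),
                   factor = timeFit(fit, convertedData)))
  print(timings)
}
```

If the two elapsed times differ as described above, the same helper could be reused with other values of B to see whether the gap grows with the number of bootstrap samples.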

0 answers:

No answers