Question

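Hi everyone!

I am discovering the R world for my PhD work, and I ran into several problems when I tried to use loops to simplify my analysis.

My data frame is: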

'data.frame':   3581 obs. of  8 variables:
 $ Date          : Factor w/ 7 levels "03-03-17","10-02-17",..: 
 $ Experimentator: Factor w/ 9 levels "BURLET","DECHAUD",..: 
 $ Origin        : Factor w/ 3 levels "FRANCE","JAPAN",..: 
 $ City          : Factor w/ 6 levels "MONTPELLIER",..: 
 $ Lineage       : Factor w/ 27 levels "L21","L22","L26",..:
 $ Sex           : Factor w/ 2 levels "Female","Male":
 $ ccr           : int  1183 1813 1866 1745 1210 1463 2477 1506

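The first 6 are my factors and the last one is my quantitative variable. I need to work with several factors at the same time, so when I want to run a shapiro.test, for example with: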

by(data$ccr, c(data$Date, data$Sex, data$Lineage), shapiro.test() )
Error in tapply(seq_len(3581L), list(`c(data$Date, data$Sex, 
data$Lineage)` = c(2L,  : the arguments must have the same length

For loops are hard for me to use, so I tried writing:

for(sex in levels(data$Sex)){
  for(date in levels(data$Date)){
    for(lineage in levels(data$Lineage)){
      shapiro.test(data$ccr[,lineage])
    }
  }
}

I don't know how to increment the loops...

Thanks for your help!

Answer 1

You don't need for loops to do this in R. I don't think the by() function is the best approach either. The simplest way is to use the dplyr infrastructure:

library(dplyr)

data %>%
  group_by(Sex, Date, Lineage) %>%
  filter(n() > 2) %>%
  summarise(shapiro_pvalue = shapiro.test(ccr)$p.value,
            shapiro_stat = shapiro.test(ccr)$statistic)

The filter(n() > 2) step handles the fact that shapiro.test needs at least 3 samples. (Credit to Rui Barradas for the nice reproducible example!)

dplyr works quite differently from base R, but if you are starting a PhD and need to use R, it is worth learning if you want to make your life easier.
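For comparison, the by() call in the question fails because c() concatenates the three factors into one vector three times as long as ccr, whereas by() expects the grouping factors as a list. A minimal base R sketch on made-up data follows; the toy data frame and the helper name safe_shapiro are assumptions for illustration and do not come from the question.

```r
set.seed(1)  # make the toy data reproducible

# Hypothetical toy data: 2 sexes x 2 dates, 10 observations per cell
toy <- data.frame(
  Sex  = factor(rep(c("Female", "Male"), each = 20)),
  Date = factor(rep(rep(c("03-03-17", "10-02-17"), each = 10), times = 2)),
  ccr  = round(rnorm(40, mean = 1500, sd = 300))
)

# Hypothetical helper: return NULL instead of failing when a group has
# fewer than 3 non-missing values (shapiro.test would raise an error)
safe_shapiro <- function(x) {
  if (sum(!is.na(x)) < 3) return(NULL)
  shapiro.test(x)
}

# Grouping factors go in a list(), not c()
res <- by(toy$ccr, list(toy$Sex, toy$Date), safe_shapiro)

# One p-value per Sex/Date cell (NA where the test could not be run)
p_values <- sapply(res, function(r) if (is.null(r)) NA_real_ else r$p.value)
```

With three factors the call has the same shape: list(data$Date, data$Sex, data$Lineage) in place of the c(...) from the question.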

Answer 2

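You can run the loops with an index, like this:

results <- list()
index <- 1

for (sex in levels(data$Sex)) {
  for (date in levels(data$Date)) {
    for (lineage in levels(data$Lineage)) {
      # ccr values for this Sex/Date/Lineage combination
      x <- data$ccr[which(data$Sex == sex & data$Date == date & data$Lineage == lineage)]
      # shapiro.test needs at least 3 non-missing values
      if (sum(!is.na(x)) >= 3) {
        results[[index]] <- shapiro.test(x)
        names(results)[index] <- paste(sex, date, lineage, sep = ".")
        index <- index + 1
      }
    }
  }
}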

Answer 3

You can do this in base R with split/lapply instead of by. First, some fake data, renamed dat because data is already an R function.

set.seed(9235)    # make it reproducible
n <- 3581
d <- seq(as.Date("2017-01-01"), as.Date("2017-12-31"), by = "day")
d <- format(d, "%d-%m-%y")
dat <- data.frame(
    Date = sample(d, n, TRUE),
    Experimentator = sample(LETTERS[1:9], n, TRUE),
    Origin = sample(LETTERS[11:13], n, TRUE),
    Lineage = sample(paste0("L", 1:27), n, TRUE),
    Sex = sample(c("F", "M"), n, TRUE),
    ccr = sample(3000, n, TRUE)
)

Now the code. Note that shapiro.test only accepts data where the number of non-missing values is between 3 and 5000.

sp <- split(dat$ccr, list(dat$Date, dat$Sex, dat$Lineage))
sp <- sp[which(sapply(sp, function(x) length(x) > 2))]

result <- lapply(sp, shapiro.test)
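The result above is a named list of htest objects. One way to tabulate it is sketched below on small made-up data; toy, in_range, tab and the column names are assumptions for illustration. The sketch also applies the upper limit of 5000 that shapiro.test enforces.

```r
set.seed(42)  # make the toy data reproducible

# Hypothetical toy data: three groups of different sizes (12, 12 and 2)
toy <- data.frame(
  Sex     = rep(c("F", "M", "M"), times = c(12, 12, 2)),
  Lineage = rep(c("L1", "L1", "L2"), times = c(12, 12, 2)),
  ccr     = round(rnorm(26, mean = 1500, sd = 300))
)

# drop = TRUE discards Sex/Lineage combinations that never occur
sp <- split(toy$ccr, list(toy$Sex, toy$Lineage), drop = TRUE)

# Keep only groups with 3 to 5000 non-missing values
in_range <- sapply(sp, function(x) {
  n <- sum(!is.na(x))
  n >= 3 && n <= 5000
})
result <- lapply(sp[in_range], shapiro.test)

# One row per tested group
tab <- data.frame(
  group     = names(result),
  statistic = sapply(result, function(r) unname(r$statistic)),
  p_value   = sapply(result, function(r) r$p.value),
  row.names = NULL
)
```

The group column holds the names produced by split, which join the factor levels with a dot (for example F.L1).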
