Randomly selecting from a subset of rows

Time: 2013-05-06 13:28:25

Tags: r

I have data in blocks[[i]], where i = 4 to 6, that looks like this:

  Stimulus Response   PM
  stretagost     s  <NA>
  colpublo       s  <NA>
  zoning         d  <NA>
  epilepsy       d  <NA>
  resumption     d  <NA>
  incisive       d  <NA>

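Each blocks[[i]] has 440 rows.

Right now my script does something to one randomly selected item out of every 15 trials, for each blocks[[i]]. The exceptions are the first 5 trials of every 110, which are skipped, and I have also set it up so that I can never select a row within 2 rows of the previous selection.

What I would like to do is select 1 item out of every 15 trials at random, but only from the rows where Response == "d". That is, I don't want my random selection to operate on rows where Response == "s". I have no idea how to achieve this, but here is my script so far, which just randomly selects 1 row out of each 15: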

PMpositions <- list()
for (i in 4:6){ 
  positions <- c() 
  x <- 0
  for (j in c(seq(5, 110-15, 15),seq(115, 220-15, 15),seq(225, 330-15, 15),seq(335,440-15, 15)))
  {  
    sub.samples <- setdiff(1:15 + j, seq(x-2,x+2,1))
    x <- sample(sub.samples, 1)
    positions <- c(positions,x)
  }  
  PMpositions[[i]] <- positions
  blocks[[i]]$Response[PMpositions[[i]]] <- Wordresponse
  blocks[[i]]$PM[PMpositions[[i]]] <- PMresponse 
  blocks[[i]][PMpositions[[i]],]$Stimulus <- F[[i]]
}

I ended up handling it like this:

PMpositions <- list()
for (i in 1:3){ 
  startingpositions <- c(seq(5, 110-15, 15), seq(115, 220-15, 15),
                         seq(225, 330-15, 15), seq(335, 440-15, 15))
  positions <- c() 
  x <- 0
  for (j in startingpositions)
  {  
    sub.samples <- setdiff(1:15 + j, seq(x-2,x+2,1))
    x <- sample(sub.samples, 1)
    positions <- c(positions,x)
  } 
  repeat {
    # re-draw any position that landed on a nonword row
    positions[which(blocks[[i]][positions,2]==Nonwordresponse)] <- 
      startingpositions[which(blocks[[i]][positions,2]==Nonwordresponse)] +
      sample(1:15, size=length(which(blocks[[i]][positions,2]==Nonwordresponse)), replace = TRUE)
    # flag neighbouring picks that are less than 2 rows apart
    distancecheck <- which(abs(c(positions[2:length(positions)],0)-positions) < 2) 
    if (length(positions[which(blocks[[i]][positions,2]==Nonwordresponse)]) == 0 &
        length(distancecheck) == 0) break
  }
  PMpositions[[i]] <- positions
  blocks[[i]]$Response[PMpositions[[i]]] <- Wordresponse
  blocks[[i]]$PM[PMpositions[[i]]] <- PMresponse 
  blocks[[i]][PMpositions[[i]],]$Stimulus <- as.character(NF[[i]][,1])
  Nonfocal[[i]] <- blocks[[i]]
}

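I realized that when it hits the repeat loop, sometimes I have 15 "s" responses in a row! D'oh. It would be nice to solve this, but it is OK for what I need: when it gets stuck I just run it again (the positions of d/s are randomly generated).

2 Answers:

Answer 0: (score: 1)

Edit: Here is a different approach that samples only the 'd' rows. It is fairly custom code, but the main idea is to use the prob argument so that only rows where "Response" == "d" can be sampled, with the sampling probability of all other rows set to zero.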

Response <- rep(c("s","d"),220)
chunk <- sort(rep(1:30,15))[1:440] # chunks of 15 up to 440

# function to randomly sample from each set of 15 rows
sampby15 <- function(i){
    sample((1:440)[chunk==i], 1, 
        # use the `prob` argument to only sample 'd' values
        prob=rep(1,length=440)[chunk==i]*(Response=="d")[chunk==i])
}
s <- sapply(unique(chunk),FUN=sampby15) # apply to every chunk (1 to 30) to get sample rows
Response[s] # confirm only 'd' values

# then you have code to do whatever to those rows...
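
One caveat with the prob approach: sample() stops with a "too few positive probabilities" error when every probability in a chunk is zero, which is what happens if a chunk contains no 'd' rows (the 15 "s" in a row case mentioned in the question). Below is a minimal sketch of a guarded variant, assuming that returning NA for such a chunk is acceptable. The helper name sample_d_row and the random toy data are illustrative only, not part of the answer above.

```r
set.seed(1)  # only to make the sketch reproducible
Response <- sample(c("s", "d"), 440, replace = TRUE)  # toy data
chunk <- rep(1:30, each = 15)[1:440]  # chunks of 15 rows; the last chunk has only 5

# hypothetical helper: pick one 'd' row from chunk i, or NA if the chunk has none
sample_d_row <- function(i) {
  candidates <- which(chunk == i & Response == "d")
  if (length(candidates) == 0) return(NA_integer_)
  # index into candidates so a single candidate is not expanded to 1:candidate
  candidates[sample.int(length(candidates), 1)]
}

s <- sapply(unique(chunk), sample_d_row)
all(Response[s] == "d", na.rm = TRUE)  # checks that every picked row is a 'd'
```

Indexing with sample.int() rather than calling sample() on the candidate rows directly sidesteps the case where a single numeric candidate would be treated as the range 1:candidate.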

Answer 1: (score: 1)

So the really basic function you want to run on each block looks like this:

subsetminor <- function(dataset, only = "d", rows = 1) { 
  remainder <- subset(dataset, Response == only)
  return(remainder[sample(1:nrow(remainder), size = rows), ])
}
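
As a rough illustration of how this function might be applied to each run of 15 trials rather than to a whole block, here is a sketch on toy data. The data frame block, the Row column (added only so picks can be traced back to their original position) and the window grouping are assumptions made for the example; the question's real windows also skip the first 5 rows of every 110, which this sketch ignores.

```r
# toy stand-in for one block; column names follow the question
block <- data.frame(
  Row = 1:440,
  Stimulus = paste0("item", 1:440),
  Response = rep(c("s", "d"), 220),
  PM = NA_character_,
  stringsAsFactors = FALSE
)

# same function as above, repeated so the sketch runs on its own
subsetminor <- function(dataset, only = "d", rows = 1) {
  remainder <- subset(dataset, Response == only)
  return(remainder[sample(1:nrow(remainder), size = rows), ])
}

# hypothetical grouping: consecutive windows of 15 rows (the last has only 5)
window <- (seq_len(nrow(block)) - 1) %/% 15 + 1

# one 'd' row per window, stacked back into a single data frame
picked <- do.call(rbind, lapply(split(block, window), subsetminor))
```

Note that this relies on every window containing at least one 'd' row; with an empty subset, 1:nrow(remainder) becomes c(1, 0) and the result is not meaningful.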

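We can modify it slightly to avoid rows that are next to each other:

subsetminor <- function(dataset, only = "d", rows = 1) { 
  remainder <- subset(dataset, Response == only)
  # sample up front so that `sampled` also exists when rows == 1
  sampled <- sample(1:nrow(remainder), size = rows)
  if(rows > 1) {
    pairwise <- t(combn(sampled, 2))
    # resample until no two picks are within 2 positions of each other
    # (positions are counted within `remainder`, not in the original data)
    while(any(abs(pairwise[, 1] - pairwise[, 2]) <= 2)) {
      sampled <- sample(1:nrow(remainder), size = rows)
      pairwise <- t(combn(sampled, 2))
    }
  }
  out <- remainder[sampled, ]
  return(out)
}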

The above could be simplified/DRYed up a lot, but it should get the job done.
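
To tie this back to the question's setup, here is a sketch that combines the three constraints in one pass: one pick per window of 15, only 'd' rows, and more than 2 rows between consecutive picks, measured in original row numbers. The window offsets are copied from the question's script; the helper name pick_d_positions, its gap argument and the toy Response vector are assumptions for illustration, and the sketch assumes Response contains no NA values.

```r
set.seed(7)  # only to make the sketch reproducible
Response <- rep(c("s", "d"), 220)  # toy data; real data would be blocks[[i]]$Response

# window start offsets copied from the question's script
starts <- c(seq(5, 110 - 15, 15), seq(115, 220 - 15, 15),
            seq(225, 330 - 15, 15), seq(335, 440 - 15, 15))

# hypothetical helper: one `only` row per window, more than `gap` rows from the previous pick
pick_d_positions <- function(response, starts, only = "d", gap = 2) {
  positions <- integer(length(starts))
  prev <- -Inf  # no previous pick before the first window
  for (k in seq_along(starts)) {
    window <- starts[k] + 1:15
    ok <- window[response[window] == only & abs(window - prev) > gap]
    if (length(ok) == 0) stop("no eligible row in window starting at ", starts[k])
    positions[k] <- ok[sample.int(length(ok), 1)]
    prev <- positions[k]
  }
  positions
}

positions <- pick_d_positions(Response, starts)
Response[positions]  # confirm only 'd' values
```

Stopping with an error when a window has no eligible row is one possible design choice; skipping the window or returning NA for it would work just as well, depending on what the experiment needs.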