Using sample() without replacement across multiple draws while increasing the sample size in R

Time: 2016-07-26 14:10:29

Tags: r sample

I want to draw "random" samples from a vector called data, with increasing size and without replacement.

To illustrate my point, data looks like this, for example:

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")

What I need is a series of sample vectors of increasing size (starting from size = 2 and growing by, e.g., 2 at each step), where the elements newly drawn at each step do not repeat any drawn in earlier steps, with everything stored in a list, so that the result looks like this:

sample_1<-c("s","d")
sample_2<-c("s","d","a","f")
sample_3<-c("s","d","a","f","m","n")
sample_4<-c("s","d","a","f","m","n","l","c")
sample_5<-c("s","d","a","f","m","n","l","c","j","x")
sample_6<-c("s","d","a","f","m","n","l","c","j","x","v","k")
sample_7<-c("s","d","a","f","m","n","l","c","j","x","v","k","g","b")
sample_8<-c("s","d","a","f","m","n","l","c","j","x","v","k","g","b","h")
samples<-list(sample_1,sample_2,sample_3,sample_4,sample_5,sample_6,sample_7,sample_8)

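What does not work in my attempt so far is increasing the sample size while keeping the samples from the previous steps, with the last list element containing all observations. Is something like this possible?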

2 answers:

Answer 0: (score: 5)

I'm not sure I understood correctly, but maybe you just need to shuffle the data once:

data = letters
data_random = sample(data)
sapply(seq(from=2, to=length(data), by=2),
       function (x) data_random[1:x],
       simplify = FALSE)
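
The shuffle-once idea above can be sketched for the 15-element vector from the question. With an odd length, seq(from=2, to=length(data), by=2) stops at 14, so this sketch appends the full length as a final size; that extra step and the names sizes and samples are assumptions made for illustration, not part of the answer. It also assumes data has at least two elements.

```r
data <- c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")

data_random <- sample(data)   # shuffle once; every sample is a prefix of this vector
n <- length(data_random)

# Sizes 2, 4, ..., plus n itself so the last sample holds every observation
# (unique() drops the duplicate when n is already even).
sizes <- unique(c(seq(from = 2, to = n, by = 2), n))

samples <- lapply(sizes, function(k) data_random[seq_len(k)])

sapply(samples, length)
# 2 4 6 8 10 12 14 15
```

Because every sample is a prefix of the same shuffled vector, each one contains the previous one and no element can appear twice.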

Answer 1: (score: 3)

After your comment on the other answer, I think I understand what you want to achieve, so extending my previous code I end up with:

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
set.seed(123)
nbitems=ceiling(length(data)/2) # number of results when stepping by 2 (ceiling keeps it a whole number for odd lengths)
results=vector("list",nbitems)

results[[1]] <- sample(data,2) # get first sample
for (i in 2:nbitems) { # Loop for each result
  samplesavail <- data[!data %in% results[[i-1]]] # Reduce the samples available
  results[[i]] <- c(results[[i-1]], sample( samplesavail, min( length(samplesavail), 2) ) ) # concatenate a new sample, size depends on step and remaining samples available.
}

Hopefully this matches your intended use:

> results
[[1]]
[1] "n" "f"

[[2]]
[1] "n" "f" "a" "g"

[[3]]
[1] "n" "f" "a" "g" "m" "v"

[[4]]
[1] "n" "f" "a" "g" "m" "v" "x" "l"

[[5]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j"

[[6]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h"

[[7]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h" "d" "s"

[[8]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h" "d" "s" "c"
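
One way to confirm that a result list has the requested shape is a small check like the one below. It is only a sketch: check_growing is a hypothetical helper introduced here, not part of the answer, and the loop is repeated (without a fixed seed) so the block runs on its own.

```r
data <- c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
nbitems <- ceiling(length(data) / 2)
results <- vector("list", nbitems)

results[[1]] <- sample(data, 2)
for (i in 2:nbitems) {
  samplesavail <- data[!data %in% results[[i - 1]]]   # elements not drawn yet
  results[[i]] <- c(results[[i - 1]],
                    sample(samplesavail, min(length(samplesavail), 2)))
}

# Hypothetical helper: TRUE when no sample contains a repeated element and
# every sample starts with the complete previous sample.
check_growing <- function(res) {
  no_repeats <- all(sapply(res, anyDuplicated) == 0)
  extends_previous <- all(vapply(seq_along(res)[-1], function(i) {
    identical(head(res[[i]], length(res[[i - 1]])), res[[i - 1]])
  }, logical(1)))
  no_repeats && extends_previous
}

check_growing(results)
# TRUE
```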

Previous approach:

If I understood you (but I am far from sure):

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
set.seed(123) # fix the seed for repro of answer, remove in real case
nbitems=ceiling(length(data)/2) # Number of entries when stepping by 2 (ceiling keeps it a whole number for odd lengths)
results=vector("list",nbitems) # preallocate the list (as we'll start by end)
results[[nbitems]] = sample(data,length(data)) # sample the datas
for (i in nbitems:2) {
  results[[i-1]] <- results[[i]][1:(length(results[[i]]) - 2)] # for each iteration, take down the 2 last entries.
}

Because data has an odd number of elements (15), this gives a single entry as the first result.
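
To see where the single-entry first result comes from, the sizes produced by the backward loop can be inspected. This is a minimal sketch that reuses the question's data and, as an assumption of equivalence, uses head(x, -2) to drop the last two entries instead of explicit indexing:

```r
data <- c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
nbitems <- ceiling(length(data) / 2)
results <- vector("list", nbitems)

results[[nbitems]] <- sample(data)            # full shuffle goes in the last slot
for (i in nbitems:2) {
  results[[i - 1]] <- head(results[[i]], -2)  # drop the last two entries
}

sapply(results, length)
# 1 3 5 7 9 11 13 15
```

Working backward from 15 in steps of 2 lands on 1, whereas the forward approach starts at 2 and ends with a final step of one element.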

Note that this is the same idea as @sbstn's answer, but done in a more complicated, backward way, in case it is of some value.