Weighted sampling with multiple probability vectors in R

Date: 2014-12-08 08:31:59

Tags: r sampling weighted

I have a problem similar to this one:

Weighted sampling with 2 vectors

I have a data set with 1000 observations and 4 columns. I want to draw 200 observations from the original data set with replacement.

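But here is the problem: I need to assign a different probability vector to each column. For example, for the first column I want equal probabilities, c(0.001, 0.001, 0.001, 0.001, ......). For the second column I want something different, such as c(0.0005, 0.0002, ......). Of course, each probability vector sums to 1.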
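To make the setup concrete, here is a minimal sketch of how two such probability vectors could be built for 1000 rows. The names n, p_equal, w and p_uneven are made up for illustration, and the uneven weights are just random numbers standing in for whatever weights you really have.

```r
n <- 1000

# equal probabilities: every row gets 1/1000 = 0.001
p_equal <- rep(1 / n, n)

# unequal probabilities: start from arbitrary non-negative weights
# (random here, purely as a placeholder) and rescale them to sum to 1
set.seed(1)
w <- runif(n)
p_uneven <- w / sum(w)

sum(p_equal)
sum(p_uneven)
```

Rescaling is optional as far as sample() is concerned: its documentation says the weights in prob need not sum to one, only that they are non-negative and not all zero.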

I know that sample() can take a single probability vector, but I'm not sure which other commands to use. Please help!
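For reference, the single-vector case mentioned here might look like the sketch below. The values in x and p are invented for the example.

```r
set.seed(42)

x <- c(10, 20, 30, 40)       # values to sample from (illustrative)
p <- c(0.5, 0.1, 0.2, 0.2)   # one weight per element of x

# 200 draws with replacement, weighted by p
draws <- sample(x, size = 200, replace = TRUE, prob = p)

# empirical frequencies should be roughly proportional to p
table(draws) / length(draws)
```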

Thanks in advance! Colamonkey

1 answer:

Answer 0 (score: 0)

Data frame with the sampling probabilities

# in your case there are 1000 rows and 4 columns;
# this small example just shows the procedure
samp_prob <- data.frame(A = rep(.25, 4), B = c(.5, .1, .2, .2), C = c(.3, .6, .05, .05))

Data frame of the values to sample from (with replacement)
df <- data.frame(a = 1:4, b = 2:5, c = 3:6)

Sampling

# pair each column of df with the matching column of samp_prob
sam <- mapply(function(x, y) sample(x, size = 200, replace = TRUE, prob = y), df, samp_prob)
head(sam)
     a b c
[1,] 4 5 6
[2,] 1 2 4
[3,] 1 2 4
[4,] 4 4 4
[5,] 4 4 4
[6,] 1 2 4

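# you can also write (it is equivalent): size and replace are matched by name,
# so the second unnamed argument (a column of samp_prob) is matched to prob
mapply(df, samp_prob, FUN = sample, size = 200, replace = TRUE)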
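Scaled up to the sizes in the question (1000 rows, 4 columns, 200 draws), the same idea could be sketched as follows. The data in dat and the weights in raw_w are random placeholders, and all names here (dat, raw_w, probs, res, V1 to V4) are assumptions for illustration, not anything from the original data set.

```r
set.seed(123)

n_rows  <- 1000
n_draws <- 200

# placeholder data: 1000 observations, 4 columns
dat <- data.frame(V1 = rnorm(n_rows), V2 = rnorm(n_rows),
                  V3 = rnorm(n_rows), V4 = rnorm(n_rows))

# placeholder weights: equal for V1, arbitrary for the others
raw_w <- data.frame(V1 = rep(1, n_rows), V2 = runif(n_rows),
                    V3 = runif(n_rows), V4 = runif(n_rows))

# rescale each column of weights so that it sums to 1
probs <- as.data.frame(lapply(raw_w, function(w) w / sum(w)))

# sample each column with its own probability vector;
# constant arguments go in MoreArgs
res <- mapply(sample, x = dat, prob = probs,
              MoreArgs = list(size = n_draws, replace = TRUE))

dim(res)   # 200 rows, 4 columns
```

Note that each column is sampled independently here, so a row of res is generally not a row of the original data. If whole rows have to stay together, you would sample row indices once instead, which allows only one probability vector.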