Question

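I am building code to run and manage simulations of sampling events at sites that can each belong to one of three site cohorts. I assign the cohort identifier (1, 2 or 3) using rep() with the following code: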

cohort <- rep(1:n.cohorts, n.sites)

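I have put the key line first, but to reproduce my problem you will need to run the following lines, which divide the total number of sites among the cohorts for use in the rep() call.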

n.cohorts <- 3
s <- 10 # total available sites in this example

# different proportions of the total can be allocated to each cohort, for example 
prop.control <- 0.4 ; prop.int <- 0.4 ; prop.ref <- 1-(prop.int+prop.control)
n.control <- prop.control * s; n.int <- prop.int * s; n.ref <- prop.ref * s 
n.sites <- c(n.control, n.int, n.ref)

Now, n.sites on its own returns

[1] 4 4 2

So when I then run cohort <- rep(1:n.cohorts, n.sites), I expect cohort to be a vector of 10 items, like this: [1] 1 1 1 1 2 2 2 2 3 3. However, I get only 9:

> cohort
[1] 1 1 1 1 2 2 2 2 3

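If I run the same code with n.sites defined directly as n.sites <- c(4, 4, 2), I get the 10 items I expect. I have redone this several times to convince myself that n.sites prints the same result in both cases.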

Can anyone explain why this happens? Many thanks in advance.

David

Answer 1

I think this is one of those floating-point arithmetic inaccuracy issues in R. The problem is here:

prop.ref <- 1-prop.int-prop.control
prop.ref*10
#[1] 2
floor(prop.ref*10)
#[1] 1

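So R stores prop.int+prop.control as a value slightly greater than 0.8, which leaves prop.ref*10 slightly below 2. rep() truncates a non-integer count toward zero, so cohort 3 is repeated only once.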
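
To see the hidden difference, it can help to print more digits than R's default of 7. The lines below are a minimal sketch that reuses the variable names from the question with s fixed at 10:

```r
prop.control <- 0.4; prop.int <- 0.4
prop.ref <- 1 - (prop.int + prop.control)
n.ref <- prop.ref * 10

print(n.ref)                  # default 7 significant digits: displays 2
print(n.ref, digits = 17)     # shows a value slightly below 2
n.ref == 2                    # FALSE: not exactly 2
isTRUE(all.equal(n.ref, 2))   # TRUE: equal within numerical tolerance
as.integer(n.ref)             # 1: truncated toward zero, as rep() does with its counts
```

This is why n.sites looks identical in both cases when printed, yet behaves differently inside rep().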

You can fix it with

cohort <- rep(1:n.cohorts, ceiling(n.sites))

But you are right, it does seem like a serious bug. (Edit: sorry, I meant that it only SEEMS like a serious bug.)
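
One caution about ceiling(): it works here because the error is downward, but it would add an extra site if a product ever came out slightly above a whole number. Rounding is arguably the safer choice. The following is only a sketch, assuming the proportions are always meant to produce whole numbers of sites; the sum check is an added guard, not part of the original code:

```r
n.cohorts <- 3
s <- 10  # total available sites, as in the question
prop.control <- 0.4; prop.int <- 0.4
prop.ref <- 1 - (prop.int + prop.control)

# round() snaps each count to the nearest whole number before rep() sees it
n.sites <- as.integer(round(c(prop.control, prop.int, prop.ref) * s))
stopifnot(sum(n.sites) == s)  # guard: counts must add up to the total

cohort <- rep(1:n.cohorts, n.sites)
cohort
# [1] 1 1 1 1 2 2 2 2 3 3
```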
