Question

My data frame has two columns:

structure(list(lowage = c(45, 15, 9, 51, 22, 45, 4, 4, 9, 25), 
    highage = c(50, 21, 14, 60, 24, 50, 8, 8, 14, 30)), .Names = c("lowage", 
"highage"), row.names = c(NA, 10L), class = "data.frame")

The data frame looks like this:

   lowage highage
1      45      50
2      15      21
3       9      14
4      51      60
5      22      24
6      45      50
7       4       8
8       4       8
9       9      14
10     25      30

For each row, I am trying to draw a random number between the values in the two columns and save it as a third column.

I tried the following:

df$age <- sample(df$lowage:df$highage,1)

This gave me the following error:

Error in `$<-.data.frame`(`*tmp*`, age, value = c(47L, 50L, 49L, 48L,  : 
  replacement has 6 rows, data has 795
In addition: Warning messages:
1: In df$lowage:df$highage :
  numerical expression has 795 elements: only the first used
2: In df$lowage:df$highage :
  numerical expression has 795 elements: only the first used

I then tried a for loop:

for (i in 1:length(df$lowage)) {
 df$age[i] <- round(sample(df$lowage[i]:df$highage[i]),1)
}

This does create a column age with random age values, but it still gives me the following warning:

Warning messages:
1: In df$age[i] <- round(sample(df$lowage[i]:df$highage[i]),  ... :
  number of items to replace is not a multiple of replacement length

I can see a value for every row in df, but I am not sure whether this warning affects the column.

Answer 1

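We can use apply with MARGIN = 1 (row-wise) to generate the sequence of numbers between the two columns, then use sample to pick one number from it at random.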

df$random_number <- apply(df, 1, function(x) sample(seq(x[1], x[2]), 1))

df

#   lowage highage random_number
#1      45      50            47
#2      15      21            21
#3       9      14             9
#4      51      60            55
#5      22      24            23
#6      45      50            47
#7       4       8             7
#8       4       8             8
#9       9      14            14
#10     25      30            27
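
One caveat, offered as a sketch rather than as part of the original answer: if a row ever has lowage equal to highage, seq(x[1], x[2]) has length one, and sample() given a single number n >= 1 draws from 1:n instead of returning n. The sample data has no such row, so the code above is fine for it. The helper pick_between below is a hypothetical name introduced here for illustration; it sidesteps the issue by sampling an index with sample.int().

```r
# Hypothetical helper (not from the original answer): pick one value
# uniformly from lo..hi, and return lo itself when lo == hi.
pick_between <- function(lo, hi) {
  values <- seq(lo, hi)
  values[sample.int(length(values), 1)]
}

# Small illustrative data frame; the last row has lowage == highage.
df <- data.frame(lowage = c(45, 15, 9, 7), highage = c(50, 21, 14, 7))

# Row-wise apply, selecting the two bounds by column name.
df$random_number <- apply(
  df[, c("lowage", "highage")], 1,
  function(x) pick_between(x[["lowage"]], x[["highage"]])
)
```

With plain sample(seq(7, 7), 1) the last row could receive any value from 1 to 7; with the index-based version it always receives 7.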

Or, using the same idea, with mapply:

df$random_number <- mapply(function(x, y) sample(seq(x, y), 1), 
                    df$lowage, df$highage)
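
As for why the attempts in the question misbehave (my reading, not part of the original answer): the : operator is not vectorized over its operands, so df$lowage:df$highage uses only the first element of each column, which is what the "only the first used" warnings report. In the for loop, the 1 is passed to round() rather than to sample(), so sample() returns a permutation of the whole range and only its first element gets stored, hence the "number of items to replace" warning. For ranges with more than one value, that first element is still a valid random draw, so the column should not be corrupted. A loop that avoids the warning, plus a vectorized alternative that assumes integer bounds with lowage <= highage, might look like this:

```r
df <- data.frame(lowage = c(45, 15, 9), highage = c(50, 21, 14))

# Loop version: preallocate the column, then fill one row at a time.
df$age <- NA_real_
for (i in seq_len(nrow(df))) {
  range_i <- seq(df$lowage[i], df$highage[i])
  # Draw exactly one value; indexing avoids sample()'s single-number behavior.
  df$age[i] <- range_i[sample.int(length(range_i), 1)]
}

# Vectorized version (assumes integer bounds and lowage <= highage):
# scale a uniform draw to the width of each range and round down.
width <- df$highage - df$lowage + 1
df$age_vec <- df$lowage + floor(runif(nrow(df)) * width)
```

Calling set.seed() with a fixed value before either version makes the draws reproducible.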

Get a random value between two column values in R
