My data frame has two columns:
structure(list(lowage = c(45, 15, 9, 51, 22, 45, 4, 4, 9, 25),
highage = c(50, 21, 14, 60, 24, 50, 8, 8, 14, 30)), .Names = c("lowage",
"highage"), row.names = c(NA, 10L), class = "data.frame")
The data frame looks like this:
   lowage highage
1      45      50
2      15      21
3       9      14
4      51      60
5      22      24
6      45      50
7       4       8
8       4       8
9       9      14
10     25      30
For each row, I am trying to draw a random number between the values in the two columns and save it as a third column.
I tried the following:
df$age <- sample(df$lowage:df$highage,1)
This gives me the following error:
Error in `$<-.data.frame`(`*tmp*`, age, value = c(47L, 50L, 49L, 48L, :
replacement has 6 rows, data has 795
In addition: Warning messages:
1: In df$lowage:df$highage :
  numerical expression has 795 elements: only the first used
2: In df$lowage:df$highage :
  numerical expression has 795 elements: only the first used
I also tried a for loop:
for (i in 1:length(df$lowage)) {
df$age[i] <- round(sample(df$lowage[i]:df$highage[i]),1)
}
Although this creates a column age with random age values, it still gives me the following warning:
Warning messages:
1: In df$age[i] <- round(sample(df$lowage[i]:df$highage[i]), ... :
number of items to replace is not a multiple of replacement length
I can see a value for every row in df, but I am not sure whether this warning affects the column.
Answer 0 (score: 3)
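We can use apply with MARGIN = 1 (row-wise) to generate the sequence of numbers between the two columns for each row, and then use sample to pick one number from that sequence.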
df$random_number <- apply(df, 1, function(x) sample(seq(x[1], x[2]), 1))
df
#   lowage highage random_number
#1      45      50            47
#2      15      21            21
#3       9      14             9
#4      51      60            55
#5      22      24            23
#6      45      50            47
#7       4       8             7
#8       4       8             8
#9       9      14            14
#10     25      30            27
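One caveat with sample(seq(x[1], x[2]), 1): if a row ever has lowage equal to highage, seq returns a single number n, and sample(n, 1) then draws from 1:n instead of returning n. The data shown has no such row, so this is only a precaution. A minimal sketch, where pick_one is a hypothetical helper and the small data frame is made up for illustration:

```r
# Hypothetical helper (not part of the original answer): draw one integer
# from low..high, including the case low == high.
pick_one <- function(low, high) {
  candidates <- seq(low, high)
  # Sampling a position and indexing with it avoids sample()'s special
  # case for a single number n >= 1, where it would draw from 1:n.
  candidates[sample.int(length(candidates), 1)]
}

# Illustrative data; the last row has lowage == highage.
df <- data.frame(lowage = c(45, 15, 9, 7), highage = c(50, 21, 14, 7))
df$random_number <- apply(df, 1, function(x) pick_one(x[1], x[2]))
```

With this helper the last row always gets 7, whereas sample(seq(7, 7), 1) could return anything from 1 to 7.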
Or with mapply:
df$random_number <- mapply(function(x, y) sample(seq(x, y), 1),
df$lowage, df$highage)
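As for the attempts in the question: df$lowage:df$highage uses only the first element of each column, which is what the "only the first used" warnings report. In the loop, sample() is called without a size, so it returns the whole shuffled range, and the 1 is passed to round() as the number of digits. Several values are then assigned to a single cell and only the first is kept, which triggers the warning; writing sample(df$lowage[i]:df$highage[i], 1) avoids it. If a loop is not needed at all, a vectorized alternative is possible because runif accepts vectors for min and max. A minimal sketch, assuming both columns hold whole numbers with lowage <= highage:

```r
set.seed(1)  # only to make this sketch reproducible

df <- data.frame(lowage  = c(45, 15, 9, 51, 22, 45, 4, 4, 9, 25),
                 highage = c(50, 21, 14, 60, 24, 50, 8, 8, 14, 30))

# runif() is vectorized over min and max, so one call covers every row.
# Adding 1 to the upper bound and taking floor() maps the continuous draw
# onto the integers lowage..highage with equal probability.
df$random_number <- floor(runif(nrow(df), min = df$lowage, max = df$highage + 1))
```

This avoids building a sequence per row, which may matter for large data frames.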