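This should be fairly easy, but I can't find a quick way to do it. All I want to do is replace one level of a factor with random numbers (I am building a data frame from scratch and want some levels of the factor to have different ranges of values).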
data <- data.frame(
  animal = sample(c("lion","tiger","bear"),50,replace=TRUE),
  region = sample(c("north","south","east","west"),50,replace=TRUE),
  reports = sample(50:100,50,replace=TRUE))
Something like this doesn't work, because you have to specify the number of elements to generate:
data$animal <- sub("lion",rnorm(15,10,2),data$animal)
It gives a warning:
Warning: In sub("lion", rnorm(15, 10, 2), data$animal) :
argument 'replacement' has length > 1 and only the first element will be used
Does anyone have a simple way to do this, or is it just not possible to use a sub() expression with numbers?
Answer 0 (score: 1)
I don't understand why you would want this, but here we go.
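set.seed(42)
data <- data.frame(
  animal = sample(c("lion","tiger","bear"),50,replace=TRUE),
  region = sample(c("north","south","east","west"),50,replace=TRUE),
  reports = sample(50:100,50,replace=TRUE))
# make sure animal is character rather than factor, so arbitrary values can be assigned
data$animal <- as.character(data$animal)
# logical index of the rows to replace
to.change <- data$animal=="lion"
# draw exactly one random number per "lion" row
data$animal[to.change] <- rnorm(sum(to.change),10,2)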
# animal region reports
# 1 bear south 81
# 2 bear south 61
# 3 11.1619929953634 south 61
# 4 bear west 69
# 5 tiger north 98
# 6 tiger east 99
# 7 bear east 87
# 8 11.5363574756692 north 87
# 9 tiger south 77
# 10 bear east 50
# 11 tiger east 81
# 12 bear west 92
# 13 bear west 88
# 14 10.9275351770803 east 73
# 15 tiger west 77
# 16 bear north 77
# 17 bear south 50
# 18 8.22844740518064 west 68
# 19 tiger east 81
# 20 tiger north 92
# 21 bear north 68
# 22 7.80043820270429 north 70
# 23 bear north 79
# 24 bear south 80
# 25 13.0254140196099 north 86
# 26 tiger east 70
# 27 tiger north 96
# 28 bear south 99
# 29 tiger east 61
# 30 bear north 86
# 31 bear east 96
# 32 bear north 80
# 33 tiger south 82
# 34 bear east 97
# 35 10.5158428750641 west 93
# 36 bear east 79
# 37 10.1768804583192 north 91
# 38 9.75820692492182 north 55
# 39 bear north 88
# 40 tiger south 81
# 41 tiger east 57
# 42 tiger north 54
# 43 7.61134220967894 north 73
# 44 bear west 89
# 45 tiger west 87
# 46 bear east 91
# 47 bear south 58
# 48 tiger east 98
# 49 bear east 64
# 50 tiger east 57
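One thing to keep in mind with the approach above: the column stays a character vector, so the random numbers are stored as strings. The following is only a small sketch on a toy vector (the names `animal`, `is_lion` and `lion_values` are made up for illustration) showing the coercion and how to get numbers back:

```r
set.seed(1)
animal <- c("lion", "tiger", "lion", "bear")

# logical index of the elements to replace
is_lion <- animal == "lion"

# assigning numbers into a character vector coerces them to strings
animal[is_lion] <- rnorm(sum(is_lion), mean = 10, sd = 2)
class(animal)

# convert back to numeric before doing any arithmetic on the replaced values
lion_values <- as.numeric(animal[is_lion])
class(lion_values)
```

If the numbers are needed for calculations later, keeping them in a separate numeric column is probably less error-prone than mixing them into the animal column.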
Edit:
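Based on your comments, it seems you actually want something like this (applied to the original data, where animal still holds the animal names):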
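# per-animal parameters of the normal distribution
offense <- data.frame(animal=c("lion","tiger","bear"),
                      mean=c(35,25,10),
                      sd=c(3,2,1))
library(plyr)
# attach mean/sd to every row, then draw the attacks within each animal group
data <- ddply(merge(data, offense),
              .(animal),
              transform,
              attacks=rnorm(length(mean), mean=mean, sd=sd),
              mean=NULL,   # drop the helper columns again
              sd=NULL)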
# animal region reports attacks
# 1 bear south 81 10.580996
# 2 bear south 61 10.768179
# 3 bear north 77 10.463768
# 4 bear west 69 9.114224
# 5 bear east 96 8.900219
# 6 bear north 80 11.512707
# 7 bear east 87 10.257921
# 8 bear north 68 10.088440
# 9 bear west 88 9.879103
# 10 bear east 50 8.805671
# 11 bear south 80 10.611997
# 12 bear west 92 9.782860
# 13 bear south 50 9.817243
# 14 bear west 89 10.933346
# 15 bear south 99 10.821773
# 16 bear east 91 11.392116
# 17 bear east 97 9.523826
# 18 bear north 88 10.650349
# 19 bear north 79 11.391110
# 20 bear east 79 8.889211
# 21 bear east 64 9.139207
# 22 bear north 86 8.868261
# 23 bear south 58 8.540786
# 24 lion west 68 35.239948
# 25 lion south 61 36.959613
# 26 lion north 70 38.602896
# 27 lion north 73 38.134253
# 28 lion north 91 31.990374
# 29 lion north 86 40.545446
# 30 lion east 73 32.999680
# 31 lion north 87 35.316541
# 32 lion west 93 33.733232
# 33 lion north 55 34.632949
# 34 tiger west 77 25.376386
# 35 tiger east 61 25.238322
# 36 tiger east 99 24.949815
# 37 tiger east 81 25.216145
# 38 tiger north 92 24.029130
# 39 tiger north 96 23.991566
# 40 tiger south 81 21.677802
# 41 tiger east 81 24.235333
# 42 tiger north 54 23.974699
# 43 tiger south 77 30.403782
# 44 tiger north 98 22.275768
# 45 tiger east 57 25.274512
# 46 tiger south 82 22.012750
# 47 tiger east 70 22.059129
# 48 tiger east 98 25.249405
# 49 tiger west 87 23.006722
# 50 tiger east 57 24.996355
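If you would rather not load plyr, the same idea can probably be written in base R, because rnorm() is vectorised over its mean and sd arguments. This is a sketch under that assumption, not the method used above; `idx` is a helper name introduced here, and the rows keep their original order instead of being sorted by animal:

```r
set.seed(42)
data <- data.frame(
  animal  = sample(c("lion", "tiger", "bear"), 50, replace = TRUE),
  region  = sample(c("north", "south", "east", "west"), 50, replace = TRUE),
  reports = sample(50:100, 50, replace = TRUE)
)

# per-animal parameters, same lookup table as above
offense <- data.frame(animal = c("lion", "tiger", "bear"),
                      mean   = c(35, 25, 10),
                      sd     = c(3, 2, 1))

# for every row of data, find the matching row of the lookup table
idx <- match(data$animal, offense$animal)

# one draw per row, each with the mean and sd of that row's animal
data$attacks <- rnorm(nrow(data), mean = offense$mean[idx], sd = offense$sd[idx])

head(data)
```

Because the random draws happen in a different order than in the ddply() version, the individual values will differ even with the same seed.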