I have a data frame like this:
df <- data.frame(size_upms = c(126, 123, 148),
electric_mean = c(0.716756756756757,0.647859922178988, 0.726313694267516),
gas_mean = c(0.273513513513513,0.322679266259033, 0.259554140127389),
firewood_mean = c(0, 0.00111172873818788,0.00179140127388535))
# df
# size_upms electric_mean gas_mean firewood_mean
#1 126 0.7167568 0.2735135 0.000000000
#2 123 0.6478599 0.3226793 0.001111729
#3 148 0.7263137 0.2595541 0.001791401
I want to use mapply to draw a sample for each row, using that row's values as the arguments:
l <- mapply(sample, c("electric","gas","firewood"), df$size_upms, TRUE,
c(df$electric_mean,df$gas_mean,df$firewood_mean))
But I get this error:
#Error in sample.int(length(x), size, replace, prob) :
# too few positive probabilities
However, if I call sample on each row individually, it works:
sample(c("electric","gas","firewood"),df$size_upms[1],TRUE,
c(df$electric_mean[1],df$gas_mean[1],df$firewood_mean[1]))[1:5]
#[1] "gas" "electric" "electric" "gas" "electric"
sample(c("electric","gas","firewood"),df$size_upms[2],TRUE,
c(df$electric_mean[2],df$gas_mean[2],df$firewood_mean[2]))[1:5]
#[1] "electric" "gas" "gas" "gas" "electric"
sample(c("electric","gas","firewood"),df$size_upms[3],TRUE,
c(df$electric_mean[3],df$gas_mean[3],df$firewood_mean[3]))[1:5]
#[1] "electric" "electric" "gas" "electric" "electric"
But I want to use mapply, because I need to apply this to a large data frame.
What am I doing wrong?
Answer 0 (score: 2)
For row-wise work, apply or lapply is easier to use than mapply. A solution with lapply:
lapply(seq_len(nrow(df)), function(i)
sample(c("electric","gas","firewood"), df$size_upms[i], TRUE,
unlist(c(df$electric_mean[i],df$gas_mean[i],df$firewood_mean[i]))))
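The same row-wise idea can also be sketched with apply, which is mentioned above but not shown. This is an illustrative variant rather than part of the original answer. It assumes every column of df is numeric, since apply converts the data frame to a matrix before iterating. The names fuels and res_apply are introduced only for this example.

```r
df <- data.frame(size_upms = c(126, 123, 148),
                 electric_mean = c(0.716756756756757, 0.647859922178988, 0.726313694267516),
                 gas_mean = c(0.273513513513513, 0.322679266259033, 0.259554140127389),
                 firewood_mean = c(0, 0.00111172873818788, 0.00179140127388535))

fuels <- c("electric", "gas", "firewood")  # hypothetical helper name

# With MARGIN = 1, each row arrives in the function as a named numeric vector
res_apply <- apply(df, 1, function(r)
  sample(fuels, r[["size_upms"]], TRUE,
         r[c("electric_mean", "gas_mean", "firewood_mean")]))

lengths(res_apply)  # one sample per row, sized by that row's size_upms
```

Because the three samples have different lengths, apply returns a list here. If every row had the same size_upms, it would return a matrix instead, which is one reason lapply may be the more predictable choice.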
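The error in the OP's attempt lies in the concatenation step: c(df$electric_mean, df$gas_mean, df$firewood_mean) builds a single vector of length 9, and mapply then walks through it one element at a time instead of passing three probabilities per row. Here, we pass the columns of the dataset as separate arguments and do the concatenation inside the anonymous function. This ensures that each step picks the corresponding row element from every column.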
Map(function(x,y, u, w) sample(c("electric","gas","firewood"), x,
TRUE, c(y, u, w)), df$size_upms, df$electric_mean, df$gas_mean, df$firewood_mean)
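To see why the original mapply call fails, it can help to record the arguments that each iteration receives. The sketch below is illustrative and not part of the original answer. mapply recycles its shorter arguments to the length of the longest one, so the nine concatenated probabilities produce nine calls, each with a single label and a single probability. The call whose probability is 0 is presumably the one that raises "too few positive probabilities". The names probs, seen, and res_mapply are introduced here. The last statement writes the Map solution with mapply and SIMPLIFY = FALSE, since the question asked for mapply specifically.

```r
df <- data.frame(size_upms = c(126, 123, 148),
                 electric_mean = c(0.716756756756757, 0.647859922178988, 0.726313694267516),
                 gas_mean = c(0.273513513513513, 0.322679266259033, 0.259554140127389),
                 firewood_mean = c(0, 0.00111172873818788, 0.00179140127388535))

# Concatenated probabilities as in the question: one vector of length 9
probs <- c(df$electric_mean, df$gas_mean, df$firewood_mean)

# Record what each iteration would hand to sample(), without calling it
seen <- mapply(function(x, size, replace, prob)
                 list(x = x, size = size, prob = prob),
               c("electric", "gas", "firewood"), df$size_upms, TRUE, probs,
               SIMPLIFY = FALSE)
length(seen)  # number of iterations: 9, not one per row
seen[[7]]     # a single label paired with firewood_mean[1], which is 0

# mapply equivalent of the Map call: one column per argument,
# SIMPLIFY = FALSE so the result is always a list
res_mapply <- mapply(function(x, y, u, w)
                       sample(c("electric", "gas", "firewood"), x, TRUE, c(y, u, w)),
                     df$size_upms, df$electric_mean, df$gas_mean, df$firewood_mean,
                     SIMPLIFY = FALSE)
lengths(res_mapply)
```

Map is a thin wrapper around mapply that does not simplify its result, so the two forms should behave the same for this data.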
Or, as @thelatemail commented, we can use do.call:
do.call(Map, c( function(x,y, u, w)
sample(c("electric","gas","firewood"), x, TRUE, c(y,u,w)), unname(df)))
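The do.call form works because a data frame is a list of columns: c(f, unname(df)) builds a list whose first element is the function and whose remaining elements are the columns, which Map then receives as positional arguments. Dropping the names with unname matters, because otherwise the columns would be passed as named arguments (size_upms = ..., and so on) that do not match the function's parameters. A minimal sketch of the equivalence follows. The name f and the use of set.seed to make the two calls comparable are assumptions added for illustration.

```r
df <- data.frame(size_upms = c(126, 123, 148),
                 electric_mean = c(0.716756756756757, 0.647859922178988, 0.726313694267516),
                 gas_mean = c(0.273513513513513, 0.322679266259033, 0.259554140127389),
                 firewood_mean = c(0, 0.00111172873818788, 0.00179140127388535))

# Hypothetical named version of the anonymous function used above
f <- function(x, y, u, w)
  sample(c("electric", "gas", "firewood"), x, TRUE, c(y, u, w))

# Explicit column-by-column call
set.seed(1)
via_map <- Map(f, df$size_upms, df$electric_mean, df$gas_mean, df$firewood_mean)

# Same call assembled from the data frame's columns
set.seed(1)
via_docall <- do.call(Map, c(f, unname(df)))

identical(via_map, via_docall)
```

The do.call version depends on the column order of df matching the parameter order of f, so it is worth reordering or selecting columns explicitly when the data frame has extra columns.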