Here is some dummy data.
Suppose I have a data frame:
df = data.frame(source = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10",
"X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X110"),
Destination = c("X3","X5","X17", "X20", "X20","X1", "X2", "X3", "X7", "X10",
"X13","X15","X7", "X1", "X20","X17", "X2", "X3", "X7", "X10"),
weight = seq(1,1.95,by=0.05))
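I then have the odds for Destinations X1:X3, each with its own standard deviation. I want to draw 10 random samples from a normal distribution for each of the odds, using its corresponding standard deviation: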
OR_dat <- c(1.55,1.39,1.77)
sds <- c(0.2925175, 0.4775346, 0.1603566)
n <- 10
normv <- function( n , mean , sd ){
out <- rnorm( n*length(mean) , mean = mean , sd = sd )
return( matrix( out , ncol = n , byrow = FALSE ))
}
RR_neighbour_1 <- data.frame(t(normv(n, OR_dat , sds )))
colnames(RR_neighbour_1) <- c("X1", "X2", "X3")
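What I am really looking for is a way to merge this matrix with the data.frame: look at the values in the "Destination" column of df, match them against the column names of the RR_neighbour_1 matrix, and then create additional rows to hold the sampled distribution. The output should then look like this: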
Answer 0 (score: 1)
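What you actually want to do is merge two data.frames by Destination. To do that, you first need to convert the second data.frame (RR_neighbour_1) to long format, that is, the same layout as the first one, with the different destinations as rows rather than columns. After that you can simply use the merge function to combine the data.frames. merge already repeats a row of df once for every sample that shares its Destination; the argument all=T additionally keeps the rows whose Destination has no samples (anything other than X1, X2 or X3 here), filling the sampled value with NA.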
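# Reshape the wide sample table (one column per destination) to long format:
# one row per (Destination, sampled value) pair.
RR_neighbour_1 <- reshape(RR_neighbour_1, direction = "long", varying = list(1:3),
timevar = "Destination",
times = colnames(RR_neighbour_1),
v.names = "RR_neighbour_1")
# Drop the third column (the "id" column added by reshape) and join on Destination.
merge(df, RR_neighbour_1[, -3], all = TRUE)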
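To see what the all argument changes, here is a minimal sketch on toy data. The names edges, samples, inner and outer are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data, much smaller than the question's df.
edges <- data.frame(source = c("A", "B", "C"),
                    Destination = c("X1", "X1", "X9"),
                    stringsAsFactors = FALSE)
# Two sampled values for destination X1, none for X9.
samples <- data.frame(Destination = rep("X1", 2),
                      value = c(1.2, 1.7),
                      stringsAsFactors = FALSE)

# Default merge: only destinations present in both tables survive.
inner <- merge(edges, samples)
# all = TRUE: unmatched rows are kept and padded with NA.
outer <- merge(edges, samples, all = TRUE)

nrow(inner)  # 4: rows A and B, each repeated for the 2 samples
nrow(outer)  # 5: the 4 rows above plus the unmatched X9 row
```

If only the rows of df should be preserved, all.x = TRUE is the narrower option; for the question's data both give the same result, because every sampled destination also appears in df.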
Answer 1 (score: 0)
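One possibility: if you are willing to use the dplyr package, it contains SQL-style join functions. You probably want the left_join function from that package, which lets you map columns using the by argument. It is a simple way to join two table-like structures.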
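A minimal sketch of this approach, assuming dplyr is installed. The long-format table is built by hand here (the names long, samples and joined are illustrative, and the seed is arbitrary); left_join itself does not reshape anything, so the samples must already have one row per destination and draw.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Same data as in the question.
df <- data.frame(
  source = c(paste0("X", 1:19), "X110"),
  Destination = c("X3", "X5", "X17", "X20", "X20", "X1", "X2", "X3", "X7", "X10",
                  "X13", "X15", "X7", "X1", "X20", "X17", "X2", "X3", "X7", "X10"),
  weight = seq(1, 1.95, by = 0.05),
  stringsAsFactors = FALSE
)

OR_dat <- c(1.55, 1.39, 1.77)
sds <- c(0.2925175, 0.4775346, 0.1603566)
n <- 10

# n draws per destination: an n x 3 matrix, one column per destination.
samples <- sapply(seq_along(OR_dat),
                  function(i) rnorm(n, mean = OR_dat[i], sd = sds[i]))

# Long format: one row per (Destination, draw).
# as.vector() reads the matrix column by column, matching rep(..., each = n).
long <- data.frame(
  Destination = rep(c("X1", "X2", "X3"), each = n),
  RR_neighbour_1 = as.vector(samples),
  stringsAsFactors = FALSE
)

# Keep every row of df; matched rows are repeated once per draw,
# unmatched destinations get NA in RR_neighbour_1.
# Note: recent dplyr versions may warn about a many-to-many relationship here,
# because destinations repeat in both tables. That repetition is intended.
joined <- left_join(df, long, by = "Destination")

nrow(joined)  # 83: 7 matching rows x 10 draws + 13 unmatched rows
```

The result has the same rows as the base merge with all = TRUE, but left_join keeps the row order of df rather than sorting by the key.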