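I have two datasets, x and y. Basically, I want R to compare the first two columns of dataset x with the first two columns of dataset y and, whenever both strings of an x row are found in the first two columns of a y row, return that y record together with its associated third column.
Example x dataset: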
speciesA speciesB
species22 species11
species33 species44
species44 species44
...
Example y dataset:
speciesA speciesB dist
species11 species22 9
species33 species44 14
species55 species33 5
...
Desired output:
speciesA speciesB dist
species11 species22 9
species33 species44 14
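Answer 0 (score: 0)
# inner join on both species columns (a pair matches only if it appears in the same column order)
output <- merge(x = x, y = y, by = c('speciesA', 'speciesB'), all.x = FALSE, all.y = FALSE)
output <- output[, c('speciesA', 'speciesB', 'dist')] # column order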
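In the question's example, the pair in the first row of x appears in the opposite column order in y, so a plain merge on the two columns would not match it. A minimal sketch of one way around that, assuming the order of the two species within a pair should not matter: build order-independent keys with pmin/pmax and merge on those. The helper columns k1 and k2 are hypothetical names introduced here, not part of the original data.

```r
# Data copied from the question; stringsAsFactors = FALSE keeps the
# species names as plain character vectors on older R versions.
x <- data.frame(speciesA = c("species22", "species33", "species44"),
                speciesB = c("species11", "species44", "species44"),
                stringsAsFactors = FALSE)
y <- data.frame(speciesA = c("species11", "species33", "species55"),
                speciesB = c("species22", "species44", "species33"),
                dist = c(9, 14, 5),
                stringsAsFactors = FALSE)

# Order-independent keys (hypothetical helper columns):
# k1 is the alphabetically smaller name of each pair, k2 the larger.
x$k1 <- pmin(x$speciesA, x$speciesB)
x$k2 <- pmax(x$speciesA, x$speciesB)
y$k1 <- pmin(y$speciesA, y$speciesB)
y$k2 <- pmax(y$speciesA, y$speciesB)

# Merge only the keys from x, so y's own columns come through without suffixes.
output <- merge(unique(x[, c("k1", "k2")]), y, by = c("k1", "k2"))
output <- output[, c("speciesA", "speciesB", "dist")]
output
```

Merging only the key columns of x means the result keeps the species order as it is stored in y, which is what the desired output in the question shows.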
Answer 1 (score: 0)
The dplyr library has a nice join workflow:
library(dplyr)
x <- data.frame(speciesA = c("species11", "species33", "species44"),
                speciesB = c("species22", "species44", "species44"))
y <- data.frame(speciesA = c("species11", "species33", "species55"),
                speciesB = c("species22", "species44", "species33"),
                dist = c(9, 14, 5))
# join on both species columns; naming them explicitly avoids the "Joining with `by = ...`" message
output <- inner_join(x, y, by = c("speciesA", "speciesB"))
This produces:
> output
speciesA speciesB dist
1 species11 species22 9
2 species33 species44 14
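Since the goal is to return records from y, a filtering join may also fit. The sketch below uses dplyr's semi_join, which keeps the rows of its first argument that have a match in the second without adding any columns from it. It assumes the same data as above, with species pairs already in the same column order in both data frames; the name matched is only illustrative.

```r
library(dplyr)

# Same data as in the answer above; stringsAsFactors = FALSE keeps the
# key columns as character vectors on older R versions.
x <- data.frame(speciesA = c("species11", "species33", "species44"),
                speciesB = c("species22", "species44", "species44"),
                stringsAsFactors = FALSE)
y <- data.frame(speciesA = c("species11", "species33", "species55"),
                speciesB = c("species22", "species44", "species33"),
                dist = c(9, 14, 5),
                stringsAsFactors = FALSE)

# Keep the rows of y whose (speciesA, speciesB) pair also occurs in x.
matched <- semi_join(y, x, by = c("speciesA", "speciesB"))
matched
```

Unlike inner_join, semi_join never duplicates rows of y when x contains the same pair more than once.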
Answer 2 (score: 0)
First, here is how to create a TRULY reproducible example:
x <- data.frame(spA=c('species22','species33','species44'),
                spB=c('species11','species44','species44'),
                stringsAsFactors=FALSE)
y <- data.frame(spA=c('species11','species33','species55'),
                spB=c('species22','species44','species33'),
                dist=c(9,14,5),
                stringsAsFactors=FALSE)
x
y
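Then, define a function that pastes the two species names of a row together in alphabetical order, use it to create a new column in each data frame, and merge the two data frames on that new column.
# sort the two names so that (A, B) and (B, A) produce the same key
pasteSorted <- function(spp) {
  return(paste0(sort(spp),collapse=','))
}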
x$spp <- apply(x[,1:2],1,pasteSorted)
y$spp <- apply(y[,1:2],1,pasteSorted)
x
y
z <- merge(x,y,by='spp')
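Finally, drop the unnecessary columns and rename the remaining ones.
# after the merge the columns are spp, spA.x, spB.x, spA.y, spB.y, dist; keep the last three (the y record)
z <- z[,-(1:3)]
names(z) <- c('spA','spB','dist')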
z