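I would like to use R for basic database-style work with two data frames. The first data frame is a list of individuals with different features: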
data = data.frame("individual"=c("Steve","Bob","Simon","Lisa"),
"feature1"=c(1,2,2,3),
"feature2"=c(3,4,1,NA))
The second data frame holds the feature descriptions:
description = data.frame("feature"=c(1,2,3,4,NA),
"label"=c("foot","golf","curling","ski","No answer"))
My goal is to build a third data frame that contains each individual's name together with the descriptions of their features:
Steve foot curling
Bob golf ski
and so on...
Answer 0: (score: 4)
sqldf
Try the following three-way join:
library(sqldf)
# Recode missing values as the string "NA": SQL NULLs never compare equal in a join
data[is.na(data)] <- "NA"
description[is.na(description)] <- "NA"
sqldf("select d1.individual, d2.label, d3.label
from data d1
left join description d2 on d1.feature1 = d2.feature
left join description d3 on d1.feature2 = d3.feature"
)
The output is (row order is not guaranteed without an ORDER BY clause):
individual label label
1 Simon golf foot
2 Steve foot curling
3 Bob golf ski
4 Lisa curling No answer
Subscripting
This solution assumes that the two <- "NA" lines above have already been run.
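# Named lookup vector: each label is named by its feature code
labels <- with(description, setNames(label, feature))
# feature1 is still numeric, so it indexes by position;
# feature2 is now character, so it indexes by name
with(data,
data.frame(individual, labels[feature1], labels[feature2], stringsAsFactors = FALSE)
)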
This gives the output:
individual labels.feature1. labels.feature2.
3 Steve foot curling
4 Bob golf ski
1 Simon golf foot
NA Lisa curling No answer
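In the snippet above, feature1 is numeric and therefore indexes labels by position, which works only because the codes 1 to 4 happen to coincide with the row order of description. The following is a minimal sketch of a variant that looks up both columns by name and does not need the <- "NA" recoding; to_key and lookup are hypothetical names introduced here for illustration, not part of the original answer.

```r
# Sample data from the question
data <- data.frame(individual = c("Steve", "Bob", "Simon", "Lisa"),
                   feature1 = c(1, 2, 2, 3),
                   feature2 = c(3, 4, 1, NA),
                   stringsAsFactors = FALSE)
description <- data.frame(feature = c(1, 2, 3, 4, NA),
                          label = c("foot", "golf", "curling", "ski", "No answer"),
                          stringsAsFactors = FALSE)

# Hypothetical helper: turn codes into character keys, mapping NA to "NA"
to_key <- function(x) ifelse(is.na(x), "NA", as.character(x))

# Named lookup vector keyed by the feature code as a string
lookup <- setNames(description$label, to_key(description$feature))

# Index by name for every feature column; unname() keeps default row names
result <- data.frame(individual = data$individual,
                     feature1 = unname(lookup[to_key(data$feature1)]),
                     feature2 = unname(lookup[to_key(data$feature2)]),
                     stringsAsFactors = FALSE)
print(result)
```

Because every lookup goes through character names, this sketch should keep working if the feature codes are not consecutive or description is sorted differently.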
Answer 1: (score: 2)
For this task, match can be used.
cbind(data[1], as.data.frame(lapply(data[-1], function(x)
description$label[match(x, description$feature)])))
individual feature1 feature2
1 Steve foot curling
2 Bob golf ski
3 Simon golf foot
4 Lisa curling No answer
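To see why this works, it may help to trace match on a single column. The sketch below uses standalone vectors copied from the question's data; pos and looked_up are names chosen here for illustration.

```r
# Feature codes and labels from the description table
feature <- c(1, 2, 3, 4, NA)
label <- c("foot", "golf", "curling", "ski", "No answer")

# One column of codes to translate (feature2 from the question)
feature2 <- c(3, 4, 1, NA)

# match() returns, for each code, the position of its first occurrence in
# `feature`; an NA in the input is matched to the NA in `feature`
pos <- match(feature2, feature)

# Use those positions to pick out the corresponding labels
looked_up <- label[pos]
print(pos)
print(looked_up)
```

Since match pairs NA with NA, the missing value maps to "No answer" without recoding it as a string first.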
Answer 2: (score: 0)
Using plyr and reshape2:
require(reshape2)
require(plyr)
dcast(join(melt(data, id = "individual", value.name = "feature"), description),
individual ~ variable, value.var = "label")
individual feature1 feature2
1 Bob golf ski
2 Lisa curling No answer
3 Simon golf foot
4 Steve foot curling
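The one-liner above chains three steps: reshape to long format, join on the feature code, and reshape back to wide. As a rough illustration of those steps, here is a sketch that swaps in base R functions (rep, match, and reshape) for melt, join, and dcast; it is not the answerer's code, and long and wide are names chosen here.

```r
# Sample data from the question
data <- data.frame(individual = c("Steve", "Bob", "Simon", "Lisa"),
                   feature1 = c(1, 2, 2, 3),
                   feature2 = c(3, 4, 1, NA),
                   stringsAsFactors = FALSE)
description <- data.frame(feature = c(1, 2, 3, 4, NA),
                          label = c("foot", "golf", "curling", "ski", "No answer"),
                          stringsAsFactors = FALSE)

# Step 1 (the melt step): stack the feature columns into one long column
long <- data.frame(individual = rep(data$individual, times = 2),
                   variable = rep(c("feature1", "feature2"), each = nrow(data)),
                   feature = c(data$feature1, data$feature2),
                   stringsAsFactors = FALSE)

# Step 2 (the join step): look up the label of each code; match() pairs NA with NA
long$label <- description$label[match(long$feature, description$feature)]

# Step 3 (the dcast step): spread back to one row per individual
wide <- reshape(long[c("individual", "variable", "label")],
                idvar = "individual", timevar = "variable",
                direction = "wide")
print(wide)
```

Unlike the dcast output above, this sketch keeps the individuals in their original order and names the result columns label.feature1 and label.feature2.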