I'm having trouble joining three data frames. My first data frame looks like this:
id <- c('123','456','789','433','234')
article1 <- c('111', '222', '333','345','443')
article2 <- c('111', '333', '223','987','230')
article3 <- c('234', '552', '897','543','098')
article4 <- c('231', '322', '341','313','099')
article5 <- c('242', '222', '222','987','443')
df1 <- data.frame(id, article1,article2,article3,article4,article5)
df1
id article1 article2 article3 article4 article5
1 123 111 111 234 231 242
2 456 222 333 552 322 222
3 789 333 223 897 341 222
4 433 345 987 543 313 987
5 234 443 230 098 099 443
Now I have a second df that holds more information for the id column. This df can contain several rows per id. For example:
id <- c('123','123','789','433','789')
firstname <-c('Paul','Peter', 'Andi', 'Tim', 'Claire')
lastname <-c('P','D', 'A', 'T', 'C')
features <-c('AAB', 'AAC','BBD', 'CCD', 'CDC')
df2 <- data.frame(id, firstname, lastname, features)
df2
id firstname lastname features
1 123 Paul P AAB
2 123 Peter D AAC
3 789 Andi A BBD
4 433 Tim T CCD
5 789 Claire C CDC
The third data frame looks like this and provides information about the articles:
articlenumber <- c('111', '222', '333','443','345','223','234','552')
info <- c('ABC', 'CEF', 'DEF', 'FFF', 'FFD','CCF','LLK','LKO')
df3 <- data.frame(articlenumber, info)
df3
articlenumber info
1 111 ABC
2 222 CEF
3 333 DEF
4 443 FFF
5 345 FFD
6 223 CCF
7 234 LLK
8 552 LKO
The final result should look like this:
id article1 info article2 info article3 info article4 info article5 info firstname lastname features
1 123 111 ABC 111 ABC 234 LLK 333 DEF 222 CEF Paul P AAB
2 123 111 ABC 111 ABC 234 LLK 333 DEF 222 CEF Peter D AAC
3 456 222 CEF 333 DEF 552 LKO 111 ABC 222 CEF Andi A BBD
4 789 333 DEF 223 CCF 552 LKO 333 DEF 222 CEF Claire C CDK
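Sorry that my tables are not formatted properly. I hope it's clear what I want. If an id belongs to more than one person, its row should also appear more than once. I have already tried merge and join, but did not get the result I need.
Edit:
With Reduce I can merge df1 and df2: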
Reduce(function(x,y) merge(x,y,by="id",all=TRUE) ,list(df1,df2))
id article1 article2 article3 article4 article5 firstname lastname features
1 123 111 111 234 231 242 Paul P AAB
2 123 111 111 234 231 242 Peter D AAC
3 234 443 230 098 099 443 <NA> <NA> <NA>
4 433 345 987 543 313 987 Tim T CCD
5 456 222 333 552 322 222 <NA> <NA> <NA>
6 789 333 223 897 341 222 Andi A BBD
7 789 333 223 897 341 222 Claire C CDC
So how do I get the article info from df3 into this df?
Answer 0 (score: 1)
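You can use left_join from the dplyr package, like this. Note that I first redefined the data.frames with stringsAsFactors = FALSE. Otherwise (in R versions before 4.0, where data.frame() converts strings to factors by default) the key columns are factors and cannot be joined cleanly like this.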
library(dplyr)

# Rebuild the data frames with character (not factor) columns.
df1 <- data.frame(id = c('123','456','789','433','234'), article1, article2, article3, article4, article5, stringsAsFactors = FALSE)
df2 <- data.frame(id = c('123','123','789','433','789'), firstname, lastname, features, stringsAsFactors = FALSE)
df3 <- data.frame(articlenumber, info, stringsAsFactors = FALSE)

# Attach the person data first, then look up the info for each article column.
df1 %>% left_join(df2, by = "id") %>%
  left_join(df3 %>% rename(info1 = info), by = c("article1" = "articlenumber")) %>%
  left_join(df3 %>% rename(info2 = info), by = c("article2" = "articlenumber")) %>%
  left_join(df3 %>% rename(info3 = info), by = c("article3" = "articlenumber")) %>%
  left_join(df3 %>% rename(info4 = info), by = c("article4" = "articlenumber")) %>%
  left_join(df3 %>% rename(info5 = info), by = c("article5" = "articlenumber")) %>%
  select(id, article1, info1, article2, info2, article3, info3, article4, info4,
         article5, info5, everything())
id article1 info1 article2 info2 article3 info3 article4 info4 article5 info5 firstname lastname features
1 123 111 ABC 111 ABC 234 LLK 231 <NA> 242 <NA> Paul P AAB
2 123 111 ABC 111 ABC 234 LLK 231 <NA> 242 <NA> Peter D AAC
3 456 222 CEF 333 DEF 552 LKO 322 <NA> 222 CEF <NA> <NA> <NA>
4 789 333 DEF 223 CCF 897 <NA> 341 <NA> 222 CEF Andi A BBD
5 789 333 DEF 223 CCF 897 <NA> 341 <NA> 222 CEF Claire C CDC
6 433 345 FFD 987 <NA> 543 <NA> 313 <NA> 987 <NA> Tim T CCD
7 234 443 FFF 230 <NA> 098 <NA> 099 <NA> 443 FFF <NA> <NA> <NA>
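If repeating the join once per article column feels heavy, the same lookup can probably be written as a loop in base R. The following is only a sketch, not the method from the answer above: it swaps left_join for merge() plus match(), rebuilds the example data from the question so that it runs on its own, and the names result, article_cols and paired are made up for illustration. It also assumes that each articlenumber occurs only once in df3, since match() returns only the first hit.

```r
# Rebuild the example data from the question (strings kept as character).
df1 <- data.frame(
  id       = c('123', '456', '789', '433', '234'),
  article1 = c('111', '222', '333', '345', '443'),
  article2 = c('111', '333', '223', '987', '230'),
  article3 = c('234', '552', '897', '543', '098'),
  article4 = c('231', '322', '341', '313', '099'),
  article5 = c('242', '222', '222', '987', '443'),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  id        = c('123', '123', '789', '433', '789'),
  firstname = c('Paul', 'Peter', 'Andi', 'Tim', 'Claire'),
  lastname  = c('P', 'D', 'A', 'T', 'C'),
  features  = c('AAB', 'AAC', 'BBD', 'CCD', 'CDC'),
  stringsAsFactors = FALSE
)
df3 <- data.frame(
  articlenumber = c('111', '222', '333', '443', '345', '223', '234', '552'),
  info          = c('ABC', 'CEF', 'DEF', 'FFF', 'FFD', 'CCF', 'LLK', 'LKO'),
  stringsAsFactors = FALSE
)

# Left join df2 onto df1; an id with several people yields several rows.
result <- merge(df1, df2, by = "id", all.x = TRUE)

# For each article column, look up the matching info in df3 via match().
# Article numbers that are missing from df3 come back as NA.
article_cols <- paste0("article", 1:5)
for (i in seq_along(article_cols)) {
  idx <- match(result[[article_cols[i]]], df3$articlenumber)
  result[[paste0("info", i)]] <- df3$info[idx]
}

# Interleave article/info columns, followed by the person columns.
paired <- as.vector(rbind(article_cols, paste0("info", 1:5)))
result <- result[, c("id", paired, "firstname", "lastname", "features")]
result
```

Unlike left_join, merge() sorts the rows by id by default, so the row order differs from the output above; the columns and values should be the same. The loop also extends to more article columns by changing only the 1:5 range.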