I have two data frames: DF1 has 3 columns and DF2 has one column. DF1 contains all the elements of DF2, but most of them are repeated, as shown below.
DF1=
freetext                   specific          ICDcode
Jaundice,hepatitisA,B,C    Hepatitis A       B15
Jaundice,hepatitisA,B,C    Hepatitis B       B16
Jaundice,hepatitisA,B,C    Hepatitis C       B17.1
Jaundice,hepatitisA,B,C    Jaundice          R17
lobar Pneumonia            Lobar pneumonia   J18.1
Lobar Pneumonia ,scabies   Lobar pneumonia   J18.1
scabiess                   scabies           G10
DF2=
Jaundice,hepatitisA,B,C
scabiess
Lobar Pneumonia ,scabies
lobar Pneumonia
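I want to match the two data frames so that, whenever a match occurs, the resulting data frame takes the form of DF1. For example, "Jaundice,hepatitisA,B,C" should appear 4 times rather than once in the column. In other words, duplicates should be retained, as shown below.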
Resultant data frame should appear like this.
column1                    column2           column3
Jaundice,hepatitisA,B,C    Hepatitis A       B15
Jaundice,hepatitisA,B,C    Hepatitis B       B16
Jaundice,hepatitisA,B,C    Hepatitis C       B17.1
Jaundice,hepatitisA,B,C    Jaundice          R17
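So, how should I loop over DF2 to find matches in DF1 (first column) and then produce a data frame containing all the corresponding matching rows, as shown above?
Here is my script, but it does not seem to produce the result I want: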
newMatches <- data.frame()
for (i in 1:nrow(DF1)) { for (j in 1:nrow(DF2)) { newMatches <- grep(j, i, ignore.case = FALSE, value = TRUE) } }
# it doesn't produce the other columns of DF1
Any help and suggestions would be much appreciated. I am somewhat new to R.

Answer 0 (score: 0)
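As far as I understand, you want to filter the rows of DF1 and keep only those whose first column also appears in the first column of DF2. Is that right? The simplest way to achieve this is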
DF1[DF1[, 1] %in% DF2[, 1], ]
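To see why this keeps duplicates, here is a minimal sketch on made-up data. The names `toy1`, `toy2`, `keep`, and `kept` are hypothetical and used only for illustration. `%in%` returns one TRUE/FALSE per row of the first data frame, so every row whose key occurs in the second data frame is retained along with all of its other columns. Note that the comparison is exact and case-sensitive.

```r
# Hypothetical toy data (not the asker's tables)
toy1 <- data.frame(
  freetext = c("a", "a", "b", "c"),
  specific = c("a1", "a2", "b1", "c1"),
  stringsAsFactors = FALSE
)
toy2 <- data.frame(freetext = c("a", "c"), stringsAsFactors = FALSE)

# One logical value per row of toy1: is its key present in toy2?
keep <- toy1[, 1] %in% toy2[, 1]

# Row subsetting with the logical vector keeps all columns,
# and both "a" rows survive because each is tested separately.
kept <- toy1[keep, ]
print(kept)
```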
Edit
Here is the complete code to reproduce the example:
DF1 <- structure(list(
freetext = structure(c(1L, 1L, 1L, 1L, 2L, 3L, 4L),
.Label = c("Jaundice,hepatitisA,B,C", "lobar Pneumonia",
"Lobar Pneumonia ,scabies", "scabiess"), class = "factor"),
specific = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L),
.Label = c("Hepatitis A", "Hepatitis B", "Hepatitis C", "Jaundice",
"Lobar pneumonia", "scabies"), class = "factor"),
ICDcode = structure(c(1L, 2L, 3L, 6L, 5L, 5L, 4L),
.Label = c("B15", "B16", "B17.1", "G10", "J18.1", "R17"),
class = "factor")),
.Names = c("freetext", "specific", "ICDcode"),
row.names = c(NA, -7L), class = "data.frame")
DF2 <- structure(list(
freetext = structure(c(1L, 4L, 3L, 2L),
.Label = c("Jaundice,hepatitisA,B,C",
"lobar Pneumonia", "Lobar Pneumonia ,scabies", "scabiess"),
class = "factor")),
.Names = "freetext", row.names = c(NA, -4L), class = "data.frame")
result <- DF1[DF1[, 1] %in% DF2[, 1], ]
Printing result gives the following output:
freetext specific ICDcode
1 Jaundice,hepatitisA,B,C Hepatitis A B15
2 Jaundice,hepatitisA,B,C Hepatitis B B16
3 Jaundice,hepatitisA,B,C Hepatitis C B17.1
4 Jaundice,hepatitisA,B,C Jaundice R17
5 lobar Pneumonia Lobar pneumonia J18.1
6 Lobar Pneumonia ,scabies Lobar pneumonia J18.1
7 scabiess scabies G10
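All seven rows are returned here because every freetext value in DF1 also occurs in DF2. As a possible alternative, the same kind of match could be expressed as a join with base R's `merge()`. The sketch below uses made-up data, and `toy1`, `toy2`, `by_in`, and `by_merge` are hypothetical names. The two approaches differ when the lookup table itself contains repeated keys: `%in%` returns each matching row once, while `merge()` returns one row per matching pair.

```r
# Hypothetical toy data; toy2 deliberately repeats its key
toy1 <- data.frame(
  freetext = c("a", "a", "b"),
  code     = c("X1", "X2", "Y1"),
  stringsAsFactors = FALSE
)
toy2 <- data.frame(freetext = c("a", "a"), stringsAsFactors = FALSE)

# Filtering: each row of toy1 is kept at most once
by_in <- toy1[toy1$freetext %in% toy2$freetext, ]

# Inner join: every toy2 row is paired with every matching toy1 row
by_merge <- merge(toy2, toy1, by = "freetext")

nrow(by_in)     # 2
nrow(by_merge)  # 4
```

If DF2 is guaranteed to hold unique values, both approaches should select the same rows, although `merge()` sorts the result by the key column by default.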