Question

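I'm new to R. I have a list of data frames that all share the same column names, and each data frame has one column that I want to use to filter out certain rows. Here are examples of the data frames inside the list: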

dput(df1)
df1 <- structure(list(v1 = c("A", "B", "B"),
                      t2 = c("James","[Jane] ='1' and [jane]='2'", "[john] ='1' and [john]='2' or [sly]='34'"),
                      t3 = c("James","erick", "ger'")),
                 class = "data.frame", row.names = c(NA,  -3L))

dput(df2)
df2 <- structure(list(v1 = c("B", "C", "B"),
                      t2 = c("James","[Jane]='44' or [ellen]='1' and [ellen] ='2'", "Egg"),
                      t3 = c("James","Jane", "Egg")),
                 class = "data.frame", row.names = c(NA,  -3L))
dput(df3)
df3 <- structure(list(v1 = c("d", "e", "A"),
                      t2 = c("[James] ='2' and [james]='3' or '[rady] ='44'","([rock] = '51' and  [rock] = '53') and ([roger] = '0')", "Egg"),
                      t3 = c("James","Jane", "Egg")),
                 class = "data.frame", row.names = c(NA,  -3L))

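Looking at column t2 in each data frame, some rows contain strings that follow a particular pattern. In df1, for example, we have [Jane] ='1' and [jane]='2' and also [john] ='1' and [john]='2' or [sly]='34'. I want to write a script that goes through every data frame in the list, looks at column t2, and keeps only the rows that follow this pattern. Because the column contains many other patterns as well, the script should pick only rows in which a name is repeated twice with an and between the two occurrences. In df1, for instance, [Jane] ='1' is followed by the same name again, joined by and: [Jane] ='1' and [jane]='2'.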

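In other words, I'm interested in rows of column t2 that contain a repeated name, where the two occurrences are also joined by and. The df1 rows [Jane] ='1' and [jane]='2' and [john] ='1' and [john]='2' or [sly]='34' qualify, because the names Jane and john each appear twice with and between them.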

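Note also that in df2 we might have a name repeated twice but joined by or, for example [merc] ='44' or [merc] = '2' and [lean] ='7'. I don't want such a row; I only want rows where the repeated name is joined by and.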

Expected output

dput(df1)
df1 <- structure(list(v1 = c( "B", "B"),
                      t2 = c("[Jane] ='1' and [jane]='2'", "[john] ='1' and [john]='2' or [sly]='34'"),
                      t3 = c("erick", "ger'")),
                 class = "data.frame", row.names = c(NA,  -2L))


dput(df2)
df2 <- structure(list(v1 = c( "B"),
                      t2 = c("[Jane]='44' or [ellen]='1' and [ellen] ='2'"),
                      t3 = c("Jane")),
                 class = "data.frame", row.names = c(NA,  -1L))


dput(df3)
df3 <- structure(list(v1 = c("d", "e"),
                      t2 = c("[James] ='2' and [james]='3' or '[rady] ='44'","([rock] = '51' and  [rock] = '53') and ([roger] = '0')"),
                      t3 = c("James","Jane")),
                 class = "data.frame", row.names = c(NA,  -2L))

How can I do this?
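
One possible approach, offered only as a sketch: match a bracketed name, its quoted value, the word and, and then the same bracketed name again, using a regular expression with a backreference. This assumes that names consist of word characters, that values are always single-quoted, and that Jane and jane should count as the same name (the expected output above suggests this). The names dfs, pattern and filtered are illustrative and not part of the original data.

```r
# Sample data from the question, rebuilt with data.frame() for brevity
df1 <- data.frame(
  v1 = c("A", "B", "B"),
  t2 = c("James", "[Jane] ='1' and [jane]='2'",
         "[john] ='1' and [john]='2' or [sly]='34'"),
  t3 = c("James", "erick", "ger'"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  v1 = c("B", "C", "B"),
  t2 = c("James", "[Jane]='44' or [ellen]='1' and [ellen] ='2'", "Egg"),
  t3 = c("James", "Jane", "Egg"),
  stringsAsFactors = FALSE
)
df3 <- data.frame(
  v1 = c("d", "e", "A"),
  t2 = c("[James] ='2' and [james]='3' or '[rady] ='44'",
         "([rock] = '51' and  [rock] = '53') and ([roger] = '0')", "Egg"),
  t3 = c("James", "Jane", "Egg"),
  stringsAsFactors = FALSE
)

# Hypothetical container name; the question only says "a list of data frames"
dfs <- list(df1 = df1, df2 = df2, df3 = df3)

# [name] = 'value'  and  [same name]
# \\1 is a backreference to the name captured by (\\w+)
pattern <- "\\[(\\w+)\\]\\s*=\\s*'[^']*'\\s+and\\s+\\[\\1\\]"

# Keep only the rows whose t2 matches the pattern, in every data frame
filtered <- lapply(dfs, function(d) {
  keep <- grepl(pattern, d$t2, ignore.case = TRUE, perl = TRUE)
  d[keep, , drop = FALSE]
})

filtered
```

Here perl = TRUE is used so that the backreference is compared case-insensitively together with ignore.case = TRUE. A row such as [merc] ='44' or [merc] = '2' and [lean] ='7' is not kept, because the two occurrences of merc are joined by or and the and that follows joins two different names. The retained rows keep their original row names; rownames(d) <- NULL inside the function would renumber them if that matters.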

Filter rows based on a string pattern in a list of data frames

0 answers: