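What I am trying to do should be easy, but as a beginner I have spent far too much time on it. With this script I am trying to filter out of a data frame every observation that contains any of the patterns listed in it.
The script is: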
df1 <- filter_at(df, vars(contains("Pair")),
any_vars(str_detect(., pattern="quinoaquinoa|lupinelupine", negate=TRUE)))
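When I run this I get no error, but nothing changes: the rows containing these expressions are not removed from the data frame. As far as I understand, it should also be possible to put ! in front of str_detect instead of using negate=TRUE, but neither approach works.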
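Note that the real data frame is larger: it has other columns besides the ones containing "Pair", and the patterns to filter out are different each time and are retrieved from another data frame.
The data frame looks like this: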
str(df)
'data.frame': 653 obs. of 6 variables:
$ Pair_1: Factor w/ 7 levels "grasscloverleycamelina",..: 3 7 7 3 3 3 7 6 6 6 ...
$ Pair_2: Factor w/ 20 levels "camelinacamelina",..: 10 6 6 8 8 10 6 8 8 10 ...
$ Pair_3: Factor w/ 20 levels "camelinacamelina",..: 19 20 20 20 19 19 20 20 20 16 ...
$ Pair_4: Factor w/ 23 levels "camelinacamelina",..: 9 8 8 8 9 9 4 1 1 5 ...
$ Pair_5: Factor w/ 20 levels "camelinacamelina",..: 9 12 16 16 13 13 12 12 11 11 ...
$ Pair_6: Factor w/ 20 levels "camelinacamelina",..: 20 13 9 17 20 20 5 7 8 8 ...
dput of the data frame:
structure(list(Pair_1 = structure(c(3L, 7L, 7L, 3L, 3L, 3L), .Label = c("grasscloverleycamelina",
"grasscloverleyquinoa", "lupinecamelina", "lupinegrasscloverley",
"lupinelupine", "lupinequinoa", "lupinespringcereal"), class = "factor"),
Pair_2 = structure(c(10L, 6L, 6L, 8L, 8L, 10L), .Label = c("camelinacamelina",
"camelinagrasscloverley", "camelinalupine", "camelinaquinoa",
"camelinaspringcereal", "grasscloverleycamelina", "grasscloverleygrasscloverley",
"grasscloverleylupine", "grasscloverleyquinoa", "grasscloverleyspringcereal",
"quinoacamelina", "quinoagrasscloverley", "quinoalupine",
"quinoaquinoa", "quinoaspringcereal", "springcerealcamelina",
"springcerealgrasscloverley", "springcereallupine", "springcerealquinoa",
"springcerealspringcereal"), class = "factor"), Pair_3 = structure(c(19L,
20L, 20L, 20L, 19L, 19L), .Label = c("camelinacamelina",
"camelinagrasscloverley", "camelinalupine", "camelinaquinoa",
"camelinaspringcereal", "grasscloverleycamelina", "grasscloverleygrasscloverley",
"grasscloverleylupine", "grasscloverleyquinoa", "grasscloverleyspringcereal",
"quinoacamelina", "quinoagrasscloverley", "quinoalupine",
"quinoaquinoa", "quinoaspringcereal", "springcerealcamelina",
"springcerealgrasscloverley", "springcereallupine", "springcerealquinoa",
"springcerealspringcereal"), class = "factor"), Pair_4 = structure(c(9L,
8L, 8L, 8L, 9L, 9L), .Label = c("camelinacamelina", "camelinagrasscloverley",
"camelinalupine", "camelinaquinoa", "camelinaspringcereal",
"grasscloverleycamelina", "grasscloverleygrasscloverley",
"grasscloverleyquinoa", "grasscloverleyspringcereal", "lupinecamelina",
"lupinegrasscloverley", "lupinelupine", "lupinequinoa", "lupinespringcereal",
"quinoacamelina", "quinoagrasscloverley", "quinoaquinoa",
"quinoaspringcereal", "springcerealcamelina", "springcerealgrasscloverley",
"springcereallupine", "springcerealquinoa", "springcerealspringcereal"
), class = "factor"), Pair_5 = structure(c(9L, 12L, 16L,
16L, 13L, 13L), .Label = c("camelinacamelina", "camelinagrasscloverley",
"camelinaquinoa", "camelinaspringcereal", "grasscloverleycamelina",
"grasscloverleygrasscloverley", "grasscloverleyquinoa", "grasscloverleyspringcereal",
"lupinecamelina", "lupinegrasscloverley", "lupinequinoa",
"lupinespringcereal", "quinoacamelina", "quinoagrasscloverley",
"quinoaquinoa", "quinoaspringcereal", "springcerealcamelina",
"springcerealgrasscloverley", "springcerealquinoa", "springcerealspringcereal"
), class = "factor"), Pair_6 = structure(c(20L, 13L, 9L,
17L, 20L, 20L), .Label = c("camelinacamelina", "camelinagrasscloverley",
"camelinaquinoa", "camelinaspringcereal", "grasscloverleycamelina",
"grasscloverleygrasscloverley", "grasscloverleyquinoa", "grasscloverleyspringcereal",
"lupinecamelina", "lupinegrasscloverley", "lupinequinoa",
"lupinespringcereal", "quinoacamelina", "quinoagrasscloverley",
"quinoaquinoa", "quinoaspringcereal", "springcerealcamelina",
"springcerealgrasscloverley", "springcerealquinoa", "springcerealspringcereal"
), class = "factor")), row.names = c(NA, 6L), class = "data.frame")
Answer 0: (score: 0)
There are no strings containing "quinoaquinoa" or "lupinelupine" in your data frame. I think the pattern you are using is incorrect. This works: filter_at(df, vars(contains("Pair")), any_vars(str_detect(., pattern = "quinoa|lupine")))
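Separately from the choice of pattern, it may help to see how any_vars() combined with negate = TRUE behaves. The sketch below uses a small made-up data frame (toy, with character columns, not the asker's data) and assumes dplyr and stringr are installed. With any_vars(), a row is kept as soon as one "Pair" column lacks the pattern, so almost nothing is dropped; with all_vars(), a row is kept only when no "Pair" column contains it.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; "other" stands in for the non-Pair columns
toy <- data.frame(
  Pair_1 = c("lupinelupine", "lupinecamelina", "lupinequinoa"),
  Pair_2 = c("camelinaquinoa", "quinoaquinoa", "camelinalupine"),
  other  = 1:3,
  stringsAsFactors = FALSE
)
pat <- "quinoaquinoa|lupinelupine"

# any_vars + negate: keep a row if AT LEAST ONE Pair column lacks the pattern
kept_any <- filter_at(toy, vars(contains("Pair")),
                      any_vars(str_detect(., pattern = pat, negate = TRUE)))

# all_vars + negate: keep a row only if NO Pair column contains the pattern
kept_all <- filter_at(toy, vars(contains("Pair")),
                      all_vars(str_detect(., pattern = pat, negate = TRUE)))

nrow(kept_any)  # all three toy rows survive
nrow(kept_all)  # only the row with no match in either Pair column survives
```

On the toy data, the first two rows each contain one of the patterns in a single column, which is why any_vars() leaves them in place.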
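Answer 1: (score: 0)
You can loop over the columns of the data frame whose names contain "Pair", check each one for the pattern, build a logical matrix from the results, and then select the rows in which the pattern does not occur.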
cols <- grep("Pair", names(df))
df[rowSums(sapply(df[cols],function(x) grepl("quinoaquinoa|lupinelupine", x)))== 0, ]
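Since the question mentions that the patterns come from another data frame, here is a minimal base R sketch of the same approach with the pattern built at run time. The names toy, lookup and its pattern column are made up for illustration, and the strings are assumed to contain no regex metacharacters.

```r
# Hypothetical toy data; "other" stands in for the non-Pair columns
toy <- data.frame(
  Pair_1 = c("lupinelupine", "lupinecamelina", "lupinequinoa"),
  Pair_2 = c("camelinaquinoa", "quinoaquinoa", "camelinalupine"),
  other  = 1:3,
  stringsAsFactors = TRUE
)

# Hypothetical lookup table holding the strings to exclude
lookup <- data.frame(pattern = c("quinoaquinoa", "lupinelupine"))

# Collapse the strings into one alternation: "quinoaquinoa|lupinelupine"
pat <- paste(lookup$pattern, collapse = "|")

cols <- grep("Pair", names(toy))

# One logical column per Pair column: TRUE where the pattern occurs
hits <- sapply(toy[cols], function(x) grepl(pat, x))

# Keep only the rows with zero hits across all Pair columns
result <- toy[rowSums(hits) == 0, ]
```

Only the third toy row is kept, because it is the only one where neither "Pair" column matches. Note that sapply() returns a matrix here because toy has more than one row; with a single-row data frame it would return a plain vector and rowSums() would fail.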