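Context: using dplyr and filter, exclude the version of a (Windows) file name entry that indicates the file is in use (the one prefixed with ~$), but keep the unmodified version of that file name.
I want to keep only entries ending in "__MATCH__9999.xlsx", where 9999 can be any number of random digits.
Input (note that the first two entries refer to the same file):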
library(dplyr)  # provides %>%, filter() and a re-exported tibble()
fl <- tibble(fn = c("C:/a/b/c/~$a__01__IQ9__FQ__MATCH__4567.xlsx",
                    "C:/a/b/c/a__01__IQ9__FQ__MATCH__4567.xlsx",
                    "C:/a/b/c/a__01__IQ2__FQ__NOTMATCH__8910.xlsx"))
fl %>%
  filter(grepl("regexp", fn))
Desired result:
"C:/a/b/c/a__01__IQ9__FQ__MATCH__4567.xlsx"
Partial solution / hack: this works, but I'm not sure how to reduce these two steps to one...
> fl %>%
filter( grepl("(__MATCH__[\\d]+\\.xlsx$)",fn,perl=TRUE) ) %>%
filter( !grepl("\\$",fn,perl=TRUE) )
# A tibble: 1 x 1
fn
<chr>
1 C:/a/b/c/a__01__IQ9__FQ__MATCH__4567.xlsx
Answer 0 (score: 1)
With perl = TRUE, grepl uses the Perl-compatible (PCRE) regex engine, which supports lookaheads:
fl %>%
filter(grepl("^(?!.*/~\\$).*__MATCH__\\d+\\.xlsx$",fn, ignore.case = FALSE, perl = TRUE))
# A tibble: 1 x 1
fn
<chr>
1 C:/a/b/c/a__01__IQ9__FQ__MATCH__4567.xlsx
Breakdown:
^                        Assert beginning of input string
(?!.*/~\\$)              Shouldn't contain /~$
.*__MATCH__\\d+\\.xlsx   Match this literal text (with one or more digits after __MATCH__)
$                        That occurs at the end
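To check the breakdown against the sample paths, here is a minimal base R sketch. The `lookahead` vector uses the pattern from the answer. The `combined` vector is an alternative that is not part of the answer (the variable names are only illustrative): it joins two simpler tests with `&`, which may be easier to read if lookaheads feel opaque.

```r
# Sample paths from the question (base R only, dplyr is not needed here).
fn <- c("C:/a/b/c/~$a__01__IQ9__FQ__MATCH__4567.xlsx",
        "C:/a/b/c/a__01__IQ9__FQ__MATCH__4567.xlsx",
        "C:/a/b/c/a__01__IQ2__FQ__NOTMATCH__8910.xlsx")

# Pattern from the answer: negative lookahead plus an anchored suffix.
lookahead <- grepl("^(?!.*/~\\$).*__MATCH__\\d+\\.xlsx$", fn, perl = TRUE)

# Alternative (an assumption, not from the answer): two simpler tests.
ends_ok  <- grepl("__MATCH__\\d+\\.xlsx$", fn, perl = TRUE)  # required suffix
locked   <- grepl("/~$", fn, fixed = TRUE)                   # literal "/~$" marker
combined <- ends_ok & !locked

print(fn[lookahead])
print(identical(lookahead, combined))
```

Both approaches should keep only the second path. In the dplyr pipeline the alternative would read `filter(grepl("__MATCH__\\d+\\.xlsx$", fn, perl = TRUE), !grepl("/~$", fn, fixed = TRUE))`, since `filter()` combines multiple conditions with a logical AND.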