Extracting words of different lengths from a vector

Time: 2019-09-10 00:36:55

Tags: r

I have two types of results, distinguished by DailyMean and Peak in the file names. I want to extract the words DailyMean and Peak from those file names.

    filenames <- list.files(path = folder.out,
                            pattern = ls.extensions[[T]][type])
    # "01611500-DailyMean.out" "01611500-Peak.out"
    # "03180500-DailyMean.out" "03180500-Peak.out"

I tried substr and regexec, but could only extract a substring of fixed length:
    "Dail" "Peak" "Dail" "Peak"

The result should be:
    "DailyMean" "Peak" "DailyMean" "Peak"

1 answer:

Answer 0 (score: 2)

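We can use sub to extract everything between the hyphen and ".out". The leading `.*-` is greedy, so it consumes everything up to the last hyphen, and the capture group `(.*)` keeps whatever lies between that hyphen and the trailing ".out".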

    sub(".*-(.*)\\.out$", "\\1", x)
    #[1] "DailyMean" "Peak"      "DailyMean" "Peak"
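
Since the question mentions regexec, here is a minimal sketch of how the same capture group could be pulled out with regexec and regmatches instead of sub. The variable names `m` and `result` are only illustrative, and the pattern assumes each file name contains a single hyphen before the label.

```r
x <- c("01611500-DailyMean.out", "01611500-Peak.out",
       "03180500-DailyMean.out", "03180500-Peak.out")

# regexec() records the full match and each capture group;
# regmatches() turns those positions into strings, one vector per element of x.
m <- regmatches(x, regexec("-(.*)\\.out$", x))

# Element 1 of each vector is the full match, element 2 is the capture group.
result <- vapply(m, function(parts) parts[2], character(1))
result
# "DailyMean" "Peak" "DailyMean" "Peak"
```

Unlike substr, this does not depend on a fixed start or stop position, so labels of different lengths come out whole.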

We can also use qdapRegex::ex_between to do the same thing without writing a regular expression ourselves:

    unlist(qdapRegex::ex_between(x, "-", ".out"))
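
If installing an extra package is not an option, a similar result can be sketched with base R and the bundled tools package, again without writing a pattern by hand. This is an assumption-laden alternative rather than part of the original answer: it assumes the label is always the last hyphen-separated piece of the file name, and the names `stem`, `parts`, and `label` are illustrative.

```r
x <- c("01611500-DailyMean.out", "01611500-Peak.out",
       "03180500-DailyMean.out", "03180500-Peak.out")

# Drop the file extension, e.g. "01611500-DailyMean.out" -> "01611500-DailyMean".
stem <- tools::file_path_sans_ext(x)

# Split on a literal hyphen (fixed = TRUE disables regex interpretation).
parts <- strsplit(stem, "-", fixed = TRUE)

# Keep the last piece of each split file name.
label <- vapply(parts, function(p) p[length(p)], character(1))
label
# "DailyMean" "Peak" "DailyMean" "Peak"
```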

Data

    x <- c("01611500-DailyMean.out", "01611500-Peak.out",
           "03180500-DailyMean.out", "03180500-Peak.out")