filelist <- c(
"http://content.caiso.com/green/renewrpt/20171015_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171016_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171017_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171018_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171019_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171020_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171021_DailyRenewablesWatch.txt",
"http://content.caiso.com/green/renewrpt/20171022_DailyRenewablesWatch.txt"
)
I want to extract the string between the 5th occurrence of / and the _ that follows it.
For example, from "http://content.caiso.com/green/renewrpt/20171015_DailyRenewablesWatch.txt"
I want 20171015.
I tried
regmatches(filelist, regexpr("/{4}([^_]+)", filelist))
but it returns an empty result.
Answer 0 (score: 4)
This should work:
gsub("(?:.*/){4}([^_]+)_.*", "\\1", filelist)
# [1] "20171015" "20171016" "20171017" "20171018" "20171019" "20171020" "20171021"
# [8] "20171022"
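The repeated group also has to match whatever precedes each slash, which is why it is written as .*/ rather than / alone.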
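To see why the attempt in the question returns nothing, it may help to compare the two patterns on a single URL. This is only a minimal sketch; the names url and m are introduced here for illustration. The quantifier in /{4} asks for four consecutive slashes, which never occur in these URLs, while (?:.*/){4} lets each repetition consume the text leading up to a slash.

```r
url <- "http://content.caiso.com/green/renewrpt/20171015_DailyRenewablesWatch.txt"

# "/{4}" means four slashes in a row ("////"), so there is nothing to match
m <- regexpr("/{4}([^_]+)", url)
as.integer(m)        # -1 signals "no match"
regmatches(url, m)   # character(0)

# Each repetition of "(?:.*/)" consumes some text up to and including a slash,
# so the capture group starts right after the last slash it stops at
sub("(?:.*/){4}([^_]+)_.*", "\\1", url)
```

Note that regmatches() returns the whole match rather than the capture group, so even with a working pattern the sub() form with "\\1" is the more direct way to get only the date part.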
Answer 1 (score: 1)
Here are some approaches that use regular expressions:
sub(".*(\\d{8}).*", "\\1", filelist)
sub(".*/", "", sub("_.*", "", filelist))
sub("_.*", "", basename(filelist))
sapply(strsplit(filelist, "[/_]"), "[", 6)
gsub("\\D", "", filelist)
m <- gregexpr("\\d{8}", filelist)
unlist(regmatches(filelist, m))
strcapture("(\\d{8})", filelist, data.frame(character()))[[1]]
library(gsubfn)
strapplyc(filelist, "\\d{8}", simplify = TRUE)
These solutions do not use regular expressions at all:
substring(filelist, 41, 48)
substring(basename(filelist), 1, 8)
read.table(text = filelist, comment.char = "_", sep = "/")[[6]]
as.Date(basename(filelist), "%Y%m%d") # returns Date class object
Update: added more approaches.
Answer 2 (score: 0)
substr(x = filelist,
start = sapply(gregexpr(pattern = "/", filelist), function(x) x[5])+1,
stop = sapply(gregexpr(pattern = "_", filelist), function(x) x[1])-1)
#[1] "20171015" "20171016" "20171017" "20171018" "20171019" "20171020" "20171021"
#[8] "20171022"
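The same position-based idea could be wrapped in a small reusable function. The sketch below is a variation on the code above, not the same code: between_nth is a hypothetical helper name, and it looks for the closing delimiter only after the n-th opening delimiter, whereas the code above takes the first _ anywhere in the string. Both delimiters are treated as fixed strings, not regular expressions.

```r
# Hypothetical helper: text between the n-th `open` and the first `close` after it
between_nth <- function(x, open, close, n) {
  vapply(x, function(s) {
    if (is.na(s)) return(NA_character_)
    # Start positions of every occurrence of `open` (-1 if there are none)
    starts <- gregexpr(open, s, fixed = TRUE)[[1]]
    if (starts[1] == -1 || length(starts) < n) return(NA_character_)
    # Keep only the part of the string after the n-th `open`
    rest <- substring(s, starts[n] + nchar(open))
    end <- as.integer(regexpr(close, rest, fixed = TRUE))
    if (end == -1) return(NA_character_)
    substring(rest, 1, end - 1)
  }, character(1), USE.NAMES = FALSE)
}

urls <- c(
  "http://content.caiso.com/green/renewrpt/20171015_DailyRenewablesWatch.txt",
  "http://content.caiso.com/green/renewrpt/20171016_DailyRenewablesWatch.txt"
)
between_nth(urls, "/", "_", 5)
```

Strings with fewer than n opening delimiters, or with no closing delimiter after the n-th one, yield NA instead of an error.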
Answer 3 (score: 0)
There is a function that strips the URL path first:
filelist <- basename(filelist)
Then use str_remove from the stringr package to remove everything from the "_" onward:
library(stringr)
str_remove(filelist, "_.*")
Output:
[1] "20171015" "20171016" "20171017" "20171018" "20171019" "20171020" "20171021" "20171022"
If you want to convert the result to a date, take a look at the lubridate package.