Question

I want to extract the file name 13572_BranchInformationReport_2012-06-28.zip from the following text:

1:30","/icons/def13572_BranchInformationReport_2012-06-28.zip","13572_BranchInformationReport_2012-06-28.zip",0,"184296","Jun 28

The regex code I am using is:

var fileNames = from Match m in Regex.Matches(pageSource, @"[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+.+(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)")
                select m.Value;

I think this should work fine.

Can anybody tell me what I'm missing?

Answer 1

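You need to escape the . in the middle of the regex, because an unescaped . matches any character. As written, .+ greedily consumes everything up to the last extension in the text, so a single match spans both file names.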

@"[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+\.+(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)"
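
To make the difference concrete, here is a minimal C# sketch that runs both patterns against the sample text from the question. The class and member names (DotEscapeDemo, Find, and so on) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class DotEscapeDemo
{
    // Sample text from the question.
    public const string PageSource =
        "1:30\",\"/icons/def13572_BranchInformationReport_2012-06-28.zip\",\"13572_BranchInformationReport_2012-06-28.zip\",0,\"184296\",\"Jun 28";

    private const string Extensions = "(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)";

    // Pattern from the question: ".+" matches any run of characters, greedily.
    public const string Unescaped =
        @"[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+.+" + Extensions;

    // Pattern from this answer: "\.+" matches only literal dots.
    public const string Escaped =
        @"[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+\.+" + Extensions;

    // Returns the text of every match of the pattern in the sample.
    public static string[] Find(string pattern) =>
        Regex.Matches(PageSource, pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        foreach (var value in Find(Unescaped))
            Console.WriteLine("unescaped: " + value);
        foreach (var value in Find(Escaped))
            Console.WriteLine("escaped:   " + value);
    }
}
```

On this sample the unescaped pattern should produce one long match that runs from the first file name through the second, while the escaped pattern should produce the file name twice, once for each place it appears in the text.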

Answer 2

Try the following regex:

[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+.+(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)(?=",")
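
The trailing (?=",") is a lookahead: it requires the extension to be followed by "," without including those characters in the match. Because the pattern itself contains double quotes, they have to be doubled inside a C# verbatim string. A minimal sketch, with a hypothetical class name (LookaheadDemo) chosen for illustration:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LookaheadDemo
{
    // In a verbatim string literal (@"..."), each double quote is written as "".
    public const string Pattern =
        @"[0-9]+_+[A-Za-z]+_+[0-9]+-+[0-9]+-+[0-9]+.+(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)(?="","")";

    // Returns the text of every match; the lookahead text is not part of the value.
    public static string[] Find(string input) =>
        Regex.Matches(input, Pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        // Sample text from the question.
        var pageSource =
            "1:30\",\"/icons/def13572_BranchInformationReport_2012-06-28.zip\",\"13572_BranchInformationReport_2012-06-28.zip\",0,\"184296\",\"Jun 28";

        foreach (var value in Find(pageSource))
            Console.WriteLine(value);
    }
}
```

In the sample text only the first occurrence of the file name (the one inside the /icons/ path) is followed by ",", so this should return a single match. Note that this relies on the exact delimiters around the name; if the surrounding text changes, the lookahead may no longer apply.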

Answer 3

You can try the following regex:

\d{5}_\w*_\d{4}-\d{2}-\d{2}\.(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)

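This will match anything that:

Starts with 5 digits
Then an underscore
Then any number of word characters (letters, digits, or underscores)
Then an underscore
Then the date part: 4 digits, a dash, 2 digits, a dash, then 2 final digits
Then a period
And finally the extension.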

PowerShell example:

$text = '1:30","/icons/def13572_BranchInformationReport_2012-06-28.zip","13572_BranchInformationReport_2012-06-28.zip",0,"184296","Jun 28'

$regex = '\d{5}_\w*_\d{4}-\d{2}-\d{2}\.(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)'

$text -match $regex

$matches[0]
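
The same pattern can also be used from C#, the language of the question. The following is a minimal sketch rather than a definitive implementation; the class and method names (FileNameExtractor, Extract) are assumptions made for illustration. Since the file name appears twice in the sample text, Distinct() is used to collapse the duplicates.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class FileNameExtractor
{
    // Pattern from this answer: 5 digits, underscore, word characters,
    // underscore, a yyyy-MM-dd style date, a literal dot, and an extension.
    public const string Pattern =
        @"\d{5}_\w*_\d{4}-\d{2}-\d{2}\.(acc|zip|app|xml|def|enr|exm|fpr|pnd|trm)";

    // Returns each distinct file name found in the input.
    public static string[] Extract(string pageSource) =>
        Regex.Matches(pageSource, Pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .Distinct()
             .ToArray();

    public static void Main()
    {
        // Sample text from the question.
        var pageSource =
            "1:30\",\"/icons/def13572_BranchInformationReport_2012-06-28.zip\",\"13572_BranchInformationReport_2012-06-28.zip\",0,\"184296\",\"Jun 28";

        foreach (var name in Extract(pageSource))
            Console.WriteLine(name);
    }
}
```

If every occurrence is needed (for example, to count how often a file is referenced), the Distinct() call can simply be dropped.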

How do I extract a file name using Regex?
