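I'm trying to extract a substring from a string using a PowerShell script and a regular expression.
For example, I'm trying to get the year that is part of a file name.
Example file name: "Expo.2000.Brazilian.Pavillon.after.Something.2016.SomeTextIDontNeed.jpg". The problem is that the regex gives me "2000" and no other matches, but I need "2016" to match. Sadly, $matches holds only a single match. Am I missing something? I'm going crazy ;)
If $matches contained all the instances found, I could simply take the last one: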
$Year = $matches[$matches.Count-1]
PowerShell code:
# Function to get the images year and clean up image information after it.
Function Remove-String-Behind-Year
{
param
(
[string]$OriginalFileName # Provide the BaseName of the image file.
)
[Regex]$RegExYear = [Regex]'(?<=\.)\d{4}(?=\.|$)' # Regex to match a four-digit string, preceded by a dot and followed by a dot or the end of the string.
$OriginalFileName -match $RegExYear # Matches the Original Filename with the Regex
Write-Host "Count: " $matches.Count # Why I only get 1 result?
Write-Host "BLA: " $matches[0] # First and only match is "2000"
}
Desired results:
"x.2000.y.2016.z" => "2016" (Does not work)
"x.y.2016" => "2016" (Works)
"x.y.2016.z" => "2016" (Works)
"x.y.20164.z" => "" (Works)
"x.y.201.z" => "" (Works)
Answer 0 (score: 0)
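The -match operator only ever finds (at most) one match (although several substrings of that one match can be captured with capture groups).
However, because the quantifier * is greedy (by default), we can still use that single match to find the last match in the input:
-match '^.*\.(\d{4})\b'
finds the longest prefix of the input that ends in a 4-digit sequence preceded by a literal . and followed by a word boundary, so $matches[1] then contains the last occurrence of such a 4-digit sequence.
Function Extract-Year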
{
param
(
[string] $OriginalFileName # Provide the BaseName of the image file.
)
if ($OriginalFileName -match '^.*\.(\d{4})\b') {
$matches[1] # output last 4-digit sequence found
} else {
'' # output empty string to indicate that no 4-digit sequence was found.
}
}
'x.2000.y.2016.z', 'x.y.2016', 'x.y.2016.z', 'x.y.20164.z', 'x.y.201.z' |
% { Extract-Year $_ }
yields:
2016
2016
2016
# empty line
# empty line
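If you would rather have every match available, as the question originally hoped $matches would provide, one option is the .NET [regex]::Matches() method, which returns all matches rather than just the first. The following is only a sketch: the function name Get-LastYear is made up for illustration, and it reuses the question's own pattern unchanged.

```powershell
# Hypothetical helper; the name Get-LastYear is an assumption, not from the original post.
function Get-LastYear {
    param (
        [string] $OriginalFileName  # BaseName of the image file.
    )
    # The question's pattern: four digits preceded by a dot and
    # followed by a dot or the end of the string.
    $all = [regex]::Matches($OriginalFileName, '(?<=\.)\d{4}(?=\.|$)')
    if ($all.Count -gt 0) {
        $all[$all.Count - 1].Value  # value of the last match found
    } else {
        ''                          # no 4-digit sequence found
    }
}

'x.2000.y.2016.z', 'x.y.2016', 'x.y.2016.z', 'x.y.20164.z', 'x.y.201.z' |
    ForEach-Object { Get-LastYear $_ }
```

For the five sample inputs this should print 2016 three times followed by two empty lines, the same as the greedy -match approach. The difference is that the collection also keeps the earlier matches (such as 2000) in case they are needed.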