I have many files whose names end with a version number. For example:
Xxxxx V2.txt
Xxxxx V2.3.txt
Xxxxx V2.10.txt
Xxxxx V2.10.3.txt
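I use Regex to extract the parts of the version number so that I can sort the files correctly† and then work out the next version number‡.
† For example: V2.2 comes before V2.10, and V2.2 comes before V2.2.3.
‡ For example: the next version after V2.9 is V2.10.
I can handle each style of version number separately, but I cannot create a single Regex pattern that handles every style.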
Text               Pattern                           Value(s) extracted
Xxxxx V2.txt       Xxxxx V(\d+)\.txt                 2
Xxxxx V2.3.txt     Xxxxx V(\d+)\.(\d+)\.txt          2 3
Xxxxx V2.10.3.txt  Xxxxx V(\d+)\.(\d+)\.(\d+)\.txt   2 10 3
Xxxxx V2.10.3.txt  Xxxxx V(\d+){\.(\d+)}*\.txt       No match
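I do not understand why the last pattern does not work for every style of version number. Any guidance is appreciated.
New section in response to comments
I had hoped that there was a simple error in my Regex pattern and that my code was irrelevant. I have tidied my test code to create: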
Sub CtrlTestCapture()
    Dim Patterns As Variant
    Dim Texts As Variant
    Texts = Array("Xxxxx V12.txt", _
                  "Xxxxx V12.3.txt", _
                  "Xxxxx V12.4.5.txt", _
                  "Xxxxx V12.4.5.3.txt")
    Patterns = Array("Xxxxx V(\d+)\.txt", _
                     "Xxxxx V(\d+)\.(\d+)\.txt", _
                     "Xxxxx V(\d+)\.(\d+)\.(\d+)\.txt", _
                     "Xxxxx V(\d+){\.(\d+)}+\.txt", _
                     "Xxxxx V(\d+)(?:\.(\d+))?(?:\.(\d+))?\.txt", _
                     "Xxxxx V(\d+)(\.(\d+))*\.txt")
    Call TestCapture(Patterns, Texts)
End Sub
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Sub TestCapture(ByRef Patterns As Variant, ByRef Texts As Variant)
    Dim InxM As Long
    Dim InxS As Long
    Dim Matches As MatchCollection
    Dim PatternCrnt As Variant
    Dim RegEx As New RegExp
    Dim SubMatchCrnt As Variant
    Dim TextCrnt As Variant
    With RegEx
        .Global = True ' Find all matches
        .MultiLine = False ' Match cannot extend across linebreak
        .IgnoreCase = True
        For Each PatternCrnt In Patterns
            .Pattern = PatternCrnt
            For Each TextCrnt In Texts
                Debug.Print "==========================================="
                Debug.Print " Pattern: """ & PatternCrnt & """"
                Debug.Print " Text: """ & TextCrnt & """"
                If Not .Test(TextCrnt) Then
                    Debug.Print Space(12) & "Text does not match pattern"
                Else
                    Set Matches = .Execute(TextCrnt)
                    If Matches.Count = 0 Then
                        Debug.Print Space(12) & "Match but no captures"
                    Else
                        For InxM = 0 To Matches.Count - 1
                            Debug.Print "-------------------------------------------"
                            With Matches(InxM)
                                Debug.Print " Match: " & InxM + 1
                                Debug.Print " Value: """ & .Value & """"
                                Debug.Print " Length: " & .Length
                                Debug.Print "FirstIndex: " & .FirstIndex
                                For InxS = 0 To .SubMatches.Count - 1
                                    Debug.Print " SubMatch: " & InxS + 1 & " """ & .SubMatches(InxS) & """"
                                Next
                            End With
                        Next
                    End If
                End If
            Next
        Next
        Debug.Print "==========================================="
    End With
End Sub
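With this code, Wiktor Stribiżew's regex pattern gives better results than it did with my untidy code. I will have to review my original code to find the error. With this code, the output for Wiktor Stribiżew's regex pattern is: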
===========================================
Pattern: "Xxxxx V(\d+)(?:\.(\d+))?(?:\.(\d+))?\.txt"
Text: "Xxxxx V12.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.txt"
Length: 13
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 ""
SubMatch: 3 ""
===========================================
Pattern: "Xxxxx V(\d+)(?:\.(\d+))?(?:\.(\d+))?\.txt"
Text: "Xxxxx V12.3.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.3.txt"
Length: 15
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 "3"
SubMatch: 3 ""
===========================================
Pattern: "Xxxxx V(\d+)(?:\.(\d+))?(?:\.(\d+))?\.txt"
Text: "Xxxxx V12.4.5.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.4.5.txt"
Length: 17
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 "4"
SubMatch: 3 "5"
===========================================
Pattern: "Xxxxx V(\d+)(?:\.(\d+))?(?:\.(\d+))?\.txt"
Text: "Xxxxx V12.4.5.3.txt"
Text does not match pattern
===========================================
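This has a fixed number of captures rather than the variable number I was attempting. I will also have to work out how to extend it to handle "12.4.5.3", which is the most complicated style of version number I have seen. It is not a perfect approach, but it is definitely an improvement on my current workaround. You are using regex characters I do not recognise, so I will need to study them carefully.
With the code above, Tiw's regex pattern produced the following output: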
===========================================
Pattern: "Xxxxx V(\d+)(\.(\d+))*\.txt"
Text: "Xxxxx V12.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.txt"
Length: 13
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 ""
SubMatch: 3 ""
===========================================
Pattern: "Xxxxx V(\d+)(\.(\d+))*\.txt"
Text: "Xxxxx V12.3.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.3.txt"
Length: 15
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 ".3"
SubMatch: 3 "3"
===========================================
Pattern: "Xxxxx V(\d+)(\.(\d+))*\.txt"
Text: "Xxxxx V12.4.5.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.4.5.txt"
Length: 17
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 ".5"
SubMatch: 3 "5"
===========================================
Pattern: "Xxxxx V(\d+)(\.(\d+))*\.txt"
Text: "Xxxxx V12.4.5.3.txt"
-------------------------------------------
Match: 1
Value: "Xxxxx V12.4.5.3.txt"
Length: 19
FirstIndex: 0
SubMatch: 1 "12"
SubMatch: 2 ".3"
SubMatch: 3 "3"
===========================================
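That is, it always appears to capture: the first part, the last part including its dot, and the last part without its dot. Promising, but not quite enough.
Part 3
I overlooked the request to state clearly the result I am seeking.
I use version numbers on all my important files. I receive files from other people that include version numbers, some of which are much more complicated than mine. I always make the version number the last part of the file name, and it is always preceded by "V". If a file I receive does not conform to my format, I rename it so that it does. So I have files such as:
I want to extract the Ns into a variable-length array or collection so that I can process them with generic routines. In fact, I already have those generic routines. They rely on some messy VBA code that extracts the Ns. I thought that using Regex would let me tidy that code.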
Answer 0 (score: 3)
Try this regex:
V(\d+(?:\.\d+)*)\.txt$
The required version is captured in group 1. You can use Split to get each number in it.
Code:
Dim objReg, strFile, objMatches, strVersion, arrVersion
strFile = "Xxxxx V2.3.txt"
Set objReg = New RegExp
objReg.Global = True
objReg.Multiline = True
objReg.Pattern = "V(\d+(?:\.\d+)*)\.txt$"
If objReg.Test(strFile) Then
    Set objMatches = objReg.Execute(strFile)
    strVersion = objMatches.item(0).submatches.item(0) 'To get the full version number
    arrVersion = Split(strVersion,".") 'To get each number in the version(stored in array)
End If
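The idea above (capture the whole version in group 1, then split it) can be wrapped in a reusable function. The following is only a minimal sketch: the names GetVersionParts and DemoGetVersionParts are hypothetical, IgnoreCase is set to True as an assumption carried over from the question's test code, and late binding through CreateObject is used so that no project reference is needed.

```vba
' Hypothetical helper (not part of the original answer): returns the version
' components of a file name as a zero-based array of Longs, or Empty if the
' name does not end in "V<version>.txt".
Function GetVersionParts(ByVal FileName As String) As Variant
    Dim Re As Object
    Dim Matches As Object
    Dim Pieces() As String
    Dim Parts() As Long
    Dim Inx As Long

    ' Late binding, so no reference to the RegExp library is required
    Set Re = CreateObject("VBScript.RegExp")
    Re.IgnoreCase = True    ' Assumption: "v" and ".TXT" should also match
    Re.Pattern = "V(\d+(?:\.\d+)*)\.txt$"

    Set Matches = Re.Execute(FileName)
    If Matches.Count = 0 Then
        GetVersionParts = Empty
        Exit Function
    End If

    ' Group 1 holds the whole version, for example "2.10.3"
    Pieces = Split(Matches.Item(0).SubMatches.Item(0), ".")
    ReDim Parts(0 To UBound(Pieces))
    For Inx = 0 To UBound(Pieces)
        Parts(Inx) = CLng(Pieces(Inx))
    Next
    GetVersionParts = Parts
End Function

Sub DemoGetVersionParts()
    Dim Parts As Variant
    Dim Inx As Long

    Parts = GetVersionParts("Xxxxx V2.10.3.txt")
    If IsEmpty(Parts) Then
        Debug.Print "No version found"
    Else
        ' Print each component on its own line
        For Inx = LBound(Parts) To UBound(Parts)
            Debug.Print Parts(Inx)
        Next
    End If
End Sub
```

Because the components come back as Longs in a variable-length array, 10 compares greater than 9, and any number of components is handled without changing the pattern.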
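Regex explanation
V(\d+(?:\.\d+)*)\.txt$
V - matches V literally.
(\d+(?:\.\d+)*) - matches 1 or more digits. After matching as many digits as possible, it matches 0 or more occurrences of a dot followed by 1 or more digits. The whole match is captured in group 1 and is the version number you want.
\.txt - matches .txt
$ - asserts the end of the line.

Answer 1 (score: 1)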
If you are open to it, here is a non-regex solution. You can convert the version number to a number and then sort on it.
Sub GetOrderedList()
    Dim Texts As Variant
    Dim FileName As String
    Dim FileArrayList As Object
    Dim Item As Variant
    Dim i As Long
    Dim PeriodPosition As Long
    Set FileArrayList = CreateObject("System.Collections.ArrayList")
    Texts = Array("Xxxxx V12.txt", _
                  "Xxxxx V12.3.txt", _
                  "Xxxxx V12.4.5.txt", _
                  "Xxxxx V12.4.5.3.txt")
    For i = LBound(Texts) To UBound(Texts)
        'You could use the FileSystemObject to make this a bit easier
        'Take the last space-separated token, then strip the "V" and ".txt"
        FileName = Replace(Replace(Split(Texts(i), " ")(UBound(Split(Texts(i), " "))), "V", ""), ".txt", "")
        PeriodPosition = InStr(1, FileName, ".")
        'Keep the first dot as a decimal point and replace any later dots with "0"
        'Note: FileName is still a String, so the sort below is a text comparison
        If PeriodPosition > 0 Then FileName = Left$(FileName, PeriodPosition) & Replace(FileName, ".", "0", PeriodPosition + 1)
        FileArrayList.Add FileName
    Next
    'Sort
    FileArrayList.Sort
    'Print out, ascending order
    For Each Item In FileArrayList
        Debug.Print Item
    Next
End Sub
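As a possible alternative to sorting on a converted value, the version strings can be compared component by component as numbers, which gives the ordering the question asks for (V2.2 before V2.10, V2.2 before V2.2.3). This is a sketch only: CompareVersions, NextVersion and DemoCompareVersions are hypothetical names, the inputs are assumed to be well-formed dotted strings such as "2.10.3" with the "V" and ".txt" already removed, and "next version" is assumed to mean incrementing the last component.

```vba
' Hypothetical helper: compares two dotted version strings.
' Returns -1 if A sorts before B, 1 if A sorts after B, 0 if they are equal.
Function CompareVersions(ByVal A As String, ByVal B As String) As Long
    Dim PartsA() As String
    Dim PartsB() As String
    Dim Inx As Long
    Dim ValA As Long
    Dim ValB As Long

    PartsA = Split(A, ".")
    PartsB = Split(B, ".")

    For Inx = 0 To UBound(PartsA)
        ' B ran out of components first, so B is a prefix of A and sorts first
        If Inx > UBound(PartsB) Then
            CompareVersions = 1
            Exit Function
        End If
        ' Compare as numbers so that 10 sorts after 9
        ValA = CLng(PartsA(Inx))
        ValB = CLng(PartsB(Inx))
        If ValA <> ValB Then
            CompareVersions = IIf(ValA < ValB, -1, 1)
            Exit Function
        End If
    Next

    ' Every component of A matched; A sorts first if B has more components
    If UBound(PartsB) > UBound(PartsA) Then CompareVersions = -1
End Function

' Hypothetical helper: increments the last component, e.g. "2.9" -> "2.10".
Function NextVersion(ByVal Version As String) As String
    Dim Parts() As String
    Parts = Split(Version, ".")
    Parts(UBound(Parts)) = CStr(CLng(Parts(UBound(Parts))) + 1)
    NextVersion = Join(Parts, ".")
End Function

Sub DemoCompareVersions()
    Debug.Print CompareVersions("2.2", "2.10")    ' -1: V2.2 is before V2.10
    Debug.Print CompareVersions("2.2", "2.2.3")   ' -1: V2.2 is before V2.2.3
    Debug.Print NextVersion("2.9")                ' 2.10
End Sub
```

A comparison function like this could be plugged into any hand-written sort routine; it avoids relying on how the converted strings happen to order as text.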