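I have 100,000 files (mostly office-type documents). I am using Excel VBA to check every file name for the word "list" while trying to avoid false positives (e.g. "specialist").
The answer to "Regex to match a substring, but not containing a word" (\b(?!String)\w*ring\w*\b) is almost what I need, except that my file names do not have tidy word boundaries.
My current pattern, \b(?!specialist)\w*list\w*\b, correctly ignores some variants (3 Specialist, 6-specialist, Specialists, etc.). Can the pattern be modified so that it also rejects the following variants: 1Specialist, 2_specialist and Xspecialists?
If so, could someone point me in the right direction?
Any help or advice is much appreciated, M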
Here is the recursive sub I have been using (apologies for the poor formatting):
' Requires a reference to Microsoft Scripting Runtime (for Scripting.Folder)
Sub RecursiveFolderPATTERN(objFolder As Scripting.Folder, _
                           IncludeSubfolders As Boolean)
    'Declare the variables
    Dim objFile As Object
    Dim objSubFolder As Scripting.Folder
    Dim NextRow As Long
    Dim objRegExp As Object
    Set objRegExp = CreateObject("VBScript.RegExp")
    objRegExp.Pattern = "([^A-Za-z]|^)(address|info|data)?lists?([^A-Za-z]|$)"
    objRegExp.IgnoreCase = True
    'Find the next available row
    NextRow = Cells(Rows.Count, "A").End(xlUp).Row + 1
    'Loop through each file in the folder
    For Each objFile In objFolder.Files
        'This tests the File object itself rather than its name (see the edit below)
        If objRegExp.test(objFile) Then
            Cells(NextRow, "A").Value = objFile.Name
            Cells(NextRow, "E").Value = objFile.Size
            Cells(NextRow, "F").Value = objFile.Type
            Cells(NextRow, "G").Value = objFile.DateCreated
            Cells(NextRow, "H").Value = objFile.DateLastAccessed
            Cells(NextRow, "I").Value = objFile.DateLastModified
            Cells(NextRow, "J").Value = objFile.Path
            NextRow = NextRow + 1
        End If
    Next objFile
    'Loop through files in the subfolders
    If IncludeSubfolders Then
        For Each objSubFolder In objFolder.Subfolders
            Call RecursiveFolderPATTERN(objSubFolder, True)
        Next objSubFolder
    End If
End Sub
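Edit after the answers: changing the line If objRegExp.test(objFile) Then
to If objRegExp.test(objFile.Name) Then
solved the problem.
Edit for the alternative answer: changing the pattern from "([^A-Za-z]|^)(address|info|data)?lists?([^A-Za-z]|$)"
to "(^(?!.*specialist).*list.*$)"
also works well. Both approaches have their merits, so I plan to use both.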
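A likely reason this change matters: when a Scripting File object is passed where a string is expected, it evaluates to its default property, which as far as I know is Path, so the test runs against the whole path, folder names included. Below is a minimal sketch of that effect. The path is made up for illustration, and the file name is cut out of the string by hand instead of reading a real file.

```vba
Sub ComparePathAndName()
    ' Hypothetical example: the path below is invented for illustration.
    Dim re As Object
    Dim fullPath As String, fileName As String

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    re.Pattern = "([^A-Za-z]|^)(address|info|data)?lists?([^A-Za-z]|$)"

    fullPath = "C:\Mailing List\3 Specialist.docx"
    ' Keep only the part after the last backslash (what objFile.Name would return)
    fileName = Mid$(fullPath, InStrRev(fullPath, "\") + 1)

    Debug.Print re.Test(fullPath)  ' True: the folder name "Mailing List" matches
    Debug.Print re.Test(fileName)  ' False: "3 Specialist.docx" has no standalone "list"
End Sub
```

Testing objFile.Name keeps folder names such as "Mailing List" from producing hits for every file they contain.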
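Answer 0 (score: 0)
Would something like this work for you?
([^A-Za-z]|^)list([^A-Za-z]|$)
It matches the word "list" when it is not surrounded by other letters.
Or are some words that contain "list" acceptable?
Edit: To also match the plural "lists", it can be changed to:
([^A-Za-z]|^)lists?([^A-Za-z]|$)
Edit 2: To whitelist certain prefixes, you can change it to this (whitelisting "address", "info" and "data" as prefixes for the sake of example):
([^A-Za-z]|^)(address|info|data)?lists?([^A-Za-z]|$)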
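To see why the original \b-based pattern lets names such as 1Specialist through, note that digits and the underscore are word characters, so there is no word boundary between "1" or "_" and the "S" that follows, and the lookahead is only checked at the start of the whole token. The sketch below compares the question's pattern with the whitelist pattern above. The sub name and the last three sample names are made up for illustration.

```vba
Sub CompareListPatterns()
    ' Hypothetical test harness: the first four names come from the question,
    ' the last three are invented examples.
    Dim reOld As Object, reNew As Object
    Dim samples As Variant, i As Long

    Set reOld = CreateObject("VBScript.RegExp")
    reOld.IgnoreCase = True
    reOld.Pattern = "\b(?!specialist)\w*list\w*\b"

    Set reNew = CreateObject("VBScript.RegExp")
    reNew.IgnoreCase = True
    reNew.Pattern = "([^A-Za-z]|^)(address|info|data)?lists?([^A-Za-z]|$)"

    samples = Array("3 Specialist", "1Specialist", "2_specialist", _
                    "Xspecialists", "address_list.xlsx", "InfoLists.doc", _
                    "GlobalList.doc")

    ' Columns: file name, result of the old pattern, result of the new pattern
    For i = LBound(samples) To UBound(samples)
        Debug.Print samples(i), reOld.Test(samples(i)), reNew.Test(samples(i))
    Next i
End Sub
```

The old pattern should report True for 1Specialist, 2_specialist and Xspecialists, while the new one rejects all four "specialist" variants and accepts address_list.xlsx and InfoLists.doc. The trade-off is that the new pattern also rejects GlobalList.doc, because "List" is preceded by a letter and "Global" is not in the prefix whitelist.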
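Answer 1 (score: 0)
If your goal is to find file names that match "list" but not "specialist", try the following regex:
(?i)^(?!.*specialist).*list.*$
Edit:
Remove (?i) from the pattern (VBScript.RegExp does not support inline modifiers; set IgnoreCase = True instead)
and test it with the following snippet: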
Sub RecursiveFolderPATTERN()
    Dim objRegExp As Object, arrStrings() As String, i As Long
    Set objRegExp = CreateObject("VBScript.RegExp")
    With objRegExp
        .Global = True
        .IgnoreCase = True      'replaces the unsupported (?i) modifier
        .MultiLine = False
        .Pattern = "^(?!.*specialist).*list.*$"
    End With
    'One sample file name per line
    Dim TestString As String
    TestString = "3 Specialist" & vbNewLine & _
                 "6-specialist" & vbNewLine & _
                 "Specialists" & vbNewLine & _
                 "true SpeciaList" & vbNewLine & _
                 "1 Specialist" & vbNewLine & _
                 "2_specialist" & vbNewLine & _
                 "Xspecialists" & vbNewLine & _
                 "TheListOfSpecialists.xlsx" & vbNewLine & _
                 "List" & vbNewLine & _
                 "lISTs" & vbNewLine & _
                 "Globalistics" & vbNewLine & _
                 "GlobalList.doc" & vbNewLine & _
                 "fatalistic" & vbNewLine & _
                 "The big list of PII.csv" & vbNewLine & _
                 "A few lISTs with something.xls"
    arrStrings = Split(TestString, vbNewLine)
    'Print only the names that contain "list" and do not contain "specialist"
    For i = LBound(arrStrings) To UBound(arrStrings)
        If objRegExp.Test(arrStrings(i)) Then
            Debug.Print arrStrings(i)
        End If
    Next
End Sub
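Note that the lookahead rejects any name that contains "specialist" anywhere, so TheListOfSpecialists.xlsx is skipped even though it also contains a separate "List". If such names should count as hits, one possible alternative (not taken from either answer) is to strip every "specialist" first and then look for "list" in what remains. The function and sub names below are hypothetical.

```vba
Function HasListOutsideSpecialist(ByVal fileName As String) As Boolean
    ' Hypothetical helper, not part of either answer.
    Dim stripped As String
    ' Remove every occurrence of "specialist", ignoring case
    stripped = Replace(fileName, "specialist", "", 1, -1, vbTextCompare)
    ' Any "list" that is left did not belong to "specialist"
    HasListOutsideSpecialist = (InStr(1, stripped, "list", vbTextCompare) > 0)
End Function

Sub TestHasListOutsideSpecialist()
    Debug.Print HasListOutsideSpecialist("TheListOfSpecialists.xlsx")  ' True
    Debug.Print HasListOutsideSpecialist("Xspecialists")               ' False
    Debug.Print HasListOutsideSpecialist("GlobalList.doc")             ' True
End Sub
```

This is a plain string approach rather than a regex. It still accepts words such as "fatalistic", just as the lookahead pattern does, so it only changes how names containing both words are treated.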