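Task:
My goal is to find all numbered lines in the procedures of my code modules. The CodeModule.Find method can be used to check for a search term (the target parameter).
Syntax:
object.Find(target, startline, startcol, endline, endcol [, wholeword] [, matchcase] [, patternsearch])
The referenced help page https://msdn.microsoft.com/en-us/library/aa443952(v=vs.60).aspx states: Parameter patternsearch: Optional. A Boolean value specifying whether or not the target string is a regular expression pattern. If True, the target string is a regular expression pattern. False is the default.
According to that page, the Find method allows regular expression pattern searches, which I would like to use to identify numbered lines precisely: digits followed by a space or tab. The example below therefore defines the search string s and sets the last argument of the .Find method, patternsearch, to True.
Problem
AFAIK a valid regex definition could be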
s = "[0-9]{1,4}[ \t]"
but this finds nothing at all and does not even raise an error.
To get at least some results, I defined the search term
s = "[0-9]*[ \t]*)"
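which gives erratic results when the example procedure ListNumberedLines is called.
Question
Is it possible to use a valid regex pattern search in the CodeModule.Find method?
Example code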
Option Explicit
' ==============
' Example Search
' ==============
Sub ListNumberedLines()
' Declare search pattern string s
Dim S As String
10 S = "[0-9]*[ \t]*)"
20 Debug.Print "Search Term: " & S
30 Call findWordInModules(S)
End Sub
Public Sub findWordInModules(ByVal sSearchTerm As String)
' Purpose: find modules ('components') with lines containing a search term
' Method: .CodeModule.Find with last parameter patternsearch set to True
' Based on https://www.devhut.net/2016/02/24/vba-find-term-in-vba-modulescode/
' VBComponent requires reference to Microsoft Visual Basic for Applications Extensibility
' or keep it as is and use Late Binding instead
' Declare module variable oComponent
Dim oComponent As Object 'VBComponent
For Each oComponent In Application.VBE.ActiveVBProject.VBComponents
If oComponent.CodeModule.Find(sSearchTerm, 1, 1, -1, -1, False, False, True) = True Then
Debug.Print "Module: " & oComponent.Name 'Name of the current module in which the term was found (at least once)
'Need to execute a recursive listing of where it is found in the module since it could be found more than once
Call listLinesinModuleWhereFound(oComponent, sSearchTerm)
End If
Next oComponent
End Sub
Sub listLinesinModuleWhereFound(ByVal oComponent As Object, ByVal sSearchTerm As String)
' Purpose: list module lines containing a search term
' Method: .CodeModule.Find with last parameter patternsearch set to True
Dim lTotalNoLines As Long 'total number of lines within the module being examined
Dim lLineNo As Long 'will return the line no where the term is found
lLineNo = 1
With oComponent ' Module
lTotalNoLines = .CodeModule.CountOfLines
Do While .CodeModule.Find(sSearchTerm, lLineNo, 1, -1, -1, False, False, True) = True
Debug.Print vbTab & "Line " & lLineNo & "|" & _
Trim(.CodeModule.Lines(lLineNo, 1)) 'Remove any padding spaces
lLineNo = lLineNo + 1 'Restart the search at the next line, looking for the next occurrence
Loop
End With
End Sub
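Answer 0 (score: 4)
As @MatsMug said, parsing VBA with regex is not merely hard but impossible in general. Line numbers are a simpler case, though, and should be findable with regex alone.
Fortunately, line numbers can only appear inside a procedure body (including right before the End Sub/Function/Property statement), so we know they will never be on the first line of your code.
Unfortunately, a line number can be preceded by zero or more empty line continuations: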
Sub Foo()
_
_
10 Beep
End Sub
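Furthermore, a line number is not always followed by a space. It can be followed by an instruction separator (a colon), which makes the line number look like a line label: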
Sub foo()
10: Beep
End Sub
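And if your code is evil, you might encounter a negative line number (achieved by entering it in hexadecimal notation; the VBE dutifully prints it back into the code pane as a negative number with a leading space):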
Sub foo()
10 Beep
-1 Beep
End Sub
We also need to be able to recognise numbers that appear on a continued line and are not line numbers:
Sub foo()
Debug.Print _
5 & "is not a line-number"
End Sub
So, here is some evil line numbering that mixes all of these edge cases:
Option Explicit
Sub foo()
5: Beep
_
_
_
10 Beep
20 _
'Debug.Print _
30
50: Beep
40 Beep
_
-1 _
Beep 'The "-1" line number is achieved by entering "&HFFFFFFFF"
Debug.Print _
2 & "is not a line-number"
60 End Sub
And here is a regex that identifies the line numbers:
(?<! _)\n( _\n)* ?(?<line_number>(?:\-)?\d+)[: ]
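The pattern above relies on a lookbehind and a named group, and VBScript.RegExp (the regex engine VBA code usually late-binds to) supports neither. What follows is a rough adaptation rather than the pattern's original form: the lookbehind is emulated by consuming the one or two characters before the newline, and the trailing `[: ]` becomes a lookahead so that back-to-back numbered lines still match. The function name `FindLineNumbers` and the demo procedure are made up for illustration, and the sketch keeps the original pattern's assumption of at most one leading space before the number.

```vba
Option Explicit

' Hypothetical helper: returns the line numbers found in a block of module
' text, using a VBScript.RegExp-compatible rewrite of the pattern above.
Public Function FindLineNumbers(ByVal code As String) As Collection
    Dim re As Object   ' VBScript.RegExp, late bound
    Dim m As Object    ' one Match per line number

    Set FindLineNumbers = New Collection
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' (?:[^_]|[^ ]_)\n  a newline that does NOT follow a " _" continuation
    ' (?: _\n)*         any number of empty continuation lines
    '  ?(-?\d+)         optional leading space, then the (possibly negative) number
    ' (?=[: ])          which must be followed by a colon or a space
    re.Pattern = "(?:[^_]|[^ ]_)\n(?: _\n)* ?(-?\d+)(?=[: ])"

    ' Normalise CRLF to LF so \n in the pattern matches each line break once.
    For Each m In re.Execute(Replace(code, vbCrLf, vbLf))
        FindLineNumbers.Add m.SubMatches(0)
    Next m
End Function

Public Sub DemoFindLineNumbers()
    Dim code As String, n As Variant
    code = "Sub foo()" & vbCrLf & _
           "5: Beep" & vbCrLf & _
           " _" & vbCrLf & _
           "10 Beep" & vbCrLf & _
           "Debug.Print _" & vbCrLf & _
           "2 & x" & vbCrLf & _
           " -1 Beep" & vbCrLf & _
           "End Sub"
    For Each n In FindLineNumbers(code)
        Debug.Print n   ' the 2 on the continued Debug.Print line is skipped
    Next n
End Sub
```

To run it against a real module, pass in the text from CodeModule.Lines(1, CodeModule.CountOfLines).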
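Answer 1 (score: 3)
For the longest time, Rubberduck struggled to parse line numbers properly and formally. Our workaround was to strip them out (replacing them with spaces) before feeding the code module contents to the parser.
Recently we managed to define line numbers formally: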
// lineNumberLabel should actually be "statement-label" according to MS VBAL but they only allow lineNumberLabels:
// A <statement-label> that occurs as the first element of a <list-or-label> element has the effect
// as if the <statement-label> was replaced with a <goto-statement> containing the same
// <statement-label>. This <goto-statement> takes the place of <line-number-label> in
// <statement-list>.
listOrLabel :
lineNumberLabel (whiteSpace? COLON whiteSpace? sameLineStatement?)*
| (COLON whiteSpace?)? sameLineStatement (whiteSpace? COLON whiteSpace? sameLineStatement?)*
;
sameLineStatement : blockStmt;
lineNumberLabel is defined as:
//Statement labels can only appear at the start of a line.
statementLabelDefinition : {_input.La(-1) == NEWLINE}? (combinedLabels | identifierStatementLabel | standaloneLineNumberLabel);
identifierStatementLabel : unrestrictedIdentifier whiteSpace? COLON;
standaloneLineNumberLabel :
lineNumberLabel whiteSpace? COLON
| lineNumberLabel;
combinedLabels : lineNumberLabel whiteSpace identifierStatementLabel;
lineNumberLabel : numberLiteral;
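(The full Antlr4 grammar is part of the Rubberduck source.)
Note the predicate {_input.La(-1) == NEWLINE}?, which forces the parser rule to match a statementLabelDefinition only at the start of a line, more precisely at the start of a logical line of code.
You see, VBA code has physical lines of code, which are what you get from the contents of a CodeModule. But VBA code also has the concept of logical lines of code, and it turns out those are all the parser cares about.
This will trip up any typical regex: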
Sub DoSomething()
Debug.Print _
42
End Sub
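There is only one logical line of code between the signature and the End Sub token, but a simple Find would happily treat that 42 as a "line number", which it is not: it is the argument passed to Debug.Print in the same instruction, on the same logical line of code, but on the next physical line.
You cannot work with logical lines of code without first preprocessing the input to account for line continuation tokens.
In the general case, doing that means actually parsing the instructions you are looking at, at least enough to know where they start and end, and that is no small undertaking. For the narrower problem of line numbers, see ThunderFrame's answer.
The VBIDE API is extremely limited and is no help here.
TL;DR: You cannot parse VBA code in general with regex alone. For line numbers specifically, you need a much more elaborate regex pattern; see ThunderFrame's answer.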
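To make the physical/logical distinction concrete, here is a minimal sketch of the preprocessing step. It is not what Rubberduck does and it is not a parser: it only folds every continuation into one logical line and then looks for a number at the start. The names `LogicalLines`, `LeadingLineNumber` and `DemoLeadingLineNumbers` are invented for this example, and it assumes CRLF line endings and that a continuation is always written as a space plus underscore at the very end of a physical line.

```vba
Option Explicit

' Hypothetical helper: merges physical lines into logical lines by folding
' every " _" + CRLF continuation into a single space.
Public Function LogicalLines(ByVal code As String) As String()
    LogicalLines = Split(Replace(code, " _" & vbCrLf, " "), vbCrLf)
End Function

' Hypothetical helper: returns the line number that starts a logical line,
' or an empty string when the line does not start with one.
Public Function LeadingLineNumber(ByVal logicalLine As String) As String
    Static re As Object   ' VBScript.RegExp, late bound and created once
    If re Is Nothing Then
        Set re = CreateObject("VBScript.RegExp")
        ' optional whitespace, optional minus, digits, then ":" / " " / end of line
        re.Pattern = "^\s*(-?\d+)(?:[: ]|$)"
    End If
    If re.Test(logicalLine) Then
        LeadingLineNumber = re.Execute(logicalLine)(0).SubMatches(0)
    End If
End Function

Public Sub DemoLeadingLineNumbers()
    Dim code As String, ln As Variant
    code = "Sub DoSomething()" & vbCrLf & _
           "10: Beep" & vbCrLf & _
           "Debug.Print _" & vbCrLf & _
           "42" & vbCrLf & _
           " -1 Beep" & vbCrLf & _
           "End Sub"
    For Each ln In LogicalLines(code)
        ' 42 belongs to the Debug.Print logical line, so it is not reported
        If Len(LeadingLineNumber(ln)) > 0 Then Debug.Print LeadingLineNumber(ln), ln
    Next ln
End Sub
```

Because the 42 ends up in the middle of the folded Debug.Print line, an anchored pattern no longer mistakes it for a line number.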
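Answer 2 (score: 0)
Conclusion on querying CodeModule.Find with a search pattern
First, CodeModule.Find is of no real help for pattern searches, and what its patternsearch argument can actually be used for is opaque. I agree that the VBIDE API is very limited and that excellent professional tools exist, which I strongly recommend to any programmer :-)
Result: a solution via XML
Second, I prefer home remedies where possible, so I tried to find an alternative solution using only the useful parts of VBIDE.
Approach: That is why I tried a simple XML conversion of CodeModule.Lines, which allows flexible searching within logical lines. Instead of querying the XML data with a regex, I demonstrate a way to find leading digits via a well-defined XPath search (looping over a node list), which solves most of the problems shown by @ThunderFrame. The search string in the procedure showERLs is defined as line[substring(translate(.,'0123456789','¹¹¹¹¹¹¹¹¹¹'),1,1)="¹"]
In addition, the function lineNumber returns the logical line number within the module. Note: for simplicity, the search is restricted to a single module (the user-defined constant MYMODULE), and the code gets by without any regex.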
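To see the XPath trick in isolation, here is a small stand-alone sketch. translate() maps every digit to one marker character, and substring(...,1,1) then tests whether the first character of the line is that marker. The function name `LinesStartingWithDigit` and the sample XML are invented for the example; the marker is ChrW(185), matching the character used in the full code, because it is unlikely to appear at the start of a code line.

```vba
Option Explicit

' Hypothetical helper: returns the text of every <line> element whose first
' character is a digit, using only XPath 1.0 functions via MSXML 6.
Public Function LinesStartingWithDigit(ByVal xml As String) As Collection
    Dim doc As Object, node As Object
    Dim marker As String, xp As String

    Set LinesStartingWithDigit = New Collection
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    If Not doc.LoadXML(xml) Then Exit Function   ' empty result on malformed XML

    marker = ChrW(185)   ' superscript one, stands in for "any digit"
    xp = "line[substring(translate(.,'0123456789','" & String(10, marker) & _
         "'),1,1)='" & marker & "']"

    For Each node In doc.DocumentElement.SelectNodes(xp)
        LinesStartingWithDigit.Add node.Text
    Next node
End Function

Public Sub DemoLinesStartingWithDigit()
    Dim item As Variant
    For Each item In LinesStartingWithDigit( _
            "<cm><line>10 Beep</line><line>Beep</line>" & _
            "<line>5: Beep</line><line>#If DEBUG Then</line></cm>")
        Debug.Print item   ' only the two lines that start with a digit
    Next item
End Sub
```

A plain character such as "#" would be a poor marker, since conditional compilation lines like #If would then match as well.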
Solution code - main
Option Explicit
' ==========================================
' User defined name of module to be analyzed
' ==========================================
Const MYMODULE = "modThunderFrame" ' << change to existing module name or userform
' Declare xml file as object
Dim xCMods As Object ' Late Binding; instead of Early Bd: Dim xCMods As MSXML2.DOMDocument60
Public Sub TestLineNumbers()
' =================
' A. Load/refresh code into xml
' =================
' set xml into memory - contains code module(s) lines
Set xCMods = CreateObject("MSXML2.Domdocument.6.0") ' L.Bd.; instead of E.Bd: Set xCMods = New MSXML2.DOMDocument60
xCMods.async = False
xCMods.validateOnParse = False
' read in user defined code module and load xml, if failed show error message
refreshCM MYMODULE
If xCMods Is Nothing Then Exit Sub
' ======================
' B. search line numbers
' ======================
showERLs
' =============================
' C. Save xml if needed
' =============================
' xCMods.Save ThisWorkbook.Path & "\VBE(" & MYMODULE & ").xml"
' MsgBox "Successfully exported Excel data to " & ThisWorkbook.Path & "\VBE(" & MYMODULE & ").XML!", _
' vbInformation, "Module " & MYMODULE & " to xml"
' =================
' D. terminate xml
' =================
Set xCMods = Nothing
End Sub
Helper procedures
Private Sub showERLs()
' Purpose: [B.] declare XPath search string and define special translate character
Dim s As String
Dim S1 As String: S1 = Chr(185) ' superior number 1 (hex B9) replaces any digit
' declare node and node list
Dim line As Object
Dim lines As Object
' define XPath search string for first digit in line (usual case)
s = "line[substring(translate(.,'0123456789','" & String(10, S1) & "'),1,1)=""" & _
S1 & _
"""]"
' start debugging
Debug.Print "**search string=""" & s & """" & vbNewLine & String(50, "-")
Debug.Print "Line #|Line Content" & vbNewLine & String(50, "-"); ""
' set node list
Set lines = xCMods.DocumentElement.SelectNodes(s)
' -------------------
' loop thru node list
' -------------------
For Each line In lines
Debug.Print Format(lineNumber(line), "00000") & "|" & line.Text ' return logical line number plus line content
Next line
End Sub
Private Sub refreshCM(sModName As String)
' Purpose: [A.] load xml string via LoadXML method
Dim sErrTxt As String
Dim line As Object
Dim lines As Object
Dim xpe As Object
Dim s As String ' xpath expression
Dim pos As Integer ' position of line number prefix
' ======================================
' 1. Read code module lines and load xml
' ======================================
If Not xCMods.LoadXML(readCM(sModName)) Then
' set ParseError object
Set xpe = xCMods.parseError
With xpe
sErrTxt = sErrTxt & vbNewLine & String(20, "-") & vbNewLine & _
"Loading Error No " & .ErrorCode & " of xml file " & vbCrLf & _
Replace(" " & Replace(.URL, "file:///", "") & " ", " ", "[No file found]") & vbCrLf & vbCrLf & _
xpe.reason & vbCrLf & _
"Source Text: " & .srcText & vbCrLf & _
"char?: " & """" & Mid(.srcText, .linepos, 1) & """" & vbCrLf & vbCrLf & _
"Line no: " & .line & vbCrLf & _
"Line pos: " & .linepos & vbCrLf & _
"File pos.: " & .filepos & vbCrLf & vbCrLf
End With
MsgBox sErrTxt, vbExclamation, "XML Loading Error"
Set xCMods = Nothing
Exit Sub
End If
' 2. resolve hex input problem of negative line numbers with leading space (thx @Thunderframe)
s = "line"
Set lines = xCMods.DocumentElement.SelectNodes(s)
' loop thru all logical lines
For Each line In lines
pos = ErlPosInLine(line.Text)
If pos <= Len(line.Text) Then
' to do: add attribute to line node, if wanted
' correct line content
line.Text = Mid(line.Text, pos)
End If
Next
End Sub
Private Function lineNumber(node As Object) As Long
' Purpose: [B.] return logical line number within code module lines
' Param.: IXMLDomNode
' Method: XPath via preceding-sibling count plus one
Dim tag As String: tag = "line"
lineNumber = node.SelectNodes("preceding-sibling::" & tag).Length + 1
End Function
Private Function readCM(Optional modName = "*") As String
' Purpose: return code module line string (VBIDE) of a user defined module to be read into xml
' Call: called from [A.] refreshCM
' xCMods.LoadXML(readCM(sModName))
' Declare variable
Dim s As String
Dim md As Object ' VBIDE.CodeModule; late bound like the rest of the code, so no Extensibility reference is needed
If modName = "*" Then Exit Function
On Error GoTo OOPS
' get code module lines into string
Set md = Application.VBE.ActiveVBProject.VBComponents(modName).CodeModule ' MSAccess: Modules("modVBELines")
' change to xml tags
s = getTags(md.lines(1, md.CountOfLines))
' return
readCM = s
OOPS:
End Function
Private Function getTags(ByVal s As String, Optional mode = False) As String
' Purpose: prepares xml string to be loaded
' define constant
Const HEAD = "<?xml version=""1.0"" encoding=""utf-8""?>" & vbCrLf & "<cm>" & vbCrLf
' 1. escape ampersands first, so the entities added in step 2 are not escaped a second time
s = Replace(s, "&", "&amp;")
' 2. escape tag characters
s = Replace(Replace(s, "<", "&lt;"), ">", "&gt;")
' 3. change "_" points
s = Replace(s, "_" & vbCrLf, Chr(133) & vbLf)
' 4. define logical line entities
If Right(s, 2) = vbCrLf Then s = Left(s, Len(s) - 2)
s = HEAD & " <line>" & Replace(s, vbCrLf, "</line>" & vbCrLf & " <line>") & "</line>" & vbCrLf & "</cm>"
' debug xml tags if second function parameter is true (mode = True)
If mode Then Debug.Print s
' return
getTags = s
End Function
Sub testErlPosInLine()
' Purpose: Test Thunderframe's problem with ERL prefixes (underscores, " ",..) and hex inputs
Dim s As String
s = " _" & vbLf & " -1 xx"
MsgBox "|" & Mid(s, ErlPosInLine(s)) & "|" & vbNewLine & _
"prefix = |" & Mid(s, 1, ErlPosInLine(s) - 1) & "|"
End Sub
Private Function ErlPosInLine(ByVal s As String) As Integer
' Purpose: remove prefix (underscore, tab, " ",.. ) from numbered line
' cf: http://stackoverflow.com/questions/42716936/vba-to-remove-numbers-from-start-of-string-cell
Dim i As Long
For i = 1 To Len(s) ' loop each char
Select Case Mid$(s, i, 1) ' examine current char
Case " " ' permitted chars
Case "_"
Case vbLf, Chr(133), Chr(34)
Case "0" To "9": Exit For ' cut off point
Case Else: Exit For ' i is the cut off point
End Select
Next
If Mid$(s, i, 1) = "-" And Len(s) > 1 Then
If IsNumeric(Mid$(s, i + 1, 1)) Then i = i + 1
End If
' return
ErlPosInLine = i
' debug.print Mid$(s, i) '//strip lead
End Function