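I need to declare a string to use as a regular expression pattern.
The string is: (?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")
Normally in VBA a string used for a RegExp is declared by wrapping it in double quotes, so it would look like this: "(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")" But that causes a VBA compile error: "Expected: end of statement", with [a-zA-Z0-9.-] highlighted.
This: "(?<="""[a-zA-Z0-9.-]*\d{8}.xml(?=""")" causes the same error.
This: "(?<=""""[a-zA-Z0-9.-]*\d{8}.xml(?="""")"
works, but when I use MsgBox to view the pattern, it shows up like this:
(?<=""[a-zA-Z0-9.-]*\d{8}.xml(?="")
so it does not work properly in the RegExp.
Arghhhh!
Here is the code I am using for testing: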
Sub tester()
    Dim PATH_TO_FILINGS As String
    'PATH_TO_FILINGS = "www.sec.gov/Archives/edgar/data/1084869/000110465913082760"
    PATH_TO_FILINGS = "www.sec.gov/Archives/edgar/data/1446896/000144689612000023"
    MsgBox GetInstanceDocumentPath(PATH_TO_FILINGS)
End Sub

Function GetInstanceDocumentPath(PATH_TO_FILINGS As String)
    'this part launches IE and goes to the correct directory
    If IEbrowser Is Nothing Then
        Set IEbrowser = CreateObject("InternetExplorer.application")
        IEbrowser.Visible = False
    End If
    IEbrowser.Navigate URL:=PATH_TO_FILINGS
    While IEbrowser.Busy Or IEbrowser.readyState <> 4: DoEvents: Wend
    'this part starts the regular expression engine and searches for the reg exp pattern (i.e. the file name)
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = "(?<="[a-zA-Z0-9.-]*\d{8}.xml(?=")" '"\w+(?=-)(-)\d{8}(.xml)"
    MsgBox RE.Pattern
    RE.IgnoreCase = True
    Dim INSTANCEDOCUMENT As Object
    Set INSTANCEDOCUMENT = RE.Execute(IEbrowser.Document.body.innerhtml)
    If INSTANCEDOCUMENT.Count = 1 Then
        GetInstanceDocumentPath = PATH_TO_FILINGS & "/" & INSTANCEDOCUMENT.Item(0)
    End If
End Function
Any ideas on how to handle this are appreciated.
Answer 0 (score: 3)
Try this:
Sub Test()
    RealQ = Chr(34)
    Pattern = "(?<=" & RealQ & ")[a-zA-Z0-9.-]*\d{8}.xml(?=" & RealQ & ")"
    MsgBox Pattern
End Sub
Result: the message box displays (?<=")[a-zA-Z0-9.-]*\d{8}.xml(?=")
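As an alternative to Chr(34), a double quote inside a VBA string literal can be written by doubling it, so each literal quote in the pattern becomes exactly two quote characters. The following is only a sketch; the Sub name and variable names are made up for illustration.

```vba
Sub TestDoubledQuotes()
    Dim viaChr As String
    Dim viaDoubling As String

    ' Chr(34) is the double-quote character.
    viaChr = "(?<=" & Chr(34) & ")[a-zA-Z0-9.-]*\d{8}.xml(?=" & Chr(34) & ")"

    ' Inside a string literal, "" stands for one literal double quote.
    viaDoubling = "(?<="")[a-zA-Z0-9.-]*\d{8}.xml(?="")"

    ' Show the pattern and whether both spellings yield the same string.
    MsgBox viaDoubling & vbCrLf & "Same as Chr(34) version: " & (viaChr = viaDoubling)
End Sub
```

Three quotes in a row end the string early (one escaped quote plus a closing quote), and four produce two literal quotes, which would explain both the compile error and the doubled quotes seen in the message box.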
Also, VBA does not support lookbehind, but it does support lookahead. A better reference can be found here.
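Since lookbehind is not available, one possible workaround is to match the opening quote as ordinary text and put the file name in a capturing group, then read it from SubMatches. This is a sketch under assumptions rather than a tested fix for the original page: the Sub name, the variable names, and the sample HTML string are hypothetical, and the dot before xml is escaped here, which the original pattern does not do.

```vba
Sub TestCaptureGroup()
    Dim q As String
    q = Chr(34) ' the double-quote character

    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    ' Consume the opening quote, capture the file name in group 1,
    ' and keep the lookahead for the closing quote.
    re.Pattern = q & "([a-zA-Z0-9.-]*\d{8}\.xml)(?=" & q & ")"

    ' Hypothetical sample input standing in for the page's innerHTML.
    Dim html As String
    html = "<a href=" & q & "abc-20121231.xml" & q & ">instance</a>"

    Dim matches As Object
    Set matches = re.Execute(html)
    If matches.Count > 0 Then
        ' SubMatches(0) is the first capturing group: the file name without the quote.
        MsgBox matches.Item(0).SubMatches(0)
    End If
End Sub
```

In GetInstanceDocumentPath, the same idea would mean appending INSTANCEDOCUMENT.Item(0).SubMatches(0) rather than INSTANCEDOCUMENT.Item(0), because the full match now includes the leading quote.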