Question

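I have written a script in VBA, driving IE, to parse contact information from a webpage by applying a regex. I have searched a lot but could not find any example that meets my requirement. The pattern may not be ideal for finding the phone number, but my main concern here is how to use a pattern within VBA IE at all.

Once again: my intention is to parse the phone number 661-421-5861 from that webpage by applying a regex within VBA IE.

This is what I have tried so far: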

Sub FetchItems()
    Const URL$ = "https://www.nafe.com/bakersfield-nafe-network"
    Dim IE As New InternetExplorer, HTML As HTMLDocument
    Dim rxp As New RegExp, email As Object, Row&

    With IE
        .Visible = True
        .navigate URL
        While .Busy = True Or .readyState < 4: DoEvents: Wend
        Set HTML = .document
    End With

    With rxp
        .Pattern = "(?<=Phone:)\s*?.*?([^\s]+)"
        Set email = .Execute(HTML.body.innerText) 'I'm getting here an error
        If email.Count > 0 Then
            Row = Row + 1: Cells(Row, 1) = email.Item(0)
        End If
    End With
    IE.Quit
End Sub

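When I execute the above script, it fails with the error "Method 'Execute' of object 'IRegExp2' failed" on the line containing Set email = .Execute(HTML.body.innerText). How can I make it work?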

Answer 1

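Note that the VBA regex engine does not support lookbehinds, which is why Execute fails on a pattern that starts with (?<=Phone:). Here, you probably want to capture a digit after Phone:, followed by any number of digits and hyphens.

You need to redefine the pattern as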

rxp.Pattern = "Phone:\s*(\d[-\d]+)"

Then, you need to grab the first match and access its .SubMatches(0):

Set email = .Execute(HTML.body.innerText)
If email.Count > 0 Then
    Cells(Row + 1, 1) = email.Item(0).SubMatches(0)
End If
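To try the rewritten pattern without launching IE, a minimal sketch along these lines may help. The sample text is an assumption standing in for HTML.body.innerText, and DemoPhonePattern is a made-up name; only the pattern and the phone number come from the thread.

```vb
Sub DemoPhonePattern()
    ' Late binding, so no reference to
    ' "Microsoft VBScript Regular Expressions 5.5" is required.
    Dim rxp As Object, matches As Object
    Dim sampleText As String

    ' Assumed sample text; the real page text will look different.
    sampleText = "Bakersfield NAFE Network" & vbCrLf & "Phone: 661-421-5861"

    Set rxp = CreateObject("VBScript.RegExp")
    rxp.Pattern = "Phone:\s*(\d[-\d]+)"

    Set matches = rxp.Execute(sampleText)
    If matches.Count > 0 Then
        Debug.Print matches.Item(0).Value          ' whole match, label included
        Debug.Print matches.Item(0).SubMatches(0)  ' capture group 1: the number only
    End If
End Sub
```

Running it from the VBA editor prints both values to the Immediate window, which makes the difference between the full match and .SubMatches(0) easy to see.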

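See the regex in action. What .SubMatches(0) holds is the part of the string highlighted in green in that demo.

Pattern details

Phone: - a literal substring
\s* - zero or more whitespace characters
(\d[-\d]+) - Capturing group 1: a digit followed by one or more digits and/or hyphens (because of the +; replace it with * to allow zero or more).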
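One way to check up front whether the engine accepts a pattern is to use it once with error trapping. The sketch below assumes that an unsupported pattern raises a run-time error when it is first used, as the error in the question suggests; PatternCompiles and ComparePatterns are hypothetical names, not part of any library.

```vb
' Hypothetical helper: True if VBScript.RegExp accepts the pattern.
Function PatternCompiles(ByVal pat As String) As Boolean
    Dim rxp As Object
    Set rxp = CreateObject("VBScript.RegExp")

    On Error Resume Next
    rxp.Pattern = pat
    rxp.Test ""                         ' using the pattern surfaces syntax errors
    PatternCompiles = (Err.Number = 0)  ' read Err before it is reset below
    On Error GoTo 0
End Function

Sub ComparePatterns()
    ' Lookbehind version from the question
    Debug.Print PatternCompiles("(?<=Phone:)\s*?.*?([^\s]+)")
    ' Rewritten version without lookbehind
    Debug.Print PatternCompiles("Phone:\s*(\d[-\d]+)")
End Sub
```

The first call should report that the lookbehind pattern is rejected, while the rewritten pattern should be accepted.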

Answer 2

Here is a faster approach that uses the xmlhttp object instead of IE:

Sub FetchItems()
    Dim URL As String, strBody As String
    Dim intS As Long, intE As Long

    URL = "https://www.nafe.com/bakersfield-nafe-network"

    ' Fetch the page synchronously, without a browser
    Dim xml As Object
    Set xml = CreateObject("MSXML2.XMLHTTP")
    xml.Open "GET", URL, False
    xml.send

    ' Load the response into an in-memory HTML document
    Dim html As Object
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = xml.responseText

    strBody = html.body.innerHTML

    ' Take everything after "Phone:" up to the next tag
    intS = InStr(1, strBody, "Phone:", vbTextCompare) + Len("Phone:")
    intE = InStr(intS, strBody, "<", vbTextCompare)

    MsgBox Mid(strBody, intS, intE - intS)
End Sub
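The InStr/Mid step above assumes that "Phone:" is present in the markup. InStr returns 0 when it is not, so the start offset would silently become Len("Phone:"). A guarded version of that step could be sketched as follows; TextBetween is a hypothetical helper, and the sample markup is an assumption, since the real page's HTML may differ.

```vb
' Hypothetical helper: text after startMarker up to the next endMarker.
Function TextBetween(ByVal src As String, ByVal startMarker As String, _
                     ByVal endMarker As String) As String
    Dim posStart As Long, posEnd As Long

    posStart = InStr(1, src, startMarker, vbTextCompare)
    If posStart = 0 Then Exit Function            ' marker missing: return ""
    posStart = posStart + Len(startMarker)

    posEnd = InStr(posStart, src, endMarker, vbTextCompare)
    If posEnd = 0 Then posEnd = Len(src) + 1      ' no delimiter: take the rest

    TextBetween = Trim$(Mid$(src, posStart, posEnd - posStart))
End Function

Sub DemoTextBetween()
    ' Assumed markup for illustration only.
    Debug.Print TextBetween("<p>Phone: 661-421-5861</p>", "Phone:", "<")
End Sub
```

In FetchItems, the same helper could be called as TextBetween(strBody, "Phone:", "<"), with an empty result treated as "not found".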
