Question

I was wondering if anyone could tell me how to extract 'http://www.nbc.com/xyz' and 'I love this show' from the following string in Excel VBA.

Thanks

<a href="http://www.nbc.com/xyz" >I love this show</a><IMG border=0 width=1 height=1 src="http://ad.linksynergy.com/fs-bin/show?id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0" >

Answer 1

Sub Tester()
    '### add a reference to "Microsoft HTML Object Library" ###
    Dim odoc As New MSHTML.HTMLDocument
    Dim el As Object
    Dim txt As String

    txt = "<a href=""http://www.nbc.com/xyz"" >I love this show</a>" & _
         "<IMG border=0 width=1 height=1 " & _
         "src=""http://ad.linksynergy.com/fs-bin/show?" & _
         "id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0"" >"

    odoc.body.innerHTML = txt

    Set el = odoc.getElementsByTagName("a")(0)
    Debug.Print el.innerText
    Debug.Print el.href

End Sub

Answer 2

One way is to use a regular expression. Another is to use Split to split the string on various delimiters, e.g.:

Option Explicit

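Sub splitMethod()
    'Named txt rather than Str to avoid shadowing VBA's built-in Str function
    Dim txt As String

    'Assumes Sheet1.Range("A1").Value holds example string
    txt = Sheet1.Range("A1").Value

    'The URL is the text between the first pair of double quotes
    Debug.Print Split(txt, """")(1)
    'The link text sits between the first ">" and the following "</a"
    Debug.Print Split(Split(txt, ">")(1), "</a")(0)

End Sub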

Sub RegexMethod()
    'Named txt rather than Str to avoid shadowing VBA's built-in Str function
    Dim txt As String
    Dim oRegex As Object
    Dim regexArr As Object
    Dim rItem As Object

    'Assumes Sheet1.Range("A1").Value holds example string
    txt = Sheet1.Range("A1").Value

    Set oRegex = CreateObject("vbscript.regexp")
    With oRegex
        .Global = True
        'Match the delimiters together with the text between them
        .Pattern = "(href=""|>)(.+?)(""|</a>)"
        Set regexArr = .Execute(txt)

        'VBScript regex has no lookbehind, so strip the delimiters from each match
        .Pattern = "(href=""|>|""|</a>)"
        For Each rItem In regexArr
            Debug.Print .Replace(rItem.Value, vbNullString)
        Next rItem
    End With
End Sub

'Output:
'http://www.nbc.com/xyz
'I love this show

This matches href=" or > at the start of each match and " or </a> at the end, with (.+?) matching any characters (except the newline \n) in between.
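Since the wanted text is already captured by the second group, (.+?), the second Replace pass can probably be skipped by reading that group from each match. A minimal sketch, assuming the same pattern as above; the Sub name SubMatchesMethod and the hard-coded sample string are just for illustration and are not part of the original answer:

```vba
Option Explicit

Sub SubMatchesMethod()
    Dim txt As String
    Dim oRegex As Object
    Dim m As Object

    'Sample string from the question, hard-coded so the sketch runs on its own
    txt = "<a href=""http://www.nbc.com/xyz"" >I love this show</a>" & _
          "<IMG border=0 width=1 height=1 " & _
          "src=""http://ad.linksynergy.com/fs-bin/show?" & _
          "id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0"" >"

    Set oRegex = CreateObject("VBScript.RegExp")
    With oRegex
        .Global = True
        .Pattern = "(href=""|>)(.+?)(""|</a>)"
        For Each m In .Execute(txt)
            'SubMatches is zero-based, so index 1 is the second group, (.+?)
            Debug.Print m.SubMatches(1)
        Next m
    End With
End Sub
```

This keeps a single pattern and avoids depending on a second pattern that removes exactly the right delimiters. It should print the same two lines as the output listed above.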

How do I manipulate strings in Excel VBA?
