I was wondering if anyone could tell me how to extract 'http://www.nbc.com/xyz' and 'I love this show' from the following string in Excel VBA.
Thanks
<a href="http://www.nbc.com/xyz" >I love this show</a><IMG border=0 width=1 height=1 src="http://ad.linksynergy.com/fs-bin/show?id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0" >
Answer 0 (score: 5)
Sub Tester()
'### add a reference to "Microsoft HTML Object Library" ###
Dim odoc As New MSHTML.HTMLDocument
Dim el As Object
Dim txt As String
txt = "<a href=""http://www.nbc.com/xyz"" >I love this show</a>" & _
"<IMG border=0 width=1 height=1 " & _
"src=""http://ad.linksynergy.com/fs-bin/show?" & _
"id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0"" >"
odoc.body.innerHTML = txt
Set el = odoc.getElementsByTagName("a")(0)
Debug.Print el.innerText
Debug.Print el.href
End Sub
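If adding a library reference is not an option, the same idea can probably be done with late binding. This is a minimal sketch, assuming the "htmlfile" ProgID is available on the machine (it normally is on Windows). The Sub name TesterLateBound is hypothetical and not part of the original answer.

```vba
Sub TesterLateBound()
    'Late-bound variant: no reference to "Microsoft HTML Object Library" needed
    Dim odoc As Object
    Dim el As Object
    Dim txt As String

    txt = "<a href=""http://www.nbc.com/xyz"" >I love this show</a>"

    'Assumption: the "htmlfile" ProgID creates an HTML document object
    Set odoc = CreateObject("htmlfile")
    odoc.body.innerHTML = txt

    'Loop over every anchor instead of assuming there is exactly one
    For Each el In odoc.getElementsByTagName("a")
        Debug.Print el.innerText
        Debug.Print el.href
    Next el
End Sub
```

Looping over all anchors also avoids an error when the string contains no link at all.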
Answer 1 (score: 1)
One way is to use a regular expression. Another is to use Split to break the string on various delimiters, e.g.:
Option Explicit
Sub splitMethod()
Dim Str As String
Str = Sheet1.Range("A1").Value
Debug.Print Split(Str, """")(1)
Debug.Print Split(Split(Str, ">")(1), "</a")(0)
End Sub
Sub RegexMethod()
Dim Str As String
Dim oRegex As Object
Dim regexArr As Object
Dim rItem As Object
'Assumes Sheet1.Range("A1").Value holds example string
Str = Sheet1.Range("A1").Value
Set oRegex = CreateObject("vbscript.regexp")
With oRegex
.Global = True
.Pattern = "(href=""|>)(.+?)(""|</a>)"
Set regexArr = .Execute(Str)
'No lookbehind so replace unwanted chars
.Pattern = "(href=""|>|""|</a>)"
For Each rItem In regexArr
Debug.Print .Replace(rItem, vbNullString)
Next rItem
End With
End Sub
'Output:
'http://www.nbc.com/xyz
'I love this show
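The first pattern matches href=" or > at the start of each match and " or </a> at the end, with (.+?) lazily matching any characters (except \n, the newline) in between. Because VBScript regular expressions have no lookbehind, the second pattern then strips those delimiters from each match.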
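Capture groups offer a way to read both values without the second replace pass. The following is a minimal sketch, assuming the anchor has the simple form shown in the question (href as the first attribute, double-quoted). The Sub name SubMatchesMethod and the pattern are illustrative and not part of the original answer.

```vba
Sub SubMatchesMethod()
    Dim Str As String
    Dim oRegex As Object
    Dim matches As Object

    'Assumes Sheet1.Range("A1").Value holds the example string
    Str = Sheet1.Range("A1").Value

    Set oRegex = CreateObject("vbscript.regexp")
    With oRegex
        .Global = False
        .IgnoreCase = True
        'Group 1 captures the href value, group 2 the link text
        .Pattern = "<a\s+href=""([^""]*)""[^>]*>(.*?)</a>"
        Set matches = .Execute(Str)
    End With

    'Only read the groups if the pattern actually matched
    If matches.Count > 0 Then
        Debug.Print matches(0).SubMatches(0)
        Debug.Print matches(0).SubMatches(1)
    End If
End Sub
```

SubMatches is zero-based, so index 0 is the first parenthesised group. For anything beyond simple, predictable markup, the HTML document approach in the other answer is likely to be more robust than a regular expression.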