Excel VBA: How do I extract URLs that contain specific text?

Date: 2018-07-19 06:47:13

Tags: vba excel-vba

I need to extract the URLs of links whose URL contains a specific path, for example /example/example1/newexample/

<a onfocus="OnLink(this)" href="/example/example1/newexample/testing.aspx">Testing</a>

My current code returns every hyperlink on the page. How can I extract only the links that contain /example/example1/newexample/?

Sub GetAllLinks()
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    url_name = Sheet1.Range("B2")
    If url_name = "" Then Exit Sub
    IE.navigate (url_name)
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE
    Set AllHyperlinks = IE.document.getElementsByTagName("A")
    Sheet1.ListBox1.Clear
    For Each Hyperlink In AllHyperlinks
        Sheet1.ListBox1.AddItem (Hyperlink)
    Next
    IE.Quit
    MsgBox "Completed"
End Sub

3 answers:

Answer 0 (score: 0)

Use:

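Sub GetAllLinks(ByVal filter As String)
  ' Late binding (CreateObject) does not load the IE constants, so define it here
  Const READYSTATE_COMPLETE As Long = 4
  Dim IE As Object
  Dim AllHyperlinks As Object
  Dim Hyperlink As Object
  Dim url_name As String
  url_name = Sheet1.Range("B2").Value
  If url_name = "" Then Exit Sub
  Set IE = CreateObject("InternetExplorer.Application")
  IE.navigate url_name
  ' Wait until the page has finished loading
  Do
    DoEvents
  Loop Until IE.readyState = READYSTATE_COMPLETE
  Set AllHyperlinks = IE.document.getElementsByTagName("A")
  Sheet1.ListBox1.Clear
  For Each Hyperlink In AllHyperlinks
    ' Keep only the links whose href contains the filter text
    If InStr(Hyperlink.href, filter) > 0 Then
      Sheet1.ListBox1.AddItem Hyperlink.href
    End If
  Next
  IE.Quit
  MsgBox "Completed"
End Sub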

The filter argument is the text each link must contain. Use it like this:

call GetAllLinks("/example/example1/newexample/")

By the way, use

Option Explicit

and declare your variables.

Answer 1 (score: 0)

See below:

For Each Hyperlink In AllHyperlinks
    If InStr(1, Hyperlink, "/example/example1/newexample/", vbTextCompare) > 0 Then
      Sheet1.ListBox1.AddItem (Hyperlink)
    End If
Next

In addition, you need to change

Loop Until IE.readyState = READYSTATE_COMPLETE

to

Loop Until IE.readyState <> READYSTATE_COMPLETE
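
Whichever comparison is used, the loop only behaves as intended if READYSTATE_COMPLETE actually has a value. With CreateObject (late binding) the constant is not defined unless the project references Microsoft Internet Controls, and without Option Explicit it is silently treated as an empty variable. A minimal sketch of a wait loop that declares the value (4) itself; WaitForPage is a hypothetical helper name, not part of the original code:

```vba
Option Explicit

' Hypothetical helper: returns once IE reports that the page has loaded.
' Assumes IE is an InternetExplorer.Application object created by the caller.
Sub WaitForPage(ByVal IE As Object)
    ' Value of READYSTATE_COMPLETE, declared locally because late
    ' binding does not bring in the IE enumeration.
    Const READYSTATE_COMPLETE As Long = 4

    ' Keep yielding to Excel while IE is busy or not yet complete
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
End Sub
```

It would be called right after IE.navigate, for example WaitForPage IE.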

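Answer 2 (score: 0)

A CSS selector makes this easier: it targets the matching links directly, so you avoid looping over every link on the page, and you then loop only over the returned nodeList.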

Dim aNodeList As Object, i As Long
' querySelectorAll returns every match; querySelector would return only the first element
Set aNodeList = IE.document.querySelectorAll("a[href^='/example/example1/newexample/']")

For i = 0 To aNodeList.Length - 1
    Debug.Print aNodeList.item(i).getAttribute("href")
Next i

The ^ means "starts with", so a[href^='/example/example1/newexample/'] looks for a elements that have an href attribute whose value starts with '/example/example1/newexample/'.

Here is the CSS selector working against your HTML sample:

[Image: CSS query]
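
Put together with the question's setup, the selector approach might look like the sketch below. It assumes the same Sheet1, cell B2 and ListBox1 as the question; GetLinksByPrefix and its prefix parameter are hypothetical names introduced for illustration. It also assumes the page renders in a document mode that supports querySelectorAll, and that the href is written in the HTML as a relative path, as in the sample.

```vba
Option Explicit

' Sketch: fill ListBox1 with the links whose href starts with the given prefix.
Sub GetLinksByPrefix(ByVal prefix As String)
    Const READYSTATE_COMPLETE As Long = 4   ' not defined under late binding
    Dim IE As Object
    Dim aNodeList As Object
    Dim url_name As String
    Dim i As Long

    url_name = Sheet1.Range("B2").Value
    If url_name = "" Then Exit Sub

    Set IE = CreateObject("InternetExplorer.Application")
    IE.navigate url_name

    ' Wait until the page has finished loading
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE

    ' Select only the anchors whose href attribute starts with the prefix
    Set aNodeList = IE.document.querySelectorAll("a[href^='" & prefix & "']")

    Sheet1.ListBox1.Clear
    For i = 0 To aNodeList.Length - 1
        Sheet1.ListBox1.AddItem aNodeList.item(i).getAttribute("href")
    Next i

    IE.Quit
End Sub
```

It could be called as GetLinksByPrefix "/example/example1/newexample/". To match the path anywhere in the href rather than only at the start, the standard CSS operator *= can be used in place of ^=.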