Printing a title captured in one location at another location

Time: 2019-06-22 19:36:47

Tags: excel vba web-scraping queryselector

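I've created a VBA script to parse the title of different posts, along with the editing status of those posts, from a website. What I want now is for the script to parse the title from the landing page but print it at the same point where the editing status is printed. I don't want to create two subs for this task, and I'm not even sure this is possible in VBA. If anything is unclear, please see the comments in my script.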

Sub ImportTitleFromAnotherLocation()
    Const LINK$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Const prefix$ = "https://stackoverflow.com"
    Dim Http As New XMLHTTP60, Html As New HTMLDocument
    Dim editInfo As Object, I&, R&, targetUrl$, postTitle$

    With Http
        .Open "GET", LINK, False
        .send
        Html.body.innerHTML = .responseText
    End With

    With Html.querySelectorAll(".summary .question-hyperlink")
        For I = 0 To .Length - 1

            postTitle = .item(I).innerText 'I would like this line to be moved to the location below

            targetUrl = Replace(.item(I).getAttribute("href"), "about:", prefix)
            With Http
                .Open "GET", targetUrl, False
                .send
                Html.body.innerHTML = .responseText
            End With

            R = R + 1: Cells(R, 1) = postTitle 'this is where I wish to use the line above

            Set editInfo = Html.querySelector(".user-action-time > a")
            If Not editInfo Is Nothing Then
                Cells(R, 2) = editInfo.innerText
            End If
        Next I
    End With
End Sub

1 answer:

Answer 0 (score: 1)

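You are overwriting the HTML document inside the loop. A simple fix is to use a second HTMLDocument variable. A more verbose alternative is to store the titles before the loop, for example by filling an array in an additional loop, and then index into that array with the I variable to retrieve each title in the existing loop.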

Sub ImportTitleFromAnotherLocation()
    Const LINK$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Const prefix$ = "https://stackoverflow.com"
    Dim Http As New XMLHTTP60, Html As New HTMLDocument, Html2 As New HTMLDocument

    Dim editInfo As Object, I&, targetUrl$
    Dim postTitle As String, r As Long
    With Http
        .Open "GET", LINK, False
        .send
        Html.body.innerHTML = .responseText
    End With

    With Html.querySelectorAll(".summary .question-hyperlink")
        For I = 0 To .Length - 1
            postTitle = .item(I).innerText 'title is read from the listing page held in Html
            targetUrl = Replace$(.item(I).getAttribute("href"), "about:", prefix)

            With Http
                .Open "GET", targetUrl, False
                .send
                Html2.body.innerHTML = .responseText
            End With

            r = r + 1: ActiveSheet.Cells(r, 1) = postTitle 'Html is untouched because the post page went into Html2

            Set editInfo = Html2.querySelector(".user-action-time > a")
            If Not editInfo Is Nothing Then
                ActiveSheet.Cells(r, 2) = editInfo.innerText
            End If
        Next I
    End With
End Sub
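
The more verbose alternative mentioned in the answer, storing the titles in an array before the request loop, might look roughly like the sketch below. It is not part of the original answer. The sub name ImportTitleUsingArrays and the arrays titles and urls are hypothetical names chosen for illustration. Storing the URLs alongside the titles is an added assumption: because this version reuses the single Html document for every post page, the links have to be saved before that document is overwritten. The selectors are copied unchanged from the question and may not match the site's current markup.

```vba
' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
Option Explicit

Sub ImportTitleUsingArrays()
    Const LINK$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Const prefix$ = "https://stackoverflow.com"
    Dim Http As New XMLHTTP60, Html As New HTMLDocument
    Dim editInfo As Object
    Dim titles() As String, urls() As String   ' hypothetical names, not in the original
    Dim i As Long, r As Long

    ' Load the listing page once.
    With Http
        .Open "GET", LINK, False
        .send
        Html.body.innerHTML = .responseText
    End With

    ' First loop: copy every title and link out of the listing page
    ' before Html is reused for the individual post pages.
    With Html.querySelectorAll(".summary .question-hyperlink")
        If .Length = 0 Then Exit Sub           ' nothing matched, nothing to do
        ReDim titles(0 To .Length - 1)
        ReDim urls(0 To .Length - 1)
        For i = 0 To .Length - 1
            titles(i) = .item(i).innerText
            urls(i) = Replace$(.item(i).getAttribute("href"), "about:", prefix)
        Next i
    End With

    ' Second loop: Html can now be overwritten safely, because the
    ' titles and links live in the arrays rather than in the document.
    For i = LBound(titles) To UBound(titles)
        With Http
            .Open "GET", urls(i), False
            .send
            Html.body.innerHTML = .responseText
        End With

        r = r + 1
        ActiveSheet.Cells(r, 1) = titles(i)

        Set editInfo = Html.querySelector(".user-action-time > a")
        If Not editInfo Is Nothing Then
            ActiveSheet.Cells(r, 2) = editInfo.innerText
        End If
    Next i
End Sub
```

Compared with the second HTMLDocument variable, this version needs an extra loop and two arrays, but it keeps only one document object in memory and separates collecting the links from visiting them.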