我正在尝试从URL列表中抓取Facebook视频标题。
我的宏适用于单个视频,其中URL已内置到代码中。我希望脚本改为遍历A列中的每个URL,并将视频标题输出到B列。有帮助吗?
当前代码:
Sub ScrapeVideoTitle()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.navigate "https://www.facebook.com/rankertotalnerd/videos/276505496352731/"
.Visible = True
Do While appIE.Busy
DoEvents
Loop
'Add Video Title to Column B
Range("B2").Value = appIE.document.getElementsByClassName("_4ik6")(0).innerText
appIE.Quit
Set appIE = Nothing
End With
End Sub
答案 0 :(得分:1)
提供VBE>工具>引用>添加对Microsoft HTML对象库的引用,您可以执行以下操作:
将所有网址读入数组。循环数组,并使用xmlhttp发出GET
请求到页面。将响应读取到HTMLDocument
变量中,然后使用css选择器提取标题并存储在数组中。在循环结束时,一次性将所有结果写到表中。
Option Explicit
Public Sub GetTitles()
Dim urls(), ws As Worksheet, lastRow As Long, results(), i As Long, html As HTMLDocument
Set html = New HTMLDocument
Set ws = ThisWorkbook.Worksheets("Sheet1")
With ws
lastRow = .Cells(.rows.Count, "A").End(xlUp).Row
urls = Application.Transpose(.Range("A2:A" & lastRow).Value)
End With
ReDim results(1 To UBound(urls))
With CreateObject("MSXML2.XMLHTTP")
For i = LBound(urls) To UBound(urls)
If InStr(urls(i), "http") > 0 Then
.Open "GET", urls(i), False
.send
html.body.innerHTML = .responseText
results(i) = html.querySelector(".uiHeaderTitle span").innerText
End If
Next
End With
ws.Cells(2, 2).Resize(UBound(results), 1) = Application.Transpose(results)
End Sub
css选择器与页面的匹配:
答案 1 :(得分:0)
如果您的网址中包含“ 276505496352731”部分,或者实际上是A列中的整个网址,则可以将范围设置为最高值,然后循环直到该范围为空,每次刮刮时将其向下移动一次。
类似的东西:
'Dims as before
Dim r as range
With appIE
set r = Range("B1") ' Assumes B1 is the top of the URL list
do while r.value > ""
.navigate r.value
'Do the rest of your IE stuff
r.offset(0,1).Value = appIE.document.getElementsByClassName("_4ik6")(0).innerText
set r = r.offset(1)
Loop
End With
那应该会有所帮助。