How to time a URL's load delay and move on to the next URL in the column using VBA

Asked: 2018-12-19 07:22:49

Tags: html excel vba excel-vba web-scraping

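I have a list of 50 URLs in an Excel column. I can extract the data, but I think the URLs sometimes time out because the pages contain high-resolution photos. Here is one of the links: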

https://www.wavemotion.gr/el/shop/smartphone-accessories/itap-magnetic-air-vent-car-mount

If a page takes more than 10 seconds to load, how do I move on to the next URL? This is what I have so far:

'Set Worksheet
Set wks = wb.Sheets("wavemotion")
'Limit rows
lastrow = wks.Cells(Rows.Count, "B").End(xlUp).Row
'Set IE display
ie.Visible = True


For i = 2 To lastrow

mylink = wks.Cells(i, 2).Value

ie.Navigate mylink

Const MAX_WAIT_SEC As Long = 5


While ie.Busy Or ie.ReadyState < 4: DoEvents: Wend
t = Timer
Do
    DoEvents
    On Error Resume Next
    'change row color
    wks.Range(Cells(i, 1), Cells(i, 5)).Interior.ColorIndex = 38

    Set product_sku = ie.Document.querySelector(".single-product__sku")
    wks.Cells(i, "A").Value = product_sku.innerText

    Set price = ie.Document.querySelector(".price .woocommerce-Price-amount")
    wks.Cells(i, "E").Value = price.innerText

    Set availability = ie.Document.querySelector(".stock.in-stock ")
    Set availability = ie.Document.querySelector(".stock.out-of-stock ")
    Set availability = ie.Document.querySelector(".stock.out-of-stock ")

    wks.Cells(i, "D").Value = availability.innerText

    Set product_name = ie.Document.querySelector(".single-product__title")
    wks.Cells(i, "C").Value = product_name.innerText

    'Set product_color = ie.Document.querySelector(".single-product__colors__label ")
    'wks.Cells(i, "G").Value = product_color.innerText


    If Timer - t > MAX_WAIT_SEC Then Exit Do
    On Error GoTo 0
Loop
If price Is Nothing Then Exit Sub

wks.Range(Cells(i, 1), Cells(i, 5)).Interior.ColorIndex = 0

Next i

ie.Quit
Set ie = Nothing

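2 Answers:

Answer 0 (score: 1)

I suggest changing how you check whether the page has loaded. In my code, the wait loop times how long the load is taking and exits once 10 seconds have passed.

I added a Boolean that becomes False if loading takes longer than 10 seconds. It is used later in an If statement to decide whether the rest of the code should run for that URL.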

t = Timer
booLoaded = True

' Wait for the page, but give up after 10 seconds
Do While ie.Busy Or ie.ReadyState < 4

    DoEvents

    If Timer - t > 10 Then
        booLoaded = False
        Exit Do
    End If

Loop

' Restart the timer for the extraction loop
t = Timer

If booLoaded Then
    Do
        DoEvents
        '... rest of the Do loop here, ending with its Loop
End If

Next i
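The timed wait above could also be wrapped in a function so the For loop stays short. What follows is only a sketch, not part of the original answer: the names ElapsedSeconds and WaitForPage are hypothetical, and ie is assumed to be the InternetExplorer object from the question. Because Timer returns the seconds since midnight, the helper also covers a run that crosses midnight, which a plain Timer - t would not.

```vba
Option Explicit

' Hypothetical helper: seconds between two Timer readings.
' Timer restarts at 0 at midnight, so a smaller "now" value
' is treated as having wrapped around once.
Public Function ElapsedSeconds(ByVal startTimer As Double, ByVal nowTimer As Double) As Double
    Const SECONDS_PER_DAY As Double = 86400
    If nowTimer < startTimer Then
        ElapsedSeconds = nowTimer + SECONDS_PER_DAY - startTimer
    Else
        ElapsedSeconds = nowTimer - startTimer
    End If
End Function

' Hypothetical helper: returns True if the page finished loading
' within maxWaitSec seconds, False if the wait timed out.
' ie is assumed to be an InternetExplorer instance.
Public Function WaitForPage(ByVal ie As Object, ByVal maxWaitSec As Double) As Boolean
    Dim t As Double
    t = Timer
    Do While ie.Busy Or ie.readyState < 4
        DoEvents
        If ElapsedSeconds(t, Timer) > maxWaitSec Then
            ie.Stop                 ' cancel the pending navigation
            WaitForPage = False
            Exit Function
        End If
    Loop
    WaitForPage = True
End Function
```

Inside the loop, the call would then read: ie.Navigate mylink, followed by If WaitForPage(ie, 10) Then ... End If around the extraction code, so a slow URL is skipped and Next i moves on.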

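Answer 1 (score: 1)

I would increase the Const, since that is where the wait time should be determined. Then reorganize the code so that the loop exits early as soon as the target element can be set. Test the element you Set with Is Nothing, and run the remaining lines (for that URL) only if it is Not Nothing.

I am a little uneasy about the structure with navigation inside the loop, and would want to test it if you know of any URLs that hang. Now that setting the element and navigating are in the same block, you may want to add another second or two to the Const.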

Option Explicit   
Public Sub test()
    Const MAX_WAIT_SEC As Long = 10
    'other code
    Set wks = wb.Sheets("wavemotion")
    LastRow = wks.Cells(Rows.Count, "B").End(xlUp).Row

    ie.Visible = True

    For i = 2 To LastRow

        mylink = wks.Cells(i, 2).Value        
        t = Timer
        Do
            DoEvents          
            ie.Navigate2 mylink

            While ie.Busy Or ie.readyState < 4: DoEvents: Wend

            On Error Resume Next
            Set availability = ie.document.querySelector(".stock.in-stock ")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop While availability Is Nothing

        If Not availability Is Nothing Then

            wks.Range(wks.Cells(i, 1), wks.Cells(i, 5)).Interior.ColorIndex = 38
            Set product_sku = ie.document.querySelector(".single-product__sku")
            wks.Cells(i, "A").Value = product_sku.innerText
            Set price = ie.document.querySelector(".price .woocommerce-Price-amount")
            wks.Cells(i, "E").Value = price.innerText
            wks.Cells(i, "D").Value = availability.innerText
            Set product_name = ie.document.querySelector(".single-product__title")
            wks.Cells(i, "C").Value = product_name.innerText
            Set product_color = ie.document.querySelector(".single-product__colors__label ")
            wks.Cells(i, "G").Value = product_color.innerText
            wks.Range(wks.Cells(i, 1), wks.Cells(i, 5)).Interior.ColorIndex = 0
            Set price = Nothing: Set availability = Nothing '.... etc
        End If
    Next i

    ie.Quit
    Set ie = Nothing
End Sub
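One gap in the code above is that querySelector returns Nothing when a selector matches no element, so a line such as product_color.innerText raises a run-time error on pages without that element. A small guard could look like the sketch below. TextOrDefault and WriteAvailability are hypothetical names, not from the original answer. The ".stock" selector is an assumption: it relies on both the in-stock and out-of-stock elements carrying the stock class, as the selectors in the question suggest.

```vba
Option Explicit

' Hypothetical helper: returns el.innerText, or fallback when the
' element was not found (querySelector returned Nothing).
Public Function TextOrDefault(ByVal el As Object, Optional ByVal fallback As String = "") As String
    If el Is Nothing Then
        TextOrDefault = fallback
    Else
        TextOrDefault = el.innerText
    End If
End Function

' Hypothetical usage: write availability for row i without
' raising an error when the element is missing.
' Assumes ie has already finished loading the page.
Public Sub WriteAvailability(ByVal ie As Object, ByVal wks As Worksheet, ByVal i As Long)
    Dim availability As Object
    ' ".stock" is assumed to match both ".stock.in-stock"
    ' and ".stock.out-of-stock"
    Set availability = ie.document.querySelector(".stock")
    wks.Cells(i, "D").Value = TextOrDefault(availability, "n/a")
End Sub
```

With a guard like this, the On Error Resume Next around the element lookups is no longer needed, and an out-of-stock page does not have to wait for the timeout before the row is filled in.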