Can VBA automatically loop through cells to feed a web scrape?

Time: 2019-07-02 20:02:13

Tags: excel vba loops web-scraping automation

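I'm new to VBA, so I apologize if this is a poor question.

Can I list numbers in Excel and have them entered into a web page automatically, one at a time? The user only needs to log in once, but for each number the user has to click through several buttons before the web page returns the information about that number.

I have a list of numbers, like this:

[screenshot]

I want to automate entering them into the web page. I can enter the first number, press the button, and copy the data back into Excel. However, I'm not sure how to make the program loop so that it does this for every number in the column. As far as I can tell my code is set up so that it should work, but it only processes the first number and then stops.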

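Below is an example (a mock-up I just made) that mirrors the website I'm working with. It starts on this page, where the user enters a username and password and then presses Enter...

[screenshot]

...which leads to this page, where the user clicks Initiate, then ticks the checkbox, then clicks View (each button in that row only appears once the previous one has been clicked). [screenshot]

...which leads to this page, where the user clicks History and then enters one of the numbers. [screenshot]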

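Summary: my current code logs in to the website by entering the username and password and pressing Enter. On the next screen it clicks Initiate, ticks the checkbox, and clicks the button that follows it. On the screen after that, the program clicks the History button. It then goes back to Excel (column A), takes the first number, enters it where the page says "number here", and presses Enter. That takes me to a page with a lot of information, which is then copied and pasted back into Excel.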

I then want the program to run again for the remaining numbers.

I believed my code would move on to the next number in the column (i.e. perform the steps above for cell A3, then for cell A4, and so on), but it does not.

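My code is below:

*I am posting a simplified version of the code. I described above how I copy and paste the results back into Excel and so on, and all of that works the way I want. My main problem is working out how to loop it so that it logs in, clicks the buttons, enters a number from Excel, then returns to the first screen, clicks the buttons again, enters the next number, and so on.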

Option Explicit

Sub NumberFix()

Dim IE As Object
Dim IeDoc As Object
Dim aInput As Object
Dim eInput As Object
Dim svalue1 As Object
Dim a As Object
Dim b As Object
Dim elems As Object
Dim t As Date
Dim i As Long, lastrow As Long
Dim results As Variant, wkshtnames()
Dim ws As Worksheet, wks As Excel.Worksheet
Dim NewName As String
Dim sheet As Worksheet
Dim duplicate As Boolean

Const MAXWAIT_sec As Long = 10

Set ws = Sheets("VALUES")

Set IE = CreateObject("InternetExplorer.application")
IE.Visible = True
IE.Navigate ("http://mywebsite.com/")

    Do While IE.busy: DoEvents: Loop

Set IeDoc = IE.document

'Enters username and password
    With IeDoc.all
        .UserName.Value = "userr"
        .Password.Value = "password"
    End With

    With IE.document.forms("signingin")
        .document.forms(0).submit
    End With


Application.Wait (Now + TimeValue("0:00:03"))


Set IeDoc = IE.document ' set new page source

    t = Timer

    Do
        On Error Resume Next
        Set elems = IeDoc.queryselector("input[value=Initiate]")
        On Error GoTo 0
        If Timer - t > MAXWAIT_sec Then
            Exit Do
        End If
    Loop While elems Is Nothing

    If Not elems Is Nothing Then
        elems.Item.Click
    End If

    Application.Wait (Now + TimeValue("0:00:03"))


    IeDoc.getElementByID("checkConf").Click


    For Each aInput In IeDoc.getElementsbyTagName("input")
        If aInput.getAttribute("value") = "Request" Then
            aInput.Click
            Exit For
        End If
    Next aInput


    Do While IE.busy: DoEvents: Loop

    'Selects historical
    For Each aInput In IeDoc.getElementsbyTagName("input")
        If aInput.getAttribute("value") = "History" Then
            aInput.Click
            Exit For
        End If
    Next aInput


    lastrow = ws.Cells(ws.rows.Count, "A").End(xlUp).Row
    IE.Visible = True

    For i = 3 To lastrow

    Set IeDoc = IE.document ' set new page source

        Set svalue1 = IeDoc.getElementByID("Number")
        svalue1.Value = ws.Cells(i, 1).Value 'takes the  number out and enters
            'presses submit once numb is entered
            For Each aInput In IeDoc.getElementsbyTagName("input")
                If aInput.getAttribute("value") = "Submit" Then
                    aInput.Click
                    Exit For
                End If
            Next aInput

        IE.Navigate ("https://mywebsite.com/")
        Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop
        IE.Visible = True
        Exit For


Next i
        Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop
        IE.Visible = True


End Sub                

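2 Answers:

Answer 0 (score: 2)

I cleaned this up a lot; please try the code below.

Your question was hard to answer because your code is not only hard to follow but also poorly indented, and its variable names are unclear. You also declared variables that you never use.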

Option Explicit
Public Const UBlim As Long = 6
Sub Login()

    Dim IE As Object
    Dim eInput As Object
    Dim ws As Worksheet: Set ws = Sheets("VALUES")
    Dim i As Long
    Dim j As Long
    Dim lastrow As Long
    Dim svalue1 As Object
    Dim results As Variant
    Dim wkscnt As Long
    Dim wks As Excel.Worksheet
    Dim wkshtnames()
    Dim a As Object
    Dim b As Object
    Dim t As Date
    Dim elems As Object
    Const MAXWAIT_sec As Long = 10

    Set IE = CreateObject("InternetExplorer.application")
    IE.Visible = True
    IE.Navigate ("http://mywebsite.com/")

    With IE

        Do
            If IE.readystate = 4 Then
                Exit Do
            Else
                DoEvents
            End If
        Loop

        'Enters username and password

        With .document
            .forms("signingin").UserName.Value = "userr"
            .forms("signingin").Password.Value = "password"
            .forms("signingin").document.forms(0).submit

            'this clicks a button after logging in that says initiate new request
            t = Timer

            Do
                On Error Resume Next
                'Read IE.document on each pass: submitting the login form loads a new page
                Set elems = IE.document.querySelector("input[value=Initiate]")
                On Error GoTo 0

                If Timer - t > MAXWAIT_sec Then
                    Exit Do
                End If

            Loop While elems Is Nothing

            If Not elems Is Nothing Then
                elems.Click
            End If

            Application.Calculation = xlCalculationManual
            Application.CutCopyMode = False
            wkscnt = ThisWorkbook.Sheets.Count
            j = 0

            For Each wks In ActiveWorkbook.Worksheets

                j = j + 1

                If j > UBlim Then
                    ReDim Preserve wkshtnames(7 To wkscnt)
                    wkshtnames(j) = wks.Name
                End If

            Next wks

            If wkscnt > UBlim Then
                Application.DisplayAlerts = False
                Sheets(wkshtnames).Delete
                Application.DisplayAlerts = True
            End If

            lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

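            For i = 3 To lastrow
                'Use IE.document so each pass sees the page that is currently loaded
                Set svalue1 = IE.document.getElementById("Number")
                svalue1.Value = ws.Cells(i, 1).Value
                'Next i already advances the row; a manual i = i + 1 here would skip every other number

                For Each eInput In IE.document.getElementsByTagName("input")
                    If eInput.getAttribute("value") = "Submit Request" Then
                        eInput.Click
                        Exit For
                    End If
                Next eInput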

                IE.Visible = True

                'copy and pasting the info from the web page to a new excel sheet
                Sheets("Sheet4").Range("A1:Z100").ClearContents
                IE.ExecWB 17, 0 '//select
                IE.ExecWB 12, 2 '//Copy Selection
                ActiveSheet.Paste

                Sheets("Sheet4").Range("A3:Q32").Copy

                'Creates a new sheet after & pastes content into it, formats
                Sheets.Add After:=ActiveSheet
                ActiveSheet.Paste
                Selection.Columns.AutoFit
                Selection.Rows.AutoFit

                ActiveSheet.Protect
                'this navigates back to the page where I need to enter the value in the excel column again

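                IE.Navigate ("https://mywebsite.com/Default")
                'Wait for the page to finish loading before the next pass
                Do While (IE.readyState <> 4 Or IE.Busy): DoEvents: Loop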
            Next i

        End With

    End With

End Sub

Answer 1 (score: 1)

Well, after asking this question a few times and trying different ways to solve the problem, I have FOUND (and written) my answer!

Here it is:

Option Explicit

Sub NewScrape()

Dim IE As Object
Dim IeDoc As Object
Dim aInput As Object
Dim eInput As Object
Dim svalue1 As Object
Dim a As Object
Dim b As Object
Dim elems As Object
Dim t As Date
Dim i As Long, lastrow As Long
Dim results As Variant, wkshtnames()
Dim ws As Worksheet, wks As Excel.Worksheet
Dim NewName As String
Dim sheet As Worksheet
Dim duplicate As Boolean

Const MAXWAIT_sec As Long = 10

Set ws = Sheets("VALUE")

Set IE = CreateObject("InternetExplorer.application")
IE.Visible = True
IE.Navigate ("http://mywebsite.com/")

    Do While IE.busy: DoEvents: Loop

Set IeDoc = IE.document

'Enters username and password
    With IeDoc
        .forms("signingin").UserName.Value = "userr"
        .forms("signingin").Password.Value = "password"
        .forms("signingin").document.forms(0).submit
    End With


Application.Wait (Now + TimeValue("0:00:03"))

    lastrow = ws.Cells(ws.rows.Count, "A").End(xlUp).Row 'xlUp from the bottom of the sheet finds the last used row
    IE.Visible = True

    For i = 3 To lastrow

Set IeDoc = IE.document ' set new page source

    t = Timer

    Do
        On Error Resume Next
        Set elems = IeDoc.queryselector("input[value=Initiate]")
        On Error GoTo 0
        If Timer - t > MAXWAIT_sec Then
            Exit Do
        End If
    Loop While elems Is Nothing

    If Not elems Is Nothing Then
        elems.Item.Click
    End If

    Application.Wait (Now + TimeValue("0:00:03"))


    IeDoc.getElementByID("checkConf").Click


    For Each aInput In IeDoc.getElementsbyTagName("input")
         If aInput.getAttribute("value") = "Request" Then
            aInput.Click
            Exit For
        End If
    Next aInput


    Do While IE.busy: DoEvents: Loop

    'Selects history
    For Each aInput In IeDoc.getElementsbyTagName("input")
        If aInput.getAttribute("value") = "History" Then
            aInput.Click
            Exit For
        End If
    Next aInput


        Set svalue1 = IeDoc.getElementByID("accountNumber")
        svalue1.Value = ws.Cells(i, 1).Value 'takes the  number out and enters
            'presses submit once acct numb is entered
            For Each aInput In IeDoc.getElementsbyTagName("input")
                If aInput.getAttribute("value") = "Submit Request" Then
                    aInput.Click
                    Exit For
                 End If
            Next aInput


    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop
    IE.Visible = True



    'Webpage sometimes takes time to load.
    Application.Wait DateAdd("s", 10, Now)


    'Selects and clears sheet 4
    Sheets("Sheet4").Select
    Range("A1:Z100").Select
    Selection.ClearContents

    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop


    IE.ExecWB 17, 0 '//select all from webpage
    IE.ExecWB 12, 2 '//Copy Selection

    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop

    Application.Wait DateAdd("s", 2, Now)

    Application.DisplayAlerts = False  '//Doesnt display alerts
    ActiveSheet.Paste

    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop


    Sheets("Sheet4").Select '//Selects sheet 4 again
    Range("A3:Q32").Select
    Selection.Copy

    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop

    Application.Wait DateAdd("s", 2, Now)

    'Creates a new sheet after & pastes content into it, formats
    Sheets.Add After:=ActiveSheet
    ActiveSheet.Paste
    Selection.Columns.AutoFit
    Selection.rows.AutoFit

    Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop


    'If the worksheet already has been made:
    duplicate = False
    For Each sheet In ThisWorkbook.Sheets
        If sheet.Name = Range("D10") Then
            MsgBox ("ERROR: This Numb has already been formulated")
            NewName = InputBox("Please Rename:")
            ActiveSheet.Name = NewName
            duplicate = True
                Exit For
        End If
    Next sheet

    If duplicate = False Then
        ActiveSheet.Name = Range("d10")
        Range("A6").Clear
        ActiveSheet.Protect
       ' MsgBox ("DONE, next...")
    End If


            'this navigates back to the page where I need to enter the value in the excel column again


        IE.Navigate ("https://mywebsite.com/Default")
        Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop
        IE.Visible = True


Next i
        Do While (IE.READYSTATE <> 4 Or IE.busy <> False): DoEvents: Loop
        IE.Visible = True

        On Error GoTo 0
        MsgBox ("All Accounts Have Been Formulated, Check it out!")




  End Sub

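So basically I think my problem was where I had placed the lastrow = line and the For i = 3 line, and also that the Set IeDoc = IE.document line needs to come after them. :)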
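
The restructuring described above can be reduced to a short outline. This is only a minimal sketch under assumptions: WaitForPage and ScrapeAllNumbers are hypothetical names that do not appear in the original code, the URLs and the "VALUES" sheet name are the placeholders used in the question, and the login and click sequence is left as comments rather than repeated.

```vba
Option Explicit

' Hypothetical helper: block until IE reports that the page has finished loading.
Private Sub WaitForPage(ByVal IE As Object)
    Do While IE.Busy Or IE.readyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop
End Sub

Sub ScrapeAllNumbers()
    Dim IE As Object, IeDoc As Object
    Dim ws As Worksheet
    Dim i As Long, lastrow As Long

    Set ws = ThisWorkbook.Sheets("VALUES")   ' sheet name taken from the question
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate "http://mywebsite.com/"
    WaitForPage IE

    ' ... log in once here ...

    ' Work out the range of rows before the loop starts.
    lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For i = 3 To lastrow
        ' Re-read the document on every pass, because each navigation
        ' replaces the page that the previous reference pointed to.
        Set IeDoc = IE.document

        ' ... click Initiate, the checkbox, Request and History,
        '     enter ws.Cells(i, 1).Value, submit, copy the result ...

        ' Return to the start page for the next number. There is no Exit For
        ' here: leaving the loop early limits the run to the first row.
        IE.Navigate "https://mywebsite.com/Default"
        WaitForPage IE
    Next i
End Sub
```

Pulling the wait into one helper is a design choice, not something the answers do; it simply replaces the repeated Do While ... Loop lines with a single call.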