VBA Excel: IE cannot find the table in innerHTML

Asked: 2015-03-12 16:06:43

Tags: html excel vba internet-explorer

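I am trying to copy a table from a web page. I cannot copy the whole page, because it has buttons and dynamic elements, and pasting them into the sheet breaks the code with a memory overload. So I am trying to pull the HTML and paste only the table into Excel.

When I copy the entire source text into Word, it tells me there are about 23k characters, but when I use innerHTML or outerHTML, the length is only around 15-16k.

I know that innerHTML and outerHTML both leave out a lot, such as everything outside the HTML body, but what confuses me is that they are missing the table I need, which sits in the middle of the code.

Site HTML: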

<div class="row" >
            <div class="col-lg-12 col-md-12 col-sm-12" >


            </div>

                    <div class="col-lg-12 col-md-12 col-sm-12" >
                        <table class="table table-hover table-bordered table-striped " >
                            <thead>
                                <tr style="background:#eee">
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=day&amp;order=asc">Date</a></th>



                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=jobs&amp;order=asc">Current Jobs Listed</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=impressions&amp;order=asc">Impressions</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=clicks&amp;order=asc">Clicks</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=cpc&amp;order=asc">CPC</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=ctr&amp;order=asc">CTR</a></th>
                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=cost&amp;order=asc">Estimated cost</a></th>

                                    <th class="sortable" ><a href="/employer/report/report?_action_report=Run+Report&amp;show_advertisers=on&amp;_show_conversions=&amp;advertiser_id=25&amp;_show_advertisers=&amp;_hide_campaigns=&amp;_show_campaigns=&amp;end=03%2F11%2F2015&amp;campaign_id=&amp;begin=03%2F11%2F2015&amp;sort=daily_budget&amp;order=asc">Current Daily Budget</a></th>
                                    <th style="vertical-align:top" ><a href="#" onclick="return false;">Edit Campaign</a></th>
                                    <th style="vertical-align:top" ></th>

                                </tr>
                            </thead>
                            <tbody>

                                    <tr class="odd 2015-03-11">
                                        <td>2015-03-11</td>



                                        <td class="jobsListed" >437879</td>

                                        <td>148397</td>
                                        <td>1379</td>

                                        <td>$0.36</td>
                                        <td>0.93%</td>
                                        <td >$491.16</td>

                                        <td class="dailyBudget">$15500.00</td>
                                        <td ><a href="/employer/campaign/">Edit</a></td>

                                    </tr>

                                <tr class="dg" >

                                    <td  colspan="1"  class="text-right"><b>Total:</b></td>

                                    <td class="jobsListed" >437879</td>

                                    <td>148397</td>
                                    <td>1379</td>

                                    <td>$0.36</td>
                                    <td>0.93%</td>
                                    <td >$491.16</td>

                                    <td class="dailyBudget">$15500.00</td>
                                    <td ></td>
                                    <td ></td>
                                </tr>
                            </tbody>
                        </table>
                    </div>

        </div>


        </div><!--container ends here -->

Here is how I am trying to get the table data:

    Dim appIE As Object ' InternetExplorer.Application
    Set appIE = CreateObject("InternetExplorer.Application")


    Dim strSource As String
    Dim TableString As String
    strSource = CStr(appIE.document.body.outerHTML)
    TableString = Mid(strSource, _
    InStr(strSource, "<table"), _
    InStr(strSource, "</table>") - InStr(strSource, "<table"))

    Dim ClipBoard As New DataObject
    ClipBoard.SetText TableString
    ClipBoard.PutInClipboard

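It throws an error because it cannot find <table in the string. I stepped through the string several times and found that the place where the table should be contains only empty space, like this: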

 class="col-lg-12 col-md-12 col-sm-12">


            </div>

        </div>


        </div><!--container ends here -->

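Any ideas? Thanks.

1 Answer:

Answer 0 (score: 0)

I finally figured out what the problem was!

IE was displaying the page visually, but the automation object still thought it was on the login screen. I could see this by checking appIE.LocationURL in the Immediate window.

So it could not find the table on the page, because the table does not exist on the login page.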
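
As a minimal sketch of that diagnosis, the helper below prints the URL the automation object reports and guards the string search so that a missing table no longer raises an error in Mid. The name ExtractFirstTable is hypothetical and not part of the original code. It assumes appIE already points at a loaded page and does not handle nested tables.

```vba
' Hypothetical helper: returns the first <table>...</table> markup from the
' page currently loaded in appIE, or "" when no table is found.
Function ExtractFirstTable(ByVal appIE As Object) As String
    Dim strSource As String
    Dim startPos As Long, endPos As Long

    ' Show which page the automation object is really on.
    Debug.Print "LocationURL: " & appIE.LocationURL

    strSource = CStr(appIE.document.body.outerHTML)

    ' vbTextCompare because IE may return tag names in upper case.
    startPos = InStr(1, strSource, "<table", vbTextCompare)
    If startPos = 0 Then Exit Function      ' no table: possibly the wrong page

    endPos = InStr(startPos, strSource, "</table>", vbTextCompare)
    If endPos = 0 Then Exit Function        ' opening tag without a closing tag

    ' Include the closing tag itself in the result.
    ExtractFirstTable = Mid$(strSource, startPos, _
                             endPos - startPos + Len("</table>"))
End Function
```

An empty return value together with a login URL in the Immediate window would point to the situation described above.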

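The solution to this problem is simple.

  1. Check whether the automation object thinks it is on the login page or on the data page.
  2. If it is not on the data page, recreate the IE application and load a new window and page. (The new window will already be logged in, so make sure there is a check for whether you are already logged in. I did this by looking for specific text that I knew appears only on the data page, not on the login page.)
  3. Make sure to close all IE windows, because at this point calling appIE.Quit only closes the most recent window.
  4. Make sure you have error handling, or this could loop forever.
  5. Code: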

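    MakeIE:
    ' Start from a fresh IE instance on every attempt
    Set appIE = CreateObject("InternetExplorer.Application")
    ...
    With appIE
        .Navigate sURL
        Application.Wait (Now + TimeValue("00:00:01"))
        .Visible = True
        .Height = 500
        .Width = 500
        Application.Wait (Now + TimeValue("00:00:01"))
        ' Loop until the page finishes loading (4 = READYSTATE_COMPLETE)
        Do Until .ReadyState = 4: DoEvents: Loop
    End With
    ....
    ' Still not on the data page (e.g. redirected to login): try again.
    ' Without a retry limit or error handler this can loop forever.
    If appIE.LocationURL <> sURL Then GoTo MakeIE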
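
Steps 1, 2 and 4 could also be combined into a bounded retry instead of an open-ended GoTo. The following is only a sketch under assumptions: OpenDataPage, MAX_ATTEMPTS and DATA_MARKER are hypothetical names, and the marker text is taken from a column header in the question's table. Any login steps are left out, as they are in the original.

```vba
Option Explicit

' Assumed values, for illustration only.
Private Const MAX_ATTEMPTS As Long = 3
Private Const DATA_MARKER As String = "Current Jobs Listed"

' Hypothetical helper: opens sURL in a fresh IE instance and returns it once
' the page text contains DATA_MARKER; returns Nothing after MAX_ATTEMPTS.
Function OpenDataPage(ByVal sURL As String) As Object
    Dim appIE As Object
    Dim attempt As Long

    For attempt = 1 To MAX_ATTEMPTS
        Set appIE = CreateObject("InternetExplorer.Application")
        appIE.Visible = True
        appIE.Navigate sURL

        ' Wait until the page has finished loading.
        Do While appIE.Busy Or appIE.ReadyState <> 4
            DoEvents
        Loop

        ' The marker text exists only on the data page, not on the login page.
        If InStr(1, appIE.document.body.innerText, DATA_MARKER, _
                 vbTextCompare) > 0 Then
            Set OpenDataPage = appIE
            Exit Function
        End If

        appIE.Quit                  ' discard this window and try again
        Set appIE = Nothing
    Next attempt

    Set OpenDataPage = Nothing      ' caller must handle the failure
End Function
```

The caller would then test the result with Is Nothing and stop with a message instead of retrying indefinitely.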
    

Code to kill all IE windows (use with caution, it will kill every IE process):

    Option Explicit
    Sub IE_Sledgehammer()
        Dim objWMI As Object, objProcess As Object, objProcesses As Object
        Set objWMI = GetObject("winmgmts://.")
        Set objProcesses = objWMI.ExecQuery( _
            "SELECT * FROM Win32_Process WHERE Name = 'iexplore.exe'")
        For Each objProcess In objProcesses
            On Error Resume Next
            Call objProcess.Terminate
            On Error GoTo 0
        Next
        Application.Wait (Now + TimeValue("0:00:03"))
        Set objProcesses = Nothing: Set objWMI = Nothing
        Application.Wait (Now + TimeValue("0:00:03"))
    End Sub
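
If terminating every iexplore.exe process is too drastic, a gentler variant is to ask each IE window to quit through the Shell.Application window collection. This is a sketch rather than the author's method, and CloseAllIEWindows is a hypothetical name. The collection also contains File Explorer windows, so the sketch filters on the executable name reported by FullName.

```vba
' Hypothetical alternative: close IE windows one by one instead of
' terminating the processes.
Sub CloseAllIEWindows()
    Dim shellApp As Object
    Dim win As Object
    Dim toClose As New Collection

    Set shellApp = CreateObject("Shell.Application")

    ' First collect the IE windows; quitting while enumerating the live
    ' collection could skip entries.
    For Each win In shellApp.Windows
        If Not win Is Nothing Then
            If InStr(1, win.FullName, "iexplore.exe", vbTextCompare) > 0 Then
                toClose.Add win
            End If
        End If
    Next win

    ' Then ask each collected window to close.
    For Each win In toClose
        win.Quit
    Next win
End Sub
```

Windows that are hung may ignore Quit, in which case the process-level approach above remains the fallback.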