How do I correctly scrape this XPath?

Time: 2018-08-16 16:22:20

Tags: selenium xpath web web-scraping

I am trying to scrape Google Analytics to get data for a metrics report.

<table class="_GASfb">
  <tbody>
    <tr>
      <td class="_GAB9">
        <div id="ID-overview-sparkline" class="_GAc0">
          <div class="_GAb-_GAci-_GAhi _GAYd" data-guidedhelpid="sparkline-metric-analytics.pageviews-group">
            <div class="_GAgmb _GAYs ID-tooltipcontent-0" data-guidedhelpid="sparkline-metric-analytics.pageviews">Pageviews</div>
            <div class="_GAvo">
              <div>
                <div class="_GAGu">12,188</div>
                <div class="_GARf ACTION-graph TARGET-analytics.pageviews">
                  <div class="ID-chart-0-0-0"><img src="https://chart.googleapis.com/chart?cht=ls&amp;chs=170x18&amp;chco=e6f2fa%2C058DC7&amp;chm=b%2Ce6f2fa%2C0%2C1%2C0&amp;chd=e%3AAAAAAAAAAAAAAA%2C2B..iCisaVeOhP" class="_GAb-serverchart-image" width="170" height="18"></div>
                </div>
              </div>
            </div>
          </div>
          <div class="_GAb-_GAci-_GAhi _GAYd" data-guidedhelpid="sparkline-metric-analytics.uniquePageviews-group">
            <div class="_GAgmb _GAYs ID-tooltipcontent-1" data-guidedhelpid="sparkline-metric-analytics.uniquePageviews">Unique Pageviews</div>
            <div class="_GAvo">
              <div>
                <div class="_GAGu">10,347</div>
                <div class="_GARf ACTION-graph TARGET-analytics.uniquePageviews">
                  <div class="ID-chart-1-0-0"><img src="https://chart.googleapis.com/chart?cht=ls&amp;chs=170x18&amp;chco=e6f2fa%2C058DC7&amp;chm=b%2Ce6f2fa%2C0%2C1%2C0&amp;chd=e%3AAAAAAAAAAAAAAA%2C5V..iylTcMeihF" class="_GAb-serverchart-image" width="170" height="18"></div>
                </div>
              </div>
            </div>
          </div>
          <div class="_GAb-_GAci-_GAhi _GAYd" data-guidedhelpid="sparkline-metric-analytics.avgPageDuration-group">
            <div class="_GAgmb _GAYs ID-tooltipcontent-2" data-guidedhelpid="sparkline-metric-analytics.avgPageDuration">Avg. Time on Page</div>
            <div class="_GAvo">
              <div>
                <div class="_GAGu">00:01:21</div>
                <div class="_GARf ACTION-graph TARGET-analytics.avgPageDuration">
                  <div class="ID-chart-2-0-0"><img src="https://chart.googleapis.com/chart?cht=ls&amp;chs=170x18&amp;chco=e6f2fa%2C058DC7&amp;chm=b%2Ce6f2fa%2C0%2C1%2C0&amp;chd=e%3AAAAAAAAAAAAAAA%2C..5z3A8-3EuI2m" class="_GAb-serverchart-image" width="170" height="18"></div>
                </div>
              </div>
            </div>
          </div>
          <div class="_GAb-_GAci-_GAhi _GAYd" data-guidedhelpid="sparkline-metric-analytics.bounceRate-group">
            <div class="_GAgmb _GAYs ID-tooltipcontent-3" data-guidedhelpid="sparkline-metric-analytics.bounceRate">Bounce Rate</div>
            <div class="_GAvo">
              <div>
                <div class="_GAGu">80.83%</div>
                <div class="_GARf ACTION-graph TARGET-analytics.bounceRate">
                  <div class="ID-chart-3-0-0"><img src="https://chart.googleapis.com/chart?cht=ls&amp;chs=170x18&amp;chco=e6f2fa%2C058DC7&amp;chm=b%2Ce6f2fa%2C0%2C1%2C0&amp;chd=e%3AAAAAAAAAAAAAAA%2C-V9D-0-4...J.9" class="_GAb-serverchart-image" width="170" height="18"></div>
                </div>
              </div>
            </div>
          </div>
          <div class="_GAb-_GAci-_GAhi _GAYd" data-guidedhelpid="sparkline-metric-analytics.exitRate-group">
            <div class="_GAgmb _GAYs ID-tooltipcontent-4" data-guidedhelpid="sparkline-metric-analytics.exitRate">% Exit</div>
            <div class="_GAvo">
              <div>
                <div class="_GAGu">68.56%</div>
                <div class="_GARf ACTION-graph TARGET-analytics.exitRate">
                  <div class="ID-chart-4-0-0"><img src="https://chart.googleapis.com/chart?cht=ls&amp;chs=170x18&amp;chco=e6f2fa%2C058DC7&amp;chm=b%2Ce6f2fa%2C0%2C1%2C0&amp;chd=e%3AAAAAAAAAAAAAAA%2C.6.m-X..-t6J7k" class="_GAb-serverchart-image" width="170" height="18"></div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </td>
    </tr>
  </tbody>
</table>

This is the entire HTML. I am trying to grab every element like

<div class="_GAGu">12,188</div>
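As a sanity check on the markup itself, the sketch below runs an equivalent XPath over a trimmed, well-formed copy of the snippet using Python's standard-library `ElementTree`. The trimmed HTML and the names `SNIPPET` and `values` are illustrative and not from the original page. If the selector matches static markup like this, an empty result from the live page more plausibly comes from timing or frame context than from the expression itself.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of two metric groups from the snippet above.
# The <img> sparkline tags are left out because they are not valid XML.
SNIPPET = """
<div id="ID-overview-sparkline" class="_GAc0">
  <div class="_GAvo"><div><div class="_GAGu">12,188</div></div></div>
  <div class="_GAvo"><div><div class="_GAGu">10,347</div></div></div>
</div>
"""

root = ET.fromstring(SNIPPET)

# ElementTree supports a subset of XPath; ".//" searches all descendants.
values = [div.text for div in root.findall(".//div[@class='_GAGu']")]
print(values)
```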

I have set up my web driver to log in to Google, navigate to that page, and set the correct dates in the URL.

Then I try to get the data using
data = driver.find_elements_by_xpath("//div[@class='_GAGu']")

It returns nothing. I have also tried

data = driver.find_element_by_xpath("//div[@class='_GAGu']")

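(a single element) as well as other objects on the page, without any success. I have scraped with Selenium before and never run into a problem like this. Does anyone know how I can get this data?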
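Two common reasons for an empty result on a page like this are that the content is rendered after the initial load, or that it lives inside an iframe. Neither is confirmed by the post, so treat both as assumptions. The sketch below polls the top-level document and every iframe until the XPath matches or a timeout expires. The function name `find_metric_values` and its parameters are hypothetical. It calls `driver.find_elements(by, value)` with the plain strings `"xpath"` and `"tag name"`, which are the values of Selenium's `By.XPATH` and `By.TAG_NAME`.

```python
import time

XPATH = "//div[@class='_GAGu']"


def find_metric_values(driver, xpath=XPATH, timeout=10.0, poll=0.5):
    """Poll the main document and each iframe until the XPath matches.

    Returns the text of every matching element, or [] on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Always start each pass from the top-level document.
        driver.switch_to.default_content()
        found = driver.find_elements("xpath", xpath)

        if not found:
            # Not in the main document: look inside each iframe in turn.
            for frame in driver.find_elements("tag name", "iframe"):
                driver.switch_to.frame(frame)
                found = driver.find_elements("xpath", xpath)
                if found:
                    break
                driver.switch_to.default_content()

        if found:
            # Read the text while still in the frame that owns the elements.
            texts = [element.text for element in found]
            driver.switch_to.default_content()
            return texts

        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)
```

With a logged-in driver already on the report page, `find_metric_values(driver)` would return a list such as the five metric strings shown in the HTML above if the elements appear within the timeout. Selenium's own `WebDriverWait` offers the same kind of polling and may be preferable in real code.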

Thanks in advance.

Edit:

I am now using the API instead. I resolved my permissions error, and it is much easier.
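The post does not say which API was used. Assuming it was the Google Analytics Reporting API v4 (`reports.batchGet`), a request body for the five metrics shown in the HTML could be built roughly as follows. The helper name `build_report_request`, the `"VIEW_ID"` placeholder, and the dates are illustrative, and authentication and the HTTP call are left out.

```python
import json

# Reporting API v4 metric expressions matching the sparkline labels above.
METRICS = [
    "ga:pageviews",
    "ga:uniquePageviews",
    "ga:avgTimeOnPage",
    "ga:bounceRate",
    "ga:exitRate",
]


def build_report_request(view_id, start_date, end_date):
    """Build a batchGet request body for one view and one date range."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": name} for name in METRICS],
            }
        ]
    }


# Placeholder view ID and dates, for illustration only.
body = build_report_request("VIEW_ID", "2018-08-01", "2018-08-15")
print(json.dumps(body, indent=2))
```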

0 answers:

No answers