Looping through a table using Selenium WebDriver

Date: 2016-02-27 21:53:40

Tags: java selenium xpath selenium-webdriver

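I have a table that can be found here: Ontario Gov Employee Directory. I am trying to iterate over the table to extract its data, but I am having trouble finding an XPath that lets me do so.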

When I inspect the element, I see that the table has no id:

<table title="results_list" border="0" width="100%" cellspacing="0" cellpadding="0">

  <tbody>
    <tr>
      <td class="content" valign="top" align="right" width="50">1. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("32528")'>Aagaard, Lindsay</a>] [ Senior Policy Advisor ] [TREASURY BOARD SECRETARIAT]
        <br>[DEPUTY PREMIER AND PRESIDENT OF THE TREASURY BOARD, Toronto]

        <!-- [416-327-0948]  -->



        [416-327-0948] [



        <a href="mailto:lindsay.aagaard@ontario.ca">
                                                                            lindsay.aagaard@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">2. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("34417")'>Aalto, Margaret</a>] [ Probation Officer ] [CHILDREN AND YOUTH SERVICES]
        <br>[THUNDER BAY, Thunder Bay]

        <!-- [807-475-1310]  -->



        [807-475-1310] [



        <a href="mailto:margaret.aalto@ontario.ca">
                                                                            margaret.aalto@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">3. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("9187")'>Aarlaht, Andrew</a>] [ Business Analyst ] [COMMUNITY AND SOCIAL SERVICES]
        <br>[HAMILTON, BUSINESS SERVICES UNIT, Hamilton]

        <!-- [905-521-7335]  -->



        [905-521-7335] [



        <a href="mailto:andrew.aarlaht@ontario.ca">
                                                                            andrew.aarlaht@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">4. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("9187")'>Aarlaht, Andrew</a>] [ Business Analyst ] [CHILDREN AND YOUTH SERVICES]
        <br>[HAMILTON, BUSINESS SERVICES UNIT, Hamilton]

        <!-- [905-521-7335]  -->



        [905-521-7335] [



        <a href="mailto:andrew.aarlaht@ontario.ca">
                                                                            andrew.aarlaht@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">5. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("19146")'>Aarons, Drew</a>] [ Messenger ] [LEGISLATIVE OFFICES]
        <br>[PARLIAMENTARY PROTOCOL, Toronto]

        <!-- [416-325-7455]  -->



        [416-325-7455] [



        <a href="mailto:daarons@ola.org">
                                                                            daarons@ola.org</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">6. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("113729")'>Aaswaakshin, Neegann</a>] [ Articling Student ] [ABORIGINAL AFFAIRS]
        <br>[LEGAL SERVICES, Toronto]

        <!-- [416-212-2271]  -->



        [416-212-2271] [



        <a href="mailto:Neegann.Aaswaakshin@ontario.ca">
                                                                            Neegann.Aaswaakshin@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">7. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("32196")'>Abad, Lilian</a>] [ Executive Assistant ] [TRANSPORTATION]
        <br>[GO TRANSIT, Toronto]

        <!-- [416-202-5506]  -->



        [416-202-5506] [



        <a href="mailto:lilian.abad@gotransit.com">
                                                                            lilian.abad@gotransit.com</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">8. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("114240")'>Abadesso, Jennifer</a>] [ Employment Program Consultant (Acting) ] [TRAINING, COLLEGES AND UNIVERSITIES]
        <br>[FOUNDATION SKILLS, Toronto]

        <!-- [416-327-2065]  -->



        [416-327-2065] [



        <a href="mailto:jennifer.abadesso@ontario.ca">
                                                                            jennifer.abadesso@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">9. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("104293")'>Abakunzi, Louis</a>] [ Customer Service Representative (Bilingual) ] [GOVERNMENT AND CONSUMER SERVICES]
        <br>[SERVICEONTARIO CONTACT CENTRE - NORTH YORK, Toronto]

        <!-- [416-235-2999]  -->



        [416-235-2999] [



        <a href="mailto:Louis.K.Abakunzi@ontario.ca">
                                                                            Louis.K.Abakunzi@ontario.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

    <tr>
      <td class="content" valign="top" align="right" width="50">10. &nbsp;</td>
      <td class="content">[<a class="results" href='javascript:showEmployeeDetail("19309")'>Aban, Edencio</a>] [ Audit Supervisor ] [ATTORNEY GENERAL]
        <br>[AUDIT AND COMPLIANCE, Toronto]

        <!-- [416-326-6295]  -->



        [416-326-6295] [



        <a href="mailto:edencio.aban@agco.ca">
                                                                            edencio.aban@agco.ca</a>]
      </td>
    </tr>
    <tr>
      <td>&nbsp;</td>
    </tr>

  </tbody>
</table>

How can I iterate over the data in these rows?

1 answer:

Answer 0: (score: 1)

It is a table inside a table, with fairly standard formatting. What exactly is the challenge you are facing?

  

"When I inspect the element, I see that the table has no id:"

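The table can be located by its other attributes, such as title. Use the XPath //table[@title="results_list"]/tbody/tr/td to find every data cell in the innermost table. Alternatively, drop the trailing /td from the XPath to get each row, then find each td element under that row and use its text.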
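To see what the row-level variant of that XPath selects, here is a minimal sketch that runs it against a cut-down, well-formed copy of the table using Python's standard-library `xml.etree.ElementTree` as a stand-in for a live browser. The `SAMPLE` markup and the `rows_as_pairs` helper are illustrative assumptions, not part of the original page or of Selenium; with Selenium the same expression would be handed to the driver's XPath lookup instead.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed excerpt of the results table. Assumption: the real
# page is HTML containing <br> and &nbsp;, which ElementTree cannot parse.
SAMPLE = """
<html><body>
<table title="results_list">
  <tbody>
    <tr>
      <td class="content">1.</td>
      <td class="content">[<a class="results" href="#">Aagaard, Lindsay</a>] [ Senior Policy Advisor ]</td>
    </tr>
    <tr><td> </td></tr>
    <tr>
      <td class="content">2.</td>
      <td class="content">[<a class="results" href="#">Aalto, Margaret</a>] [ Probation Officer ]</td>
    </tr>
    <tr><td> </td></tr>
  </tbody>
</table>
</body></html>
"""


def rows_as_pairs(markup):
    """Return (serial, text) for each data row, skipping spacer rows."""
    root = ET.fromstring(markup.strip())
    pairs = []
    # Same path as in the answer, minus the trailing /td, so we get rows.
    for tr in root.findall(".//table[@title='results_list']/tbody/tr"):
        cells = tr.findall("td")
        if len(cells) != 2:
            continue  # spacer row holding a single empty cell
        serial = "".join(cells[0].itertext()).strip()
        # Collapse runs of whitespace so the cell reads as one line.
        text = " ".join("".join(cells[1].itertext()).split())
        pairs.append((serial, text))
    return pairs


pairs = rows_as_pairs(SAMPLE)
for serial, text in pairs:
    print(serial, text)
```

Working row by row keeps each serial number paired with its data cell and makes the spacer rows easy to skip, which a flat list of every td does not.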

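Note: the innermost table has the serial number in its first column and the actual data in its second. Rows that contain only a single &nbsp; cell are spacers and can be skipped. I suggest getting each td and then reading its 'innerHTML' attribute or elem.text. After that, use a regular expression to extract the different parts.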

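>>> from selenium.webdriver.common.by import By
>>> # driver is assumed to be a WebDriver that has already loaded the results page
>>> all_tdata = driver.find_elements(By.XPATH, '//table[@title="results_list"]/tbody/tr/td')
>>> for td in all_tdata:
...     print(td.get_attribute('innerHTML'))  # save this in a variable and regex it
...     # or, for the rendered text only:
...     data = td.text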
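The regular-expression step suggested above could look roughly like the following sketch. It assumes that `td.text` for a data cell comes back as a series of bracketed values in the order seen in the question's HTML; `SAMPLE_TEXT`, the `FIELDS` names, and `parse_cell` are hypothetical and only serve the illustration, since the page itself does not label the brackets.

```python
import re

# Assumed shape of td.text for one data cell, based on the HTML in the question.
SAMPLE_TEXT = (
    "[Aagaard, Lindsay] [ Senior Policy Advisor ] [TREASURY BOARD SECRETARIAT]\n"
    "[DEPUTY PREMIER AND PRESIDENT OF THE TREASURY BOARD, Toronto] "
    "[416-327-0948] [ lindsay.aagaard@ontario.ca]"
)

# Hypothetical field names, applied to the bracketed values in order.
FIELDS = ("name", "title", "ministry", "office", "phone", "email")

# One bracketed group: "[", optional padding, the value, optional padding, "]".
BRACKETED = re.compile(r"\[\s*([^\[\]]*?)\s*\]")


def parse_cell(text):
    """Map the bracketed values found in a cell's text onto FIELDS, in order."""
    values = BRACKETED.findall(text)
    return dict(zip(FIELDS, values))


record = parse_cell(SAMPLE_TEXT)
print(record["name"], "|", record["phone"], "|", record["email"])
```

Cells without brackets, such as the serial-number column, yield an empty dict, so they can be filtered out with a simple truthiness check.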