Python Selenium: get all divs

Date: 2018-04-20 06:54:36

Tags: python selenium selenium-webdriver web-scraping

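I am trying to scrape a website and get all the div elements in the HTML. The page I am trying to access lists job openings, and each job sits inside its own div. I want to collect all of them, run them through a for loop, and extract each one's data separately.

I have not written any code yet.

Is there a specific method for getting them? Also, I only want to extract the divs that contain jobs.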

Outer HTML

    <div class="row result clickcard" id="p_1d1829a543b1f3a7" data-jk="1d1829a543b1f3a7" data-tn-component="organicJob" data-tu="">
<h2 id="jl_1d1829a543b1f3a7" class="jobtitle">
    <a href="/rc/clk?jk=1d1829a543b1f3a7&amp;fccid=95ea1992d038e9af&amp;vjs=3" target="_blank" rel="noopener nofollow" onmousedown="return rclk(this,jobmap[1],0);" onclick="setRefineByCookie([]); return rclk(this,jobmap[1],true,0);" title="Planner II" class="turnstileLink" data-tn-element="jobTitle">Planner II</a>
    - <span class="new">new</span></h2>
<span class="company">
    Vantage Utility Services</span>

 - <span class="location">Upland, CA</span>
    <table cellspacing="0" cellpadding="0" border="0">
<tbody><tr>
<td class="snip">
<div class="">
    <span class="summary">
            Transportation or giving away of up to 28.5 grams of marijuana, other than concentrated <b>cannabis</b>, or the offering to transport or give away up to 28.5 grams of...</span>
    </div>


<div class="result-link-bar-container">
    <div class="result-link-bar"><span class="date">4 hours ago</span> <span id="tt_set_1" class="tt_set">  -  <a id="sj_1d1829a543b1f3a7" href="#" class="sl resultLink save-job-link " onclick="changeJobState('1d1829a543b1f3a7', 'save', 'linkbar', false, ''); return false;" title="Save this job to my.indeed">save job</a> - <a href="#" id="tog_1" class="sl resultLink more-link " onclick="toggleMoreLinks('1d1829a543b1f3a7'); return false;">more...</a></span><div id="editsaved2_1d1829a543b1f3a7" class="edit_note_content" style="display:none;"></div><script>if (!window['result_1d1829a543b1f3a7']) {window['result_1d1829a543b1f3a7'] = {};}window['result_1d1829a543b1f3a7']['showSource'] = false; window['result_1d1829a543b1f3a7']['source'] = "Vantage Utility Services"; window['result_1d1829a543b1f3a7']['loggedIn'] = false; window['result_1d1829a543b1f3a7']['showMyJobsLinks'] = false;window['result_1d1829a543b1f3a7']['undoAction'] = "unsave";window['result_1d1829a543b1f3a7']['relativeJobAge'] = "4 hours ago";window['result_1d1829a543b1f3a7']['jobKey'] = "1d1829a543b1f3a7"; window['result_1d1829a543b1f3a7']['myIndeedAvailable'] = true; window['result_1d1829a543b1f3a7']['showMoreActionsLink'] = window['result_1d1829a543b1f3a7']['showMoreActionsLink'] || true; window['result_1d1829a543b1f3a7']['resultNumber'] = 1; window['result_1d1829a543b1f3a7']['jobStateChangedToSaved'] = false; window['result_1d1829a543b1f3a7']['searchState'] = "q=Cannabis&amp;fromage=last"; window['result_1d1829a543b1f3a7']['basicPermaLink'] = "https://www.indeed.com"; window['result_1d1829a543b1f3a7']['saveJobFailed'] = false; window['result_1d1829a543b1f3a7']['removeJobFailed'] = false; window['result_1d1829a543b1f3a7']['requestPending'] = false; window['result_1d1829a543b1f3a7']['notesEnabled'] = true; window['result_1d1829a543b1f3a7']['currentPage'] = "serp"; window['result_1d1829a543b1f3a7']['sponsored'] = false;window['result_1d1829a543b1f3a7']['reportJobButtonEnabled'] = false; 
window['result_1d1829a543b1f3a7']['showMyJobsHired'] = false; window['result_1d1829a543b1f3a7']['showSaveForSponsored'] = false; window['result_1d1829a543b1f3a7']['showJobAge'] = true;</script></div></div>

<div class="tab-container">
    <div id="tt_display_1" class="more-links-container result-tab" style="display:none;"><a class="close-link closeLink" title="Close" href="#" onclick="toggleMoreLinks('1d1829a543b1f3a7'); return false;"></a><div id="more_1" class="more_actions"><ul><li><span class="mat">View all <a href="/q-Vantage-Utility-Services-l-Upland,-CA-jobs.html" rel="nofollow">Vantage Utility Services jobs in Upland, CA</a> - <a href="/l-Upland,-CA-jobs.html">Upland jobs</a></span></li><li><span class="mat">Salary Search: <a href="/salaries/Planner-Salaries,-Upland-CA" onmousedown="this.href = appendParamsOnce(this.href, '?campaignid=serp-more&amp;fromjk=1d1829a543b1f3a7&amp;from=serp-more-nofollow');" rel="&quot;nofollow&quot;">Planner salaries in Upland, CA</a></span></li><li><span class="mat">Learn more about working at <a href="/cmp/Vantage-Utility-Services" onmousedown="this.href = appendParamsOnce(this.href, '?fromjk=1d1829a543b1f3a7&amp;from=serp-more&amp;campaignid=serp-more&amp;jcid=424bbfe9ea0cfaab');">Vantage Utility Services</a></span></li><li><span class="mat">Related forums: <a href="/forum/loc/Upland-California.html">Upland, California</a> - <a href="/forum/job/Planner.html">Planner</a> - <a href="/forum/cmp/Vantage-Utility-Services.html">VANTAGE UTILITY SERVICES</a></span></li></ul></div></div><div class="dya-container result-tab"></div>
    <div class="tellafriend-container result-tab email_job_content"></div>
    <div class="sign-in-container result-tab"></div>
    <div class="notes-container result-tab"></div>
</div>

</td>
</tr>
</tbody></table>
</div>

3 answers:

Answer 0 (score: 1)

You can use

find_elements_by_tag_name('div')

This returns a list of all the div elements in the HTML.
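Selenium 4 removed the `find_elements_by_*` helpers in favor of `find_elements(by, value)`. Here is a minimal sketch of the same lookup in that form. The helper name `all_divs` is hypothetical, and the locator string is written out literally (it is the value of `By.TAG_NAME`) so the snippet does not depend on a particular Selenium version being importable.

```python
# Locator strategy string; same value as selenium.webdriver.common.by.By.TAG_NAME.
TAG_NAME = "tag name"


def all_divs(driver):
    """Return every <div> element on the page currently loaded in `driver`.

    `driver` is assumed to be a Selenium WebDriver (or anything exposing
    the same `find_elements(by, value)` method).
    """
    return driver.find_elements(TAG_NAME, "div")
```

With a real browser session you would call it as `all_divs(driver)` after `driver.get(url)`. Note that the result includes every nested div on the page, not only the job cards.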

Answer 1 (score: 1)

To get all divs, you can use:

  

self.driver.find_elements_by_css_selector('div')

You can also get the text of the whole page with:

  

self.driver.find_element_by_css_selector('body').text

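But getting every div and looping over them is a really bad idea. It is better to find appropriate selectors for the data you want from the page, such as a class or id, and read the data from those elements.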
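As a rough illustration of that advice, the sketch below selects only the title links inside job cards instead of walking every div. The selector is an assumption built from the question's HTML (`data-tn-component="organicJob"` on the card, `h2.jobtitle` around the link), and `job_titles` is a hypothetical helper name. The real page may differ.

```python
# Locator strategy string; same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"

# Assumed from the question's markup: job cards carry data-tn-component="organicJob"
# and the title link lives in <h2 class="jobtitle">.
TITLE_LINKS = 'div[data-tn-component="organicJob"] h2.jobtitle a'


def job_titles(driver):
    """Return the visible title text of each job card on the loaded page."""
    links = driver.find_elements(CSS_SELECTOR, TITLE_LINKS)
    return [link.text.strip() for link in links]
```

Because the selector is scoped to the job cards, unrelated divs (navigation, ads, footers) never enter the loop.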

Answer 2 (score: 1)

Try the following code:

div_nodes = driver.find_elements_by_css_selector("div.row.result.clickcard")

Let me know if you need a more specific selector.
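To go from those nodes to per-job data, one possible sketch is to loop over the cards and run narrower selectors inside each one. The inner selectors (`h2.jobtitle a`, `span.company`, `span.location`, `span.summary`, `span.date`) and the `data-jk` attribute are taken from the HTML in the question. The function names `first_text` and `parse_jobs` are hypothetical, and this uses the `find_elements(by, value)` call form rather than the `find_elements_by_css_selector` helper shown above.

```python
# Locator strategy string; same value as selenium.webdriver.common.by.By.CSS_SELECTOR.
CSS_SELECTOR = "css selector"


def first_text(card, selector):
    """Text of the first element matching `selector` inside `card`, or None."""
    matches = card.find_elements(CSS_SELECTOR, selector)
    return matches[0].text.strip() if matches else None


def parse_jobs(driver):
    """Return one dict per job card found on the page loaded in `driver`."""
    jobs = []
    for card in driver.find_elements(CSS_SELECTOR, "div.row.result.clickcard"):
        jobs.append({
            "id": card.get_attribute("data-jk"),
            "title": first_text(card, "h2.jobtitle a"),
            "company": first_text(card, "span.company"),
            "location": first_text(card, "span.location"),
            "summary": first_text(card, "span.summary"),
            "posted": first_text(card, "span.date"),
        })
    return jobs
```

Using `find_elements` and checking for an empty list means a card that lacks one of the fields yields `None` for it instead of raising an exception.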