Question

I am trying to use Python / Selenium to scrape a list of variables (date, size, medium, etc.) from this page (https://www.artprice.com/artist/844/hans-arp/lots/pasts).

For the titles it was easy enough, using:

titles = driver.find_elements_by_class_name("sln_lot_show")
for title in titles:
    print(title.text)

But the other variables appear to be plain text in the source, with no identifiable id or class.

For example, to get the date made I tried:

dates_made = driver.find_elements_by_xpath("//div[@class='col-sm-6']/p[1]")
for date_made in dates_made:
    print(date_made.get_attribute("date"))

and

dates_made = driver.find_elements_by_xpath("//div[@class='col-sm-6']/p[1]/date")
for date_made in dates_made:
    print(date_made.text)

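Neither of these raises an error, but neither prints any results.

Is there some way to get at this text, which has no specific class or id?

Here is the relevant HTML: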

......

<div class="col-xs-8 col-sm-6">
  <p>
   <i><a id="sln_16564482" class="sln_lot_show" href="/artist/844/hans-arp/print-multiple/16564482/vers-le-blanc-infini" title="&quot;Vers le Blanc Infini&quot;" ng-click="send_ga_event_now('artist_past_lots_search', 'select_lot_position', 'title', {eventValue: 1})">
        "Vers le Blanc Infini"
   </a></i>
   <date>
    (1960)
   </date>
  </p>
  <p>
   Print-Multiple, Etching, aquatint,
    <span ng-show="unite_to == 'in'" class="ng-hide">15 3/4 x 18 in</span>
    <span ng-show="unite_to == 'cm'">39 x 45 cm</span>
  </p>

Answer 1

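Progressive approach: the JavaScript below returns a two-dimensional array (one row per lot, one entry per detail; indexes 0, 1, 2, 8 and 9 are the ones you want):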

lots = driver.execute_script('return [...document.querySelectorAll(".lot .row")].map(e => [...e.querySelectorAll("p")].map(e1 => e1.textContent.trim()))')
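
Once that call returns, the result is an ordinary Python list of lists, so selecting the indexes mentioned above needs no further Selenium calls. The sketch below is only an illustration: `pick_fields` is a hypothetical helper, the sample rows are made up, and which field sits at which index on the real page is not confirmed here.

```python
def pick_fields(rows, indexes=(0, 1, 2, 8, 9)):
    """Keep only the given positions from each row, skipping rows that are too short."""
    picked = []
    for row in rows:
        if len(row) > max(indexes):
            picked.append([row[i] for i in indexes])
    return picked


# Made-up rows standing in for the list returned by execute_script.
sample = [
    [f"cell {i}" for i in range(10)],
    ["too", "short"],
]

selected = pick_fields(sample)
print(selected)
```

Skipping short rows is a design choice for this sketch; depending on the page you may prefer to pad them or raise instead.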

Classic approach:

# Selenium 3 style locators; Selenium 4 uses find_element(By.XPATH, ...) instead.
lots = driver.find_elements_by_css_selector(".lot .row")
for lot in lots:
    # textContent also returns text that .text would skip because it is hidden.
    lotNo = lot.find_element_by_xpath("./div[1]/p[1]").get_attribute("textContent").strip()
    # In the HTML above the <i> is nested inside the first <p>, not directly under the <div>.
    title = lot.find_element_by_xpath("./div[2]/p[1]/i").get_attribute("textContent").strip()
    details = lot.find_element_by_xpath("./div[2]/p[2]").get_attribute("textContent").strip()
    date = lot.find_element_by_xpath("./div[3]/p[1]").get_attribute("textContent").strip()
    country = lot.find_element_by_xpath("./div[3]/p[2]").get_attribute("textContent").strip()
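
As a side note on why the attempts in the question printed nothing: `@class='col-sm-6'` compares against the whole attribute value, which on this page is `col-xs-8 col-sm-6`, and `date` is a child element rather than an attribute. The sketch below checks this offline with the standard library's ElementTree on a tidied copy of the HTML from the question (the closing `</div>` and the `<root>` wrapper are added here so it parses). In Selenium the equivalent fix would likely be an XPath such as `//div[contains(@class, 'col-sm-6')]/p[1]/date`, which is an untested suggestion for the live page.

```python
import xml.etree.ElementTree as ET

# Tidied, well-formed copy of the fragment from the question.
SNIPPET = """
<div class="col-xs-8 col-sm-6">
  <p>
    <i><a id="sln_16564482" class="sln_lot_show">"Vers le Blanc Infini"</a></i>
    <date>
      (1960)
    </date>
  </p>
  <p>
    Print-Multiple, Etching, aquatint,
    <span class="ng-hide">15 3/4 x 18 in</span>
    <span>39 x 45 cm</span>
  </p>
</div>
"""

root = ET.fromstring(f"<root>{SNIPPET}</root>")

# Exact comparison against the full class attribute matches nothing.
exact = root.findall(".//div[@class='col-sm-6']")

# Matching on one class token instead, similar in spirit to contains(@class, ...).
by_token = [d for d in root.iter("div") if "col-sm-6" in d.get("class", "").split()]

# <date> is a child element of the first <p>, so it is addressed as a tag.
date_text = by_token[0].find("./p[1]/date").text.strip()

# Keep only the size span that is not hidden via the ng-hide class.
sizes = [
    s.text
    for s in by_token[0].findall("./p[2]/span")
    if "ng-hide" not in s.get("class", "").split()
]

print(exact, date_text, sizes)
```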

How do I get text that has no class/ID using Selenium / Python?
