Question

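First of all, I hope this question isn't too general; if it is, I apologize.

I am building a web scraper with Selenium for Python 2.7. Until now I have pointed it at specific elements with "static" XPaths. I would like to implement a solution that finds elements by context (relative to other elements).

Suppose we want to get the text from the sibling element that follows the "Issuer:" label on this page: http://etfdb.com/etf/ROBO/. In this case, the adjacent text is "Exchange Traded Concepts".

From what I have gathered, several techniques could be used, including relative XPath, CSS, or the DOM (?).

Is this a sensible approach? If possible, please demonstrate with code.

The current "static" XPath, determined with FirePath for Firefox: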

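import re
from selenium.common.exceptions import NoSuchElementException

# `driver` is an already-initialized WebDriver; the `break` implies an enclosing loop that is not shown.
try:
    xpath_issuer = ".//*[@id='overview']/div/div[2]/div/div[1]/ul[1]/li[1]/span[2]/a"
    find_issuer = driver.find_element_by_xpath(xpath_issuer)
    issuer = re.search(r"(.+)", find_issuer.text).group().encode("utf-8")
    print "Issuer: %s" % issuer
    break
except NoSuchElementException:
    pass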

Answer 1

You can use an XPath expression for this.
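
As a rough sketch of the kind of relative XPath this could mean (the expression below is an assumption, not necessarily the one this answer had in mind): anchor on the label's text, then step to the first `span` sibling that follows it. The helper name `sibling_xpath` is hypothetical, and the `span` tag names are taken from the markup that Answer 2 targets.

```python
def sibling_xpath(label):
    """Build an XPath for the span that follows the span whose text is '<label>:'.

    Hypothetical helper; assumes `label` contains no single quote.
    """
    # normalize-space() ignores leading/trailing whitespace around the label text.
    return "//span[normalize-space(.)='%s:']/following-sibling::span[1]" % label


xpath_issuer = sibling_xpath("Issuer")
print(xpath_issuer)

# With a live driver, as in the question, this would be used as:
#   issuer = driver.find_element_by_xpath(xpath_issuer).text
```

Because the expression depends only on the label text and the sibling relationship, it should survive layout changes that would break a positional path such as `div[2]/div/div[1]`.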

Answer 2

I would create a reusable function to get a value by its key:

"Exchange Traded Concepts" by "Issuer"
"ETF" by "Structure", etc.

Working implementation example:

from selenium.common.exceptions import NoSuchElementException

def get_value(driver, key):
    # Labels on the page carry a trailing colon, e.g. "Issuer:".
    key = key + ":"
    try:
        # Match the label span by its exact text, then step to the span sibling that follows it.
        return driver.find_element_by_xpath("//span[@class='minimal-list__title' and . = '%s']/following-sibling::span" % key).text
    except NoSuchElementException:
        print "Not Found"
        return None
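
To try the same "value by key" idea without a browser, here is a minimal offline sketch using only the standard library. It is not the Selenium code above: the `SAMPLE` markup is a made-up fragment that merely imitates the label/value structure, and `get_value_offline` is a hypothetical name. `xml.etree.ElementTree` supports only a subset of XPath with no `following-sibling` axis, so the sibling step is done in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment imitating the page's label/value spans.
SAMPLE = """
<ul>
  <li><span class="minimal-list__title">Issuer:</span><span><a>Exchange Traded Concepts</a></span></li>
  <li><span class="minimal-list__title">Structure:</span><span>ETF</span></li>
</ul>
""".strip()


def get_value_offline(root, key):
    """Return the text of the element right after the span labelled '<key>:'."""
    wanted = key + ":"
    for parent in root.iter():
        children = list(parent)
        # Look at every child that still has a following sibling.
        for index, child in enumerate(children[:-1]):
            if child.tag == "span" and (child.text or "").strip() == wanted:
                # Collect all nested text of the next sibling (e.g. inside <a>).
                return "".join(children[index + 1].itertext()).strip()
    return None


root = ET.fromstring(SAMPLE)
print(get_value_offline(root, "Issuer"))
print(get_value_offline(root, "Structure"))
```

A key that is not present yields `None`, mirroring the `NoSuchElementException` branch of the Selenium version.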

Selenium: Finding adjacent elements
