Question

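How can I get an XPath that extracts the title from this line of HTML?

The class attribute is of no use, because the CSS class changes over time and the code could break. Since both the href and the text in this tag are the name I want to extract, I think an equality condition could be used.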

<a class="FPmhX notranslate nJAzx" title="ceorackz_adpp" href="/ceorackz_adpp/">ceorackz_adpp</a>

I would like a Selenium API call, or plain regex-compatible Python code, that gets the title or the text of this anchor tag.

Answer 1

Use any of the following XPath expressions:

//a[@title='ceorackz_adpp']

//a[text()='ceorackz_adpp']

//a[@title='ceorackz_adpp' and text()='ceorackz_adpp']
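As a minimal sketch of how these expressions might be used from Python, the helpers below build the XPath strings for a given name and pass one of them to Selenium's `find_element`. The function names are hypothetical, `driver` is assumed to be an already started WebDriver, and the name is assumed not to contain a single quote.

```python
def xpath_by_title(name: str) -> str:
    """XPath matching an anchor by its title attribute."""
    return f"//a[@title='{name}']"


def xpath_by_title_and_text(name: str) -> str:
    """XPath matching an anchor whose title and text both equal name."""
    return f"//a[@title='{name}' and text()='{name}']"


def read_anchor(driver, name: str):
    """Return (title, text) of the first matching anchor.

    driver is assumed to be an already started Selenium WebDriver.
    """
    # Imported here so the XPath helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    element = driver.find_element(By.XPATH, xpath_by_title_and_text(name))
    return element.get_attribute("title"), element.text


print(xpath_by_title("ceorackz_adpp"))
print(xpath_by_title_and_text("ceorackz_adpp"))
```

None of these expressions refer to the class attribute, so they should keep working when the generated class names change.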

Answer 2

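To extract the title, i.e. ceorackz_adpp, from the element, use WebDriverWait with visibility_of_element_located() so that the element is visible before it is read. Any of the following solutions can be used: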

Using CSS_SELECTOR:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a.notranslate[href='/ceorackz_adpp/']"))).get_attribute("title"))

Using LINK_TEXT:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.LINK_TEXT, "ceorackz_adpp"))).get_attribute("title"))

Using XPATH:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[contains(@class, 'notranslate') and @href='/ceorackz_adpp/']"))).get_attribute("title"))

Note: you have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
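Two of the locators above still mention the `notranslate` class, which the question says may change. The following is a sketch, not part of the original answer, of a wait that matches on `title` and `href` only and returns `None` instead of raising when the element does not appear. The function names and the `None` fallback are assumptions made for illustration.

```python
def class_free_xpath(name: str) -> str:
    """XPath that matches on title and href only, ignoring the class attribute."""
    return f"//a[@title='{name}' and @href='/{name}/']"


def wait_for_title(driver, name: str, timeout: float = 20):
    """Return the anchor's title, or None if it is not visible within timeout seconds.

    driver is assumed to be an already started Selenium WebDriver.
    """
    # Selenium is imported lazily so class_free_xpath() works without it.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, class_free_xpath(name))
    try:
        element = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located(locator)
        )
    except TimeoutException:
        return None
    return element.get_attribute("title")


print(class_free_xpath("ceorackz_adpp"))
```

A call such as `wait_for_title(driver, "ceorackz_adpp")` would then stand in for any of the one-liners above.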

Answer 3

Right-click the element in the Inspect panel, go to its HTML, and then use:

Copy > Copy XPath

Answer 4

I'm not entirely sure, but an expression along these lines might be a starting point:

title="([^"]+)"[^>]*>\s*(.+?)\s*<

Group 1 captures the title attribute and group 2 captures the link text. The [^>]* part skips any attributes that sit between title and the end of the opening tag, such as href in this example.

Test

import re

# Group 1 captures the title attribute; group 2 captures the link text.
# [^>]* skips any attributes between title and the end of the opening tag.
regex = r'title="([^"]+)"[^>]*>\s*(.+?)\s*<'

test_str = '<a class="FPmhX notranslate nJAzx" title="ceorackz_adpp" href="/ceorackz_adpp/">ceorackz_adpp</a>'

for match_num, match in enumerate(re.finditer(regex, test_str, re.DOTALL), start=1):
    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Groups are numbered from 1; group 0 is the whole match.
    for group_num in range(1, len(match.groups()) + 1):
        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {match.group(group_num)}")
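A regular expression over HTML tends to depend on attribute order and quoting. As an alternative that is not part of this answer, the sketch below uses the standard library's `html.parser` to read the title, href and text of each anchor. The class name `AnchorCollector` and the variable names are hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the title, href and text of every <a> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.anchors = []     # one dict per completed <a> element
        self._current = None  # anchor currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            self._current = {
                "title": attributes.get("title"),
                "href": attributes.get("href"),
                "text": "",
            }

    def handle_data(self, data):
        # Only text that appears inside an open <a> tag is kept.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.anchors.append(self._current)
            self._current = None


snippet = '<a class="FPmhX notranslate nJAzx" title="ceorackz_adpp" href="/ceorackz_adpp/">ceorackz_adpp</a>'

collector = AnchorCollector()
collector.feed(snippet)
collector.close()

for anchor in collector.anchors:
    print(anchor["title"], anchor["text"])
```

Because the attributes are read by name, this does not depend on the class value or on where title appears inside the tag.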
