Is there a way to reverse-look-up an element's XPath using Selenium?

Time: 2019-09-09 22:49:11

Tags: python selenium selenium-webdriver web-scraping

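This may sound silly, but after finding an element through Selenium, I need to know its XPath. The reason is that I located the element with a text search, so I do not know the exact XPath that I could use to get the element's siblings. Even if this cannot be done with Selenium itself, an indirect route such as BeautifulSoup would still be great.

The current output of my program is as follows: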

(Pdb) browser.find_elements_by_xpath('//*[contains(text(), "5StarMAX")]')

[<selenium.webdriver.remote.webelement.WebElement (session="8994add1f6f087a917bbb33f69f15f7c", element="7bad823c-1f3
e-445b-9a47-6d934fcacb8a")>, <selenium.webdriver.remote.webelement.WebElement (session="8994add1f6f087a917bbb33f69f1
5f7c", element="551df9b7-d2cb-4021-bb30-a2723c835adf")>, <selenium.webdriver.remote.webelement.WebElement (session="
8994add1f6f087a917bbb33f69f15f7c", element="b864c44b-8220-4010-843c-fbf0cfa1ba13")>, <selenium.webdriver.remote.webe
lement.WebElement (session="8994add1f6f087a917bbb33f69f15f7c", element="59d9f40d-d318-4e0d-9ab1-aa9df42d037c")>, <se
lenium.webdriver.remote.webelement.WebElement (session="8994add1f6f087a917bbb33f69f15f7c", element="260795bd-e7c6-43
b1-a8f0-10b36eb69787")>, <selenium.webdriver.remote.webelement.WebElement (session="8994add1f6f087a917bbb33f69f15f7c
", element="4e46be00-4578-4741-adc9-a5b6fc67a3e9")>, <selenium.webdriver.remote.webelement.WebElement (session="8994
add1f6f087a917bbb33f69f15f7c", element="df66abcb-bd99-4670-af07-404c085afb28")>]

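As you can see, I have found the elements, but I want to search for their siblings programmatically (using Python). As far as I can tell (after trying to locate them everywhere on the page with the developer tools), there is no way to know the XPath of the element itself.

Thanks for any tips/suggestions.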

2 answers:

Answer 0: (score: 2)

Try using both BeautifulSoup and Selenium.

You can try xpath_soup() from this solution.

Option A

BeautifulSoup

import re
import itertools
from bs4 import BeautifulSoup

html = '<html><body><div><p>Hello World</p></div></body></html>'
soup = BeautifulSoup(html, 'lxml')
# xpath_soup() is not defined here; it comes from the linked solution
elem = soup.find(string=re.compile('Hello World'))
xpath_soup(elem)

Output

'/html/body/div/p'

Option B

Selenium: refer to the source. You may need to make a few changes here and there to make it fit your case.
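For the Selenium side, one common approach is to let the browser build the path: pass the already-located WebElement to execute_script() and walk up parentNode in JavaScript. The following is a minimal sketch, not code taken from the linked solution. The names ABSOLUTE_XPATH_JS and element_xpath are made up for illustration, and the script assumes a plain HTML document (no SVG or other namespaced elements, no shadow DOM, no iframes).

```python
# JavaScript run inside the browser: builds an absolute, index-based XPath
# for the element passed in as arguments[0].
ABSOLUTE_XPATH_JS = """
var el = arguments[0];
var parts = [];
while (el && el.nodeType === Node.ELEMENT_NODE) {
    var index = 1;  // XPath positions are 1-based
    for (var sib = el.previousElementSibling; sib; sib = sib.previousElementSibling) {
        if (sib.tagName === el.tagName) { index++; }
    }
    parts.unshift(el.tagName.toLowerCase() + '[' + index + ']');
    el = el.parentNode;
}
return '/' + parts.join('/');
"""


def element_xpath(driver, element):
    """Return an absolute XPath for a WebElement (hypothetical helper).

    `driver` is a Selenium WebDriver and `element` is a WebElement that the
    driver has already located, for example through a text search.
    """
    return driver.execute_script(ABSOLUTE_XPATH_JS, element)
```

With the variables from the question, this would be called on one item of the result list, for example element_xpath(browser, elements[0]) if the list is stored in a variable named elements. The result is a fully indexed path starting with /html[1]/, which is precise but brittle if the page layout changes.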

Also check the following:

  1. https://qxf2.com/blog/auto-generate-xpaths-using-python/
  2. https://gist.github.com/ergoithz/6cf043e3fdedd1b94fcf
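Helpers of this kind generally work by walking from the node up to the root and recording each tag, adding a 1-based position when a parent has several children with the same tag. The sketch below illustrates that idea with the standard library's xml.etree.ElementTree instead of BeautifulSoup, so it is a stand-in and not the xpath_soup() function mentioned above, and it only handles well-formed markup. The function name xpath_of and the sample markup are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def xpath_of(root, target):
    """Build an absolute XPath for `target` inside the tree rooted at `root`."""
    # ElementTree nodes have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parts = []
    node = target
    while node is not None:
        parent = parent_of.get(node)
        if parent is None:
            parts.append(node.tag)  # reached the root element
        else:
            same_tag = [c for c in parent if c.tag == node.tag]
            if len(same_tag) == 1:
                parts.append(node.tag)
            else:
                # Disambiguate with a 1-based position among same-tag siblings.
                parts.append("%s[%d]" % (node.tag, same_tag.index(node) + 1))
        node = parent
    return "/" + "/".join(reversed(parts))


# Sample markup (an assumption for this example, not the page from the question).
html = "<html><body><div><p>Hello World</p><p>5StarMAX</p></div></body></html>"
root = ET.fromstring(html)
target = next(e for e in root.iter() if e.text and "5StarMAX" in e.text)
print(xpath_of(root, target))  # /html/body/div/p[2]
```

The text search plays the role of the contains(text(), ...) lookup from the question; once the node is known, the path is derived purely from its position in the tree.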

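Answer 1: (score: 1)

Once you have found the parent element, you do not need its XPath. You can chain .find_element_*() calls. For example, the code below finds the element containing the text "5StarMAX", and then the second find call locates a child DIV of that first element.

Note that the second XPath starts with a dot (.). This means the search starts from the first element, element.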

element = browser.find_element_by_xpath('//*[contains(text(), "5StarMAX")]')
element.find_element_by_xpath('./div')

In case it helps... combining the two lookups into a single XPath expression would look like

//*[contains(text(), "5StarMAX")]/div
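Because the question asks for siblings rather than children, the same chaining idea can be applied with XPath's sibling axes. A minimal sketch follows. sibling_elements is a hypothetical helper name, and the locator strategy is passed as the plain string "xpath", which is the value of Selenium's By.XPATH constant, so the snippet does not need to import Selenium itself. It uses find_elements(by, value), the form that later Selenium 4 releases require now that the find_element_by_* helpers have been removed.

```python
# Relative XPath: the leading "." anchors both axes at the located element.
SIBLINGS_XPATH = "./preceding-sibling::* | ./following-sibling::*"


def sibling_elements(element):
    """Return the element siblings of an already-located WebElement."""
    return element.find_elements("xpath", SIBLINGS_XPATH)


# Intended usage with a live browser session (not executed here):
# element = browser.find_element("xpath", '//*[contains(text(), "5StarMAX")]')
# siblings = sibling_elements(element)
```

To restrict the result to siblings on one side only, keep just one of the two axes in the expression.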