Unable to find specific text in a meta tag using Selenium and BeautifulSoup

Asked: 2018-08-08 21:46:04

Tags: python html selenium selenium-webdriver beautifulsoup

<div class="product-grid-item  ">

    <meta itemprop="url" content="https://undefeated.com/products/air-humara-17-qs-silver-carotene-bluespark-black">

The markup above is what I want to search; the word "humara" appears in the content attribute.

from Utilities import custom_logger as cl
from Base.basepage import BasePage
import logging
from bs4 import BeautifulSoup


class NewShoePage(BasePage):

    log = cl.customLogger(logging.DEBUG)

    def __init__(self, driver):
        super().__init__(driver)
        self.driver = driver

    def searchForKeywords(self, keywords):
        html = self.driver.page_source
        soup = BeautifulSoup(html, 'html.parser')
        keywordsFound = soup.find_all('meta', content=keywords)
        print(keywordsFound)

When I call:

searchForKeywords('humara')

it does not print any matches.

I want to find the word 'humara' in the content attribute of a meta tag on the page, but nothing is returned. Once that works, I want to navigate to that link.

2 answers:

Answer 0 (score: 0)

If you are already using Selenium, you don't need BeautifulSoup.

elements = driver.find_element_by_partial_link_text('nike')

Answer 1 (score: 0)

import re  # needed for re.compile below

def searchForKeywords(self, keywords):
    html = self.driver.page_source
    soup = BeautifulSoup(html, 'html.parser')
    # A compiled regex matches anywhere inside the attribute value
    possibleUrls = soup.find_all('meta', content=re.compile(keywords))
    for meta in possibleUrls:
        print(meta['content'])

The code above returns:

https://undefeated.com/products/air-humara-17-qs-silver-carotene-bluespark-black

Figured it out: I had to use re.compile, then loop over the matching URLs and print them.
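The reason re.compile helps is that BeautifulSoup treats a plain string attribute filter as an exact match on the whole value, while a compiled pattern is searched for anywhere inside it. Here is a small self-contained sketch of that difference, using an inline copy of the markup from the question. The helper name `find_meta_urls` is hypothetical, and `re.escape` is an addition so that the keyword is treated as literal text.

```python
import re

from bs4 import BeautifulSoup

# Inline copy of the markup from the question, so no browser is needed.
HTML = """
<div class="product-grid-item">
    <meta itemprop="url" content="https://undefeated.com/products/air-humara-17-qs-silver-carotene-bluespark-black">
</div>
"""


def find_meta_urls(html, keyword):
    """Return content values of <meta> tags whose content contains keyword."""
    soup = BeautifulSoup(html, "html.parser")
    # re.escape makes the keyword match literally, even if it has regex characters.
    pattern = re.compile(re.escape(keyword))
    return [meta["content"] for meta in soup.find_all("meta", content=pattern)]


soup = BeautifulSoup(HTML, "html.parser")

# String filter: the whole attribute value would have to equal "humara".
exact = soup.find_all("meta", content="humara")

# Regex filter: "humara" only has to occur somewhere in the value.
urls = find_meta_urls(HTML, "humara")

print(exact)
print(urls)
```

To follow the link afterwards, as the question asks, the first returned URL could be passed to `self.driver.get(...)` inside the page object.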