Question

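I am trying to build a price scraper for several websites, but I am running into a problem with one particular site. When I inspect the price in the browser, the decimal part is shown as "56", but when I download the HTML with BeautifulSoup it returns 16. The same problem occurs for other products.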

You can see my code so far below:

from bs4 import BeautifulSoup
import requests

myProxy = {
    "http": "http://10.120.118.49:8080",
    "https": "https://10.120.118.49:8080",
}

url = 'https://shop.rewe.de/coca-cola-4x1-5l/PD6731201'

page = requests.get(url, proxies = myProxy)
soup = BeautifulSoup(page.text, 'html.parser')

predecimal = soup.find('span', attrs={'class': 'pd-price__predecimal'})
predec = predecimal.text.strip()

separator = soup.find('span', attrs={'class': 'pd-price__separator'})
sep = separator.text.strip()

decimal = soup.find('span', attrs={'class': 'pd-price__decimal'})
dec = decimal.text.strip()

price = str(predec) + str(sep) + str(dec)

print(price)

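The code above returns 5,16, while the price shown on the website is 5,56. For some other websites I have used Selenium successfully, but in this case it still returns the same number. Any help would be greatly appreciated!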

Answer 1

As mentioned, I also tried Selenium with the following code:

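from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()  # assumption: the post does not show which browser driver was used

driver.get('https://shop.rewe.de/coca-cola-4x1-5l/PD6731201')
xpath = '//*[@class="pd-PriceInformation"]/mark'
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, xpath)))
price = driver.find_elements(By.XPATH, xpath)  # find_elements_by_xpath was removed in Selenium 4
for p in price:
    print(p.text)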

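This returns "ab 5,16€", so it is essentially the same result as with BeautifulSoup. When I try the same code on https://shop.rewe.de/vio-bio-limo-orange-4x1l/PD2455462, it returns "ab 5,96€", which is also wrong, because the price in that case is 5,56 as well.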
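To compare what the scraper returns against the price seen in the browser, it may help to turn strings such as "5,16" or "ab 5,16€" into numbers first. The following is a minimal sketch, not part of the original post: `parse_price` is a hypothetical helper name, and it assumes German-style prices below 1000 with exactly two decimal digits and no thousands separator.

```python
import re
from decimal import Decimal

# Hypothetical helper for illustration; assumes prices like "5,16" or "ab 5,16€"
# (two decimal digits, no thousands separator).
_PRICE_RE = re.compile(r"(\d+)\s*[,.]\s*(\d{2})")


def parse_price(text):
    """Return the first price found in text as a Decimal, or None if there is none."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    euros, cents = match.groups()
    # Decimal avoids the rounding surprises of float when comparing prices.
    return Decimal(f"{euros}.{cents}")


scraped = parse_price("ab 5,16€")   # text as returned by the scraper
expected = parse_price("5,56")      # price as shown in the browser
print(scraped, expected, scraped == expected)
```

With both values as `Decimal`, a mismatch between the scraped and the displayed price can be detected programmatically for each product instead of by eye.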

The inspected element differs from what BeautifulSoup / Selenium returns

1 answer: