Question

I need help scraping this site. I'm using BeautifulSoup and requests, but I can't get the values shown in the picture.

import requests
from bs4 import BeautifulSoup

my_url = 'https://partneredge.sap.com/content/partnerfinder/search.html#/'
page = requests.get(my_url)
page_soup = BeautifulSoup(page.content, "lxml")

trazenje = page_soup.find_all('header.search-result__head')
print(trazenje)

I get an empty list, with no errors!

Link to site
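A side note on the snippet above, separate from the question of how the page is rendered: find_all() interprets its first argument as a tag name, not as a CSS selector, so 'header.search-result__head' can never match. The sketch below illustrates the difference on a made-up HTML fragment. SAMPLE_HTML is an assumption for illustration, not the real markup of the site, and the standard-library "html.parser" backend is used instead of "lxml" only to keep the example dependency-light.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for markup as a browser might show it after JavaScript has run.
SAMPLE_HTML = """
<article>
  <header class="search-result__head"><a href="#">Accenture</a></header>
</article>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() treats the string as a tag name, so this looks for a tag
# literally called "header.search-result__head" and finds nothing.
by_tag_name = soup.find_all("header.search-result__head")

# select() takes a CSS selector: <header> elements with that class.
by_css = soup.select("header.search-result__head")

# The find_all() equivalent: tag name plus the class_ keyword.
by_class = soup.find_all("header", class_="search-result__head")

print(len(by_tag_name), len(by_css), len(by_class))  # prints: 0 1 1
```

Even with select(), the list will likely stay empty when the HTML comes from requests alone, if the results are inserted by JavaScript after the page loads.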

Answer 1

As @abarnert mentioned, you will probably need to use something like the Selenium Python bindings to get at that content:

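from selenium import webdriver
from selenium.webdriver.common.by import By

# Start a real Chrome browser so the page's JavaScript actually runs
driver = webdriver.Chrome()
driver.get('https://partneredge.sap.com/content/partnerfinder/search.html#/')

# find_elements_by_xpath() was removed in recent Selenium 4 releases;
# find_elements(By.XPATH, ...) works in both old and new versions
links = driver.find_elements(By.XPATH, '//article//header//a')

# Collect the visible text of each partner link
results = [tag.text for tag in links]
print(results)

driver.quit()  # close the browser when done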

This produces the following output:

['Accenture', 'Capgemini AB', 'Deloitte Inc.', 'IBM Corporation International Technical', 'itelligence AG', 'SEIDOR, S.A.', 'GAVDI A/S', 'Navigator Business Solutions, Inc.', 'Delaware Consulting US Inc.', 'Ernst & Young LLP']

I'd say that if speed is a factor, this option is slow, but it is easy to set up.
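Because the results are filled in by JavaScript after the page opens, a lookup made immediately after get() may occasionally run before the links exist. Here is a minimal sketch of one way to guard against that, assuming Chrome and a matching driver are available: it runs the browser headless and waits explicitly for the links. The names fetch_partner_names and extract_names and the 15-second timeout are illustrative choices, not part of the original answer.

```python
URL = "https://partneredge.sap.com/content/partnerfinder/search.html#/"
LINK_XPATH = "//article//header//a"  # same XPath as in the answer above


def extract_names(elements):
    """Return the stripped, non-empty .text of each element."""
    return [el.text.strip() for el in elements if el.text.strip()]


def fetch_partner_names(url=URL, timeout=15):
    """Open the page headlessly, wait for the links, and return their text."""
    # Imported here so extract_names() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # no visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one matching link is present;
        # raises TimeoutException if none appears within `timeout` seconds.
        links = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.XPATH, LINK_XPATH))
        )
        return extract_names(links)
    finally:
        driver.quit()  # always release the browser process
```

Calling fetch_partner_names() should then return a list of partner names like the one shown above, provided the page structure still matches the XPath.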

How to scrape jQuery code with Python 3.6?
