I have tried various approaches to scrape data from angel.co, but every attempt returns an empty list and I still cannot extract the data. One attempt:
results = self.driver.find_elements_by_css_selector(".results > div")
for result in results:
    name = result.find_element_by_css_selector(".name")
    print(name.text)
Another attempt is:
soup = BeautifulSoup(response.body)
val = soup.findAll('div.name')
for post in response.xpath('.//div[@class="base startup"]'):
    item = {}
    item['title'] = post.xpath('.//div[@class="name"]//text()').extract()[0]
    print(item)
These are all the approaches I have tried. If you have any other suggestions, please help me scrape the page.
The link to the complete spider is here.
Answer 0 (score: 0)
You need to wait for the search results to load before extracting them:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(self.driver, 10)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".startup")))
results = self.driver.find_elements_by_css_selector(".results > div")
for result in results:
    name = result.find_element_by_css_selector(".name")
    print(name.text)
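
For reference, here is a minimal self-contained sketch of the same wait-then-extract pattern outside a spider class. The URL and the headless Chrome setup are assumptions for illustration; also note that the find_elements_by_css_selector helpers were removed in Selenium 4, so the sketch uses find_elements(By.CSS_SELECTOR, ...) instead.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Placeholder URL for the search page being scraped (assumption)
URL = "https://angel.co/companies"

options = webdriver.ChromeOptions()
options.add_argument("--headless")  # run without a visible browser window
driver = webdriver.Chrome(options=options)

try:
    driver.get(URL)

    # Block until at least one result card is visible (up to 10 seconds),
    # so the JavaScript-rendered list has time to load.
    wait = WebDriverWait(driver, 10)
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".startup")))

    # Selenium 4 style: find_elements(By.CSS_SELECTOR, ...) replaces the
    # removed find_elements_by_css_selector helper.
    for result in driver.find_elements(By.CSS_SELECTOR, ".results > div"):
        name = result.find_element(By.CSS_SELECTOR, ".name")
        print(name.text)
finally:
    driver.quit()

The key point is the explicit wait: without it, the driver queries the DOM before the dynamically loaded results exist, which is why the earlier attempts returned empty lists.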