Unable to scrape data with Scrapy

Date: 2016-07-17 17:12:53

Tags: python selenium xpath web-scraping scrapy

I have tried various approaches to scrape data from angel.co, but every attempt returns an empty list and no data is extracted.

results = self.driver.find_elements_by_css_selector(".results > div")
for result in results:
    name = result.find_element_by_css_selector(".name")
    print(name.text)

Another attempt was:

soup = BeautifulSoup(response.body)
val = soup.findAll('div.name')

And a plain Scrapy version:

for post in response.xpath('.//div[@class="base startup"]'):
    item = {}
    item['title'] = post.xpath('.//div[@class="name"]//text()').extract()[0]
    print item

These are all the approaches I have tried. If you have any other suggestion, please help me scrape the page. The link to the complete spider is here.

1 Answer:

Answer 0: (score: 0)

You need to wait for the search results to load before extracting them:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

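# Wait up to 10 seconds for at least one result card (".startup") to become visible.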
wait = WebDriverWait(self.driver, 10)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".startup")))

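# The results are now in the DOM; collect each card and print the name inside it.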
results = self.driver.find_elements_by_css_selector(".results > div")
for result in results:
    name = result.find_element_by_css_selector(".name")
    print(name.text)
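
If you want to keep this inside a Scrapy spider, here is a minimal sketch of one way to wire it up. The spider name AngelSpider, the start URL, and the use of a local Chrome driver are illustrative assumptions, not taken from the question; only the selectors and the wait come from the answer above:

import scrapy
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


class AngelSpider(scrapy.Spider):
    # hypothetical spider name and start URL; adapt to your project
    name = "angel"
    start_urls = ["https://angel.co/companies"]

    def __init__(self, *args, **kwargs):
        super(AngelSpider, self).__init__(*args, **kwargs)
        self.driver = webdriver.Chrome()  # assumes a local Chrome/chromedriver

    def parse(self, response):
        # Load the page in the real browser so the JavaScript-rendered results appear,
        # then wait for the result cards before reading them.
        self.driver.get(response.url)
        wait = WebDriverWait(self.driver, 10)
        wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".startup")))

        for result in self.driver.find_elements_by_css_selector(".results > div"):
            name = result.find_element_by_css_selector(".name")
            yield {"title": name.text}

    def closed(self, reason):
        # Shut the browser down when the spider finishes.
        self.driver.quit()

The only difference from a plain Scrapy spider is that the HTML is read from the Selenium-driven browser instead of from response.body, which is why the selectors that returned an empty list before can now find the rendered elements.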