BeautifulSoup parsing returns None

Time: 2016-07-30 17:21:45

Tags: python parsing web-scraping beautifulsoup web-crawler

When I request a value from a website, many sites return only None.

Example:

import requests
from bs4 import BeautifulSoup

def spider():
    url = 'https://poloniex.com/exchange'
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, "html.parser")
    High = soup.findAll("div", {"class": "high info"})[0].string
    print(High)
    # returns None

spider()

How can I solve this? Please, all I need is a single value.

2 answers:

Answer 0 (score: 0)

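The page fills in its content with JavaScript, so requests does not return the complete result (in this case the JS code has to run before the page is complete).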

I use Selenium for this kind of problem.
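To see why .string comes back as None, here is a minimal sketch that needs no network access. The two HTML strings are made up for illustration (the real Poloniex markup may well differ): one stands in for what requests receives before any JavaScript runs, the other for what a browser holds after rendering. The helper name first_div_string is hypothetical too.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the div exists in the raw HTML but is still empty.
STATIC_HTML = '<div class="high info"></div>'
# Hypothetical markup after JavaScript has filled in the value.
RENDERED_HTML = '<div class="high info">0.02410000</div>'


def first_div_string(html, css_class):
    """Return .string of the first matching div, or None if nothing matches."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.find_all("div", {"class": css_class})
    if not matches:
        return None
    # .string is None for a tag with no children, which is what the
    # question runs into when the div has not been populated yet.
    return matches[0].string


print(first_div_string(STATIC_HTML, "high info"))
print(first_div_string(RENDERED_HTML, "high info"))
```

If printing source_code.text from the question shows the div with nothing inside it, the value is being added later by a script, and the parser is not at fault.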

Answer 1 (score: 0)

Download chromedriver from this link: http://chromedriver.storage.googleapis.com/index.html?path=2.24/ then unzip it and put chromedriver.exe in C:\Python27\Scripts

Try this code:

from selenium import webdriver
import time
from bs4 import BeautifulSoup


driver = webdriver.Chrome()
url= "https://poloniex.com/exchange"
driver.maximize_window()
driver.get(url)

time.sleep(5)
content = driver.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content,"html.parser")
High = soup.findAll("div", {"class": "high info"})[0].string
print(High)
driver.quit()

It will print:

0.02410000
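A fixed time.sleep(5) can be too short on a slow connection and wasteful on a fast one. One possible alternative, shown here only as a rough sketch, is to poll until the value shows up or a timeout expires. The helper poll_until and its parameters are hypothetical names, not part of Selenium or BeautifulSoup.

```python
import time


def poll_until(fetch, timeout=10.0, interval=0.5):
    """Call fetch() repeatedly until it returns a truthy value.

    Returns that value, or None if the timeout expires first.
    """
    deadline = time.time() + timeout
    while True:
        value = fetch()
        if value:
            return value
        if time.time() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage with the driver from the code above, in place of
# time.sleep(5):
#
# def read_high():
#     soup = BeautifulSoup(driver.page_source, "html.parser")
#     divs = soup.findAll("div", {"class": "high info"})
#     return divs[0].string if divs else None
#
# High = poll_until(read_high, timeout=15.0)
```

Selenium also ships its own explicit-wait helper (WebDriverWait), which may be the better fit if you query elements through the driver rather than through BeautifulSoup.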

Hope this helps.