findall error - 'NoneType' object has no attribute 'findall'

Asked: 2017-07-26 03:45:45

Tags: python python-3.x web-scraping beautifulsoup web-crawler

I keep getting the error "missing 1 required positional argument: 'section_url'".

I get this error whenever I try to use findall.

I'm new to Python, so any help is much appreciated!

from bs4 import BeautifulSoup
import urllib3


def extract_data():

    BASE_URL = "http://www.chicagotribune.com/dining/ct-chicago-rooftops-patios-eat-drink-outdoors-near-me-story.html"

    http = urllib3.PoolManager()
    r = http.request('GET', 'http://www.chicagotribune.com/dining/ct-chicago-rooftops-patios-eat-drink-outdoors-near-me-story.html')
    soup = BeautifulSoup(r.data, 'html.parser')
    heading = soup.find("div", "strong")
    category_links = [BASE_URL + p.a['href'] for p in heading.findAll('p')]
    return category_links
    print(soup)


extract_data()

2 answers:

Answer 0 (score: 1)

Building on the accepted answer, I think this is what you want:

from urllib.parse import urljoin

from bs4 import BeautifulSoup
import urllib3

def extract_data():

    BASE_URL = "http://www.chicagotribune.com/dining/ct-chicago-rooftops-patios-eat-drink-outdoors-near-me-story.html"

    http = urllib3.PoolManager()
    r = http.request('GET', BASE_URL)
    soup = BeautifulSoup(r.data, 'html.parser')
    # select() returns a list (possibly empty) of <strong> tags nested inside a <div>
    heading = soup.select('div strong')
    print(heading)  # debug output: shows which tags matched
    # Assuming the links sit inside those tags: collect each <a href=...>
    # and resolve relative hrefs against the page URL
    category_links = [urljoin(BASE_URL, a['href']) for tag in heading for a in tag.find_all('a', href=True)]
    return category_links


print(extract_data())

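Answer 1 (score: 0)

In general, a "NoneType object has no attribute" error means that an upstream function returned None, and you then tried to access one of its methods without checking for that: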

stuff = get_stuff()  # this returns None
stuff.do_stuff()  # this crashes
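
As a minimal sketch of that check applied to BeautifulSoup: find returns None when nothing matches, so the result can be tested before any method is called on it. The HTML snippet and the helper name paragraph_texts are made up for illustration; they are not taken from the page in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML: note there is no <div class="strong"> in it.
SAMPLE_HTML = "<div><strong><p>Rooftops</p></strong></div>"


def paragraph_texts(html):
    """Return the text of each <p> inside <div class="strong">, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # A string passed as the second argument to find() is matched against the CSS class.
    heading = soup.find("div", "strong")
    if heading is None:
        # Nothing matched; calling heading.find_all() here would raise AttributeError.
        return []
    return [p.get_text() for p in heading.find_all("p")]


print(paragraph_texts(SAMPLE_HTML))
```

Returning an empty list is just one way to handle the missing element; raising a descriptive exception would work equally well if a missing heading should count as a failure.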

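Most likely, the library could not find the heading with soup.find. Try soup.select('div.strong') instead; select returns a list (empty when nothing matches) rather than None.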
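
To make the selector choice concrete, here is a small sketch on made-up markup (not the actual page) comparing 'div.strong', which matches a div whose class is "strong", with 'div strong', which matches a strong tag nested inside a div. The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<div class="strong"><a href="/a">by class</a></div>
<div><strong><a href="/b">by tag</a></strong></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

by_class = soup.select("div.strong")   # <div> elements whose class is "strong"
by_tag = soup.select("div strong")     # <strong> elements nested inside a <div>
no_match = soup.select("div.missing")  # no match: an empty list, not None

class_links = [tag.a["href"] for tag in by_class]
tag_links = [tag.a["href"] for tag in by_tag]
print(class_links, tag_links, no_match)
```

Which selector is right depends on the page's real markup, so it is worth printing the result of select before building on it.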

More on selectors: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors

More on NoneType: https://docs.python.org/3.6/library/constants.html#None