Question

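What I'm trying to do:

I'm trying to write a script that scrapes a website for product information.

Currently, the program uses a for loop to scrape product prices and unique IDs.

The for loop contains two if statements to keep it from scraping NoneTypes.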

import requests
from bs4 import BeautifulSoup


def average(price_list):
    return sum(price_list) / len(price_list)


# Requests search data from Website
page_link = 'URL'
page_response = requests.get(page_link, timeout=5)  # gets the webpage (search) from Website
page_content = BeautifulSoup(page_response.content, 'html.parser')  # turns the webpage it just retrieved into a BeautifulSoup-object

# Selects the product listings from page content so we can work with these
product_listings = page_content.find_all("div", {"class": "unit flex align-items-stretch result-item"})

prices = []  # Creates a list to add the prices to
uids = [] # Creates a list to store the unique ids

for product in product_listings:

    # UIDs
    if product.find('a')['id'] is not None:
        uid = product.find('a')['id']
        uids.append(uid)

    # Prices
    if product.find('p', class_ = 'result-price man milk word-break') is not None:# assures that the loop only finds the prices
        price = int(product.p.text[:-2].replace(u'\xa0', ''))  # makes a temporary variable where the last two chars of the string (,-) and whitespace are removed, turns into int
        prices.append(price)  # adds the price to the list

The problem:

On if product.find('a')['id'] is not None: I get Exception has occurred: TypeError 'NoneType' object is not subscriptable.

However, if I run print(product.find('a')['id']), I get the value I'm after, which confuses me a lot. Doesn't that mean the value is not a NoneType?

Also, if product.find('p', class_ = 'result-price man milk word-break') is not None: works flawlessly.

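What I've tried:

I tried assigning the product.find('p', class_ = 'result-price man milk word-break') expression to a variable and then using it in the for loop, but that didn't work. I've also done my fair share of Googling, without success. Part of the problem may be that I'm fairly new to programming and don't know exactly what to search for. I did find plenty of answers to seemingly related questions, but none of them worked for my code.

Any help would be greatly appreciated!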

Answer 1

Just add an intermediate step:

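res = product.find('a')  # None when the listing contains no <a> tag

# .get('id') returns None for a missing attribute instead of raising KeyError
if res is not None and res.get('id') is not None:
    uid = res['id']  # reuse the tag that was already found
    uids.append(uid)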

This way, if find returns None because the element wasn't found, you never end up trying to subscript a NoneType.
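As for why print appeared to work: a likely explanation (I can't see the real page, so this is a guess) is that the earlier listings do contain an a tag with an id, so print shows values for those, and the error only comes from a later listing that has no a tag at all. Here is a minimal sketch of that situation. The HTML, the result-item class and the variable names are made up for illustration and are not taken from the site in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second listing has no <a> tag at all,
# and the third has an <a> tag without an id attribute.
html = """
<div class="result-item"><a id="item-1" href="#">First</a></div>
<div class="result-item"><span>No link here</span></div>
<div class="result-item"><a href="#">Link without id</a></div>
"""

soup = BeautifulSoup(html, "html.parser")
listings = soup.find_all("div", class_="result-item")

uids = []
skipped = 0
for product in listings:
    link = product.find("a")  # None when the listing has no <a> tag
    # Tag.get returns None for a missing attribute instead of raising KeyError
    uid = link.get("id") if link is not None else None
    if uid is None:
        skipped += 1
        continue
    uids.append(uid)

print(uids)
print(skipped)

# The unguarded lookup works for the first listing and then fails on the second
try:
    [p.find("a")["id"] for p in listings]
except TypeError as exc:
    print(type(exc).__name__, exc)
```

The guard and the lookup both use the same link variable, so find is only called once per listing. Whether skipping such listings is the right behaviour depends on what the page actually contains; you may prefer to log them instead.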
