Question

I am getting this error:

Traceback (most recent call last):
  File "scr.py", line 18, in <module>
    for title in link.find('a'):
TypeError: 'NoneType' object is not iterable

These are the lines involved:

for each in soup.find_all(attrs={'class' : 'table table-bordered table-custom'}):
    for link in each.find_all('td'):
        for title in link.find('a'):
            print "\033[1;37m%s" % title.text

My code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from bs4 import BeautifulSoup
import requests
import sys

if len(sys.argv) < 2:
    print "Falta um paramêtro."

else:
    data = requests.get("http://theanonybay.org/search?q=%s" % sys.argv[1].replace(" ", "%20")).text

    soup = BeautifulSoup(data)

    for each in soup.find_all(attrs={'class' : 'table table-bordered table-custom'}):
        for link in each.find_all('td'):
            for title in link.find('a'):
                print "\033[1;37m%s" % title.text

            for little in link.find_all(attrs={'class' : 'btn btn-flat btn-xs btn-warning'}):
                print "Magnet Link:\033[0;37m", little.get('href'),"\n\n"

Answer 1

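You are using link.find(), which always returns either a single element or None if no matching element is found.

This means the current td cell contains no a link. You probably should not iterate over the link object anyway, even when one is found, because that loops over the contents of the element.

You have to test explicitly whether anything was found first: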

a_element = link.find('a')
if a_element:
    # not None, so we can proceed ...
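
To illustrate the check above, here is a minimal sketch in Python 3 syntax (the code in this thread is Python 2). The sample markup, the link_titles helper, and the choice of the html.parser backend are assumptions made for illustration; they are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one cell with a link, one without.
SAMPLE_HTML = """
<table class="table table-bordered table-custom">
  <tr><td><a href="/item/1">First title</a></td><td>no link here</td></tr>
</table>
"""


def link_titles(html):
    """Return the text of the first <a> in every <td> that has one."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for cell in soup.find_all("td"):
        a_element = cell.find("a")  # a Tag, or None when the cell has no link
        if a_element is not None:
            titles.append(a_element.get_text(strip=True))
    return titles


titles = link_titles(SAMPLE_HTML)
print(titles)
```

The second cell has no link, so find() returns None for it and the guard skips it instead of raising the TypeError.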

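If you want to find all the link text in a given table, it is usually easier to use CSS selectors; start with each row, then drill down from there to get the links: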

for row in soup.select('.table-custom tr'):
    link = row.find('a', text=True)
    if link:
        print "\033[1;37m%s" % link.get_text(strip=True)
    for magnet in row.select('a[href^="magnet:"]'):
        print "Magnet Link:\033[0;37m", magnet['href']
    print
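
The row-by-row approach can be tried offline with a sketch like the following, written in Python 3 syntax. The sample markup and the extract_rows helper are hypothetical stand-ins for the real results page, and the sketch simply takes the first a element in each row rather than filtering on its text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the search results page.
SAMPLE_HTML = """
<table class="table table-bordered table-custom">
  <tr><th>Name</th><th>Download</th></tr>
  <tr>
    <td><a href="/item/1"> Some title </a></td>
    <td><a class="btn btn-warning" href="magnet:?xt=urn:btih:abc">M</a></td>
  </tr>
</table>
"""


def extract_rows(html):
    """Return (title, [magnet hrefs]) for every table row that has a link."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.select(".table-custom tr"):
        link = row.find("a")
        if link is None:
            continue  # e.g. the header row, which contains no link
        # Quote the attribute value so the colon is valid selector syntax.
        magnets = [a["href"] for a in row.select('a[href^="magnet:"]')]
        results.append((link.get_text(strip=True), magnets))
    return results


rows = extract_rows(SAMPLE_HTML)
for title, magnets in rows:
    print(title, magnets)
```

Working per row keeps each title paired with the magnet links from the same row, which the cell-by-cell loop in the question does not guarantee.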

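Note that rather than escaping the search query manually, you can leave the escaping to requests by passing the query in the params argument. You should also use the response.content attribute and leave the decoding to BeautifulSoup; servers often omit the character set from the Content-Type header, in which case the HTTP standard mandates Latin-1 for text responses, which is usually wrong: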

params = {'q': sys.argv[1]}
response = requests.get("http://theanonybay.org/search", params=params)
soup = BeautifulSoup(response.content)
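
Both points can be checked without touching the network. The sketch below (Python 3 syntax) prepares a request without sending it to see how requests escapes the query, then parses raw bytes to show BeautifulSoup doing the decoding. The build_search_url helper, the sample query, and the sample bytes are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit

import requests
from bs4 import BeautifulSoup


def build_search_url(query):
    """Build the search URL via requests without sending the request."""
    request = requests.Request(
        "GET", "http://theanonybay.org/search", params={"q": query}
    )
    return request.prepare().url


# Spaces and the ampersand are escaped by requests, not by hand.
url = build_search_url("ubuntu 14.04 & more")
decoded = parse_qs(urlsplit(url).query)
print(url)

# Hypothetical response body: UTF-8 bytes, as response.content would hold them.
raw = '<meta charset="utf-8"><p>parâmetro</p>'.encode("utf-8")

# Handing the bytes to BeautifulSoup lets it detect the encoding itself.
soup = BeautifulSoup(raw, "html.parser")
text_from_bytes = soup.p.get_text()

# Decoding the same bytes as Latin-1 garbles the non-ASCII character.
wrong = raw.decode("latin-1")
print(text_from_bytes)
```

Decoding the query string again gives back the original search text, and the text parsed from the bytes keeps its accented character, while the Latin-1 decoding does not.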

TypeError: 'NoneType' object is not iterable - I cannot find a solution
