Adding a method to Python's NoneType

Time: 2014-03-10 22:52:01

Tags: python beautifulsoup

I'm using BeautifulSoup to do some scraping, and I would like to chain find calls, for example:

soup.find('div', class_="class1").find('div', class_="class2").find('div', class_="class3")

Of course, this breaks whenever one of the divs cannot be found, raising:

AttributeError: 'NoneType' object has no attribute 'find'

Is there a way to modify NoneType to add a find method, such as

class NoneType:
    def find(*args):
        return None

so that I could do something like

thing = soup.find('div', class_="class1").find('div', class_="class2").find('div', class_="class3")
if thing:
    ...  # do more stuff

instead of

thing1 = soup.find('div', class_="class1")
if thing1:
    thing2 = thing1.find('div', class_="class2")
    if thing2:
        thing3 = thing2.find('div', class_="class3")
        ...  # etc.

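I suppose I could do something similar with a parser that supports XPath, but the question is not specific to this use case. It is more about modifying or overriding built-in classes.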

6 answers:

Answer 0: (score: 2)

Why not use a try/except statement (since you can't modify NoneType)?

try:
    thing = soup.find('div', class_="class1").find('div', class_="class2").find('div', class_="class3")
    ...  # do more stuff
except AttributeError:
    thing = None  # if you need to do more with thing
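
One caveat with this pattern: everything inside the try block is covered, so an AttributeError raised by the "do more stuff" part would be silenced as well. A minimal sketch that keeps the try block around the lookup only. `find_chain` and `FakeTag` are hypothetical names introduced for illustration, and `FakeTag` merely imitates a tag whose find() returns a child or None so that the example runs without BeautifulSoup:

```python
class FakeTag:
    """Hypothetical stand-in for a parsed tag; find() returns a child or None."""

    def __init__(self, children=None):
        self.children = children or {}

    def find(self, name, class_=None):
        return self.children.get((name, class_))


def find_chain(root, *classes):
    """Follow nested div lookups; return None if any step finds nothing."""
    node = root
    try:
        for cls in classes:
            node = node.find('div', class_=cls)
    except AttributeError:
        # A previous find() returned None, so the chain is broken.
        return None
    return node


leaf = FakeTag()
soup = FakeTag({('div', 'class1'): FakeTag({('div', 'class2'): FakeTag({('div', 'class3'): leaf})})})

thing = find_chain(soup, 'class1', 'class2', 'class3')
missing = find_chain(soup, 'class1', 'nope', 'class3')
```

Here `thing` is the innermost tag and `missing` is None, and any later code runs outside the try block.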

Answer 1: (score: 1)

You cannot modify built-in classes such as NoneType or str:

>>> nt = type(None)
>>> nt.bla = 23
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'NoneType'

For some of them (e.g. str), you can subclass:

>>> class bla(str):
...      def toto(self): return 1
... 
>>> bla('2123').toto()
1

That is not possible with NoneType, and it wouldn't help you anyway:

>>> class myNoneType(nt):
...      def find(self): return 1
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
    type 'NoneType' is not an acceptable base type

Answer 2: (score: 1)

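You can't modify the class, and the real question is why you would try. NoneType means there is no data there, so even if .find() did exist on that type, you would only get null or no value back. I would recommend something like this: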

try:
    var = soup.find('div', class_="class1").find('div', class_="class2").find('div', class_="class3")
except AttributeError:
    ...  # do something else instead, or report that there was no div

Answer 3: (score: 0)

You cannot subclass None:

>>> class Noneish(type(None)):
...   pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'NoneType' is not an acceptable base type
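
Since NoneType can be neither patched nor subclassed, one possible workaround is a separate null object that accepts find() and is falsy. This is only a sketch of that idea, not something BeautifulSoup provides. `Nothing`, `NOTHING`, `Safe`, and `FakeTag` are all hypothetical names, and `FakeTag` just imitates a tag whose find() returns a child or None:

```python
class Nothing:
    """Falsy placeholder whose find() keeps returning itself."""

    def find(self, *args, **kwargs):
        return self

    def __bool__(self):
        return False


NOTHING = Nothing()


class Safe:
    """Wrap an object so find() yields a Safe wrapper or NOTHING, never None."""

    def __init__(self, obj):
        self.obj = obj

    def find(self, *args, **kwargs):
        result = self.obj.find(*args, **kwargs)
        return NOTHING if result is None else Safe(result)


class FakeTag:
    """Hypothetical stand-in for a parsed tag."""

    def __init__(self, children=None):
        self.children = children or {}

    def find(self, name, class_=None):
        return self.children.get((name, class_))


leaf = FakeTag()
soup = FakeTag({('div', 'class1'): FakeTag({('div', 'class2'): leaf})})

found = Safe(soup).find('div', class_='class1').find('div', class_='class2')
missing = Safe(soup).find('div', class_='nope').find('div', class_='class2')

if found:
    print('found the inner div:', found.obj is leaf)
if not missing:
    print('chain broke, but no AttributeError was raised')
```

The trade-off is that the result is a wrapper, so the real tag has to be unwrapped through `.obj` before use.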

Answer 4: (score: 0)

One approach might be

class FindCaller(object):
    def __init__(self, *a, **k):
        self.a = a
        self.k = k
    def __call__(self, obj):
        return obj.find(*self.a, **self.k)

def callchain(root, *fcs):
    for fc in fcs:
        root = fc(root)
        if root is None: return
    return root

and then do

thing = callchain(soup,
    FindCaller('div', class_="class1"),
    FindCaller('div', class_="class2"),
    FindCaller('div', class_="class3"),
)
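
FindCaller behaves much like the standard library's operator.methodcaller, so the same chain can be sketched without a custom class. `FakeTag` is a hypothetical stand-in used only so the example runs without BeautifulSoup:

```python
from operator import methodcaller


def callchain(root, *calls):
    """Apply each call to the previous result; stop as soon as one returns None."""
    for call in calls:
        root = call(root)
        if root is None:
            return None
    return root


class FakeTag:
    """Hypothetical stand-in for a parsed tag; find() returns a child or None."""

    def __init__(self, children=None):
        self.children = children or {}

    def find(self, name, class_=None):
        return self.children.get((name, class_))


leaf = FakeTag()
soup = FakeTag({('div', 'class1'): FakeTag({('div', 'class2'): leaf})})

# methodcaller('find', ...) builds a callable that runs obj.find(...) on its argument.
thing = callchain(
    soup,
    methodcaller('find', 'div', class_='class1'),
    methodcaller('find', 'div', class_='class2'),
)
missing = callchain(
    soup,
    methodcaller('find', 'div', class_='nope'),
    methodcaller('find', 'div', class_='class2'),
)
```

Because callchain returns as soon as a step yields None, the later calls in `missing` are never attempted.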

Answer 5: (score: 0)

You can't, and for good reason. In fact, NoneType is even harder to get at than other built-in types:

type(None).foo = lambda x: x
# ---------------------------------------------------------------------------
# TypeError                                 Traceback (most recent call last)
# <ipython-input-12-61bbde54e51b> in <module>()
# ----> 1 type(None).foo = lambda x: x

# TypeError: can't set attributes of built-in/extension type 'NoneType'

NoneType.foo = lambda x: x
# ---------------------------------------------------------------------------
# NameError                                 Traceback (most recent call last)
# <ipython-input-13-22af1ed98023> in <module>()
# ----> 1 NoneType.foo = lambda x: x

# NameError: name 'NoneType' is not defined

int.foo = lambda x: x
# ---------------------------------------------------------------------------
# TypeError                                 Traceback (most recent call last)
# <ipython-input-14-c46c4e33b8cc> in <module>()
# ----> 1 int.foo = lambda x: x

# TypeError: can't set attributes of built-in/extension type 'int'

As suggested above, use a try: ... except AttributeError: clause.