Question

I'm new to Python, and I have a problem with regular expressions and CSV writing.

Here is the code:

import re
import csv

match_text = re.compile(r'[^(<a href="/employer/\d+?\>)].+[^(\s</a>)]', re.UNICODE)
match_hh_company_url = re.compile(r'/employer/\d+', re.UNICODE)

def listing_employers():
    ...
    ## making a list with some data
    ...
    source_list = match_page.findall(data)
    source_list = set(source_list)
    return source_list

def w_to_file(f_name, source_list):
    with open(f_name, 'wb') as csvfile:
        bankwriter = csv.writer(csvfile, dialect='excel')
        for each in source_list:
            bankwriter.writerow(match_text.findall(each) + match_hh_company_url(each))

w_to_file('bank_base', listing_employers())

Everything works fine except the concatenation match_text.findall(each) + match_hh_company_url(each), which raises an error:

Traceback (most recent call last):
  File "salebase.py", line 41, in <module>
    w_to_file('bank_base', listing_employers())
  File "salebase.py", line 33, in w_to_file
    bankwriter.writerow(match_text.findall(each) + match_hh_company_url(each))
TypeError: '_sre.SRE_Pattern' object is not callable

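It works if I use only one of the match_ patterns, but not when I join the two with +, which is strange. I tried to reproduce the situation in the shell, and there it works: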

>>> import re
>>> m = re.compile('\d{3}')
>>> n = re.compile('[a-z]')
>>> a = ['111 a', 'a333', 'f444b']
>>> for x in a:
...     print m.findall(x)
... 
['111']
['333']
['444']
>>> for x in a:
...     print m.findall(x) + n.findall(x)
... 
['111', 'a']
['333', 'a']
['444', 'f', 'b']
>>>

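In case it matters, I build source_list by matching an expression against a block of text and collecting the results in a list. That is what the listing_employers() function does. Full code: http://pastebin.com/bimhfAtn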

So can anyone help me figure out what the problem is?

Answer 1

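A compiled pattern object is not callable, so you have to call findall on it explicitly, as you already do for match_text: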

match_hh_company_url.findall(each)

Demo:

>>> import re
>>> r = re.compile(r'')
>>> r()
Traceback (most recent call last):
    r()
TypeError: '_sre.SRE_Pattern' object is not callable

>>> r.findall('')
['']
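
Applied to the writer loop, the fix could look like the minimal sketch below. It is written for Python 3 and is not the asker's actual program: the match_text pattern is a simplified stand-in, the write_rows name and the sample string are hypothetical, and an in-memory buffer replaces the real file so the example is self-contained. The shell experiment in the question worked because .findall() was called on both patterns there, so + joined two lists.

```python
import csv
import io
import re

# Simplified stand-in for the question's match_text pattern (an assumption).
match_text = re.compile(r'>\s*(.+?)\s*</a>')
match_hh_company_url = re.compile(r'/employer/\d+')


def write_rows(csvfile, source_list):
    """Write one CSV row per item: the link text followed by the employer URL."""
    writer = csv.writer(csvfile, dialect='excel')
    for each in source_list:
        # .findall() is called on BOTH patterns, so + concatenates two lists.
        row = match_text.findall(each) + match_hh_company_url.findall(each)
        writer.writerow(row)


# Hypothetical sample data, not taken from the original page.
sample = ['<a href="/employer/123">Acme Bank </a>']
buffer = io.StringIO()
write_rows(buffer, sample)
print(buffer.getvalue())
```

When writing to a real file in Python 3, open it in text mode with open(f_name, 'w', newline=''); the 'wb' mode used in the question is the Python 2 convention for the csv module.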
