Question

I'm trying to write some code that scans for every string matching the regular expression "PP+" and tells me how many times it occurs. Here is my code:

with open ('testfile.txt') as f:
    data = f.read()
    data = data.split()

import re


the_sum = 0

prolist = []

for word in data:
    pronoun = re.compile(r'PP+')
    result = pronoun.match(data)
    if word == result:
        the_sum += 1

print the_sum

I get this error message:

Traceback (most recent call last):
  File "C:/Python27/RE_counter.py", line 14, in <module>
    result = pronoun.match(data)
TypeError: expected string or buffer

Can anyone tell me what I'm doing wrong?

Answer 1

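You are passing the entire list to match() on every iteration (hence the TypeError), and you are also not checking the match result correctly, because match() won't return the word; it returns a match object, or None when there is no match: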

for word in data:
    pronoun = re.compile(r'PP+')
    result = pronoun.match(word)  # ← you had pronoun.match(data)
    if result is not None:        # ← you had if word == result
        the_sum += 1
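
To make the None check concrete, here is a minimal sketch of the same loop with the pattern compiled once, outside the loop. The word list and the helper name count_matches are hypothetical stand-ins for the contents of testfile.txt, not part of the original code.

```python
import re

# Hypothetical sample tokens standing in for the words read from testfile.txt.
words = ["PP", "PPP", "NN", "PPX", "P", "xPP"]

# Compile once, outside the loop; the pattern does not change per word.
pronoun = re.compile(r'PP+')

def count_matches(tokens):
    """Count tokens for which pronoun.match() succeeds at the start of the token."""
    # match() returns a match object on success and None on failure,
    # so the test is "is not None", never a comparison with the word itself.
    return sum(1 for token in tokens if pronoun.match(token) is not None)

the_sum = count_matches(words)
print(the_sum)
```

With this sample list the count is 3: "PP", "PPP" and "PPX" match, because match() only anchors at the start of the string. If only whole tokens should count, a pattern such as r'PP+$' would be one option.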

Answer 2

You can get the count directly with re.findall, without splitting the text or looping:

import re

with open('testfile.txt') as f:
    data = f.read()
    print len(re.findall(r"\bPP\+\b", data))
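
Note that this pattern escapes the plus sign, so it looks for the literal text "PP+" rather than "P followed by one or more P". The sketch below compares the variants on a hypothetical sample string (not the asker's file), in case the quantifier form from the question is what is actually wanted.

```python
import re

# Hypothetical sample text standing in for the contents of testfile.txt.
text = "PP went PPP home PP+ and PP+x"

# r"\bPP+\b": the + is a quantifier, so this finds whole-word runs of two or more P.
quantifier_hits = re.findall(r"\bPP+\b", text)

# r"PP\+": the + is escaped, so this finds the literal three characters "PP+".
literal_hits = re.findall(r"PP\+", text)

# The pattern from the answer: literal "PP+" with a word boundary on each side.
# The trailing \b only succeeds when a word character follows the "+".
answer_hits = re.findall(r"\bPP\+\b", text)

print(len(quantifier_hits))
print(len(literal_hits))
print(len(answer_hits))
```

On this sample the three counts are 4, 2 and 1. The last is lower because the trailing \b sits between "+" and the next character, and a boundary exists there only if that next character is a word character (as in "PP+x"). Which variant is right depends on whether the data contains the literal tag "PP+" or runs of "P".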

TypeError in Python when doing text analysis with regular expressions
