我一直收到错误:
Traceback (most recent call last):
File "ba.py", line 13, in <module>
matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 137, in match
return _compile(pattern, flags).match(string)
TypeError: expected string or buffer
line
应使用line.strip
阅读文件中的每一行
re.match
使用正则表达式查找字符串中3个组(players
,hits
,atBats
)的匹配项
matchObj.group()
应该阅读每个论坛并将统计信息放在 playerStats{}
词典
如何让re.match将类型属性设置为matchObj,以便我可以使用group()并添加到playerStats()?
import re, sys, os
if len(sys.argv) < 2:
sys.exit("Usage: %s filename" % sys.argv[0])
filename = sys.argv[1]
if not os.path.exists(filename):
sys.exit("Error: File '%s' not found" % sys.argv[1])
playerStats = {'players': (0, 0, 0)}
matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
with open(filename) as f:
for line in f:
line = line.strip()
if player in playerStats:
playerStats[players][0] += atBat
playerStats[players][1] += hit
if player not in players:
player = matchObj.group(1)
playerStats[players][0] = atBat
playerStats[players][1] = hit
avgs = 0
else:
playerStats[players] = player
playerStats[players][0] = atBat
playerStats[players][1] = hit
playerStats[players][2] = 0
try:
player = matchObj.group(1)
atBat = matchObj.group(2)
hit = matchObj.group(3)
except AttributeError as ae:
print str(ae), "\skipping line:", line
except IndexError as ie:
print str(ie), "\skipping line:", line
#calculates averages
for players in playerStats:
avgs[player] = round(float(hits[player])/float(atBats[player]), 3)
print "%s: %.3f" % (player, avgs[player])
答案 0 :(得分:1)
您正在阅读整个文件。您得到该错误,因为line是一个列表而不是字符串或缓冲区。如果循环遍历每一行,则将条带放入for循环中。以下示例可帮助您入门。
with open(filename) as f:
for line in f:
line = line.strip()
matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
#Rest of your code here. Also Use try except to catch AttributeError and IndexError
try:
player = matchObj.group(1)
atBat = matchObj.group(2)
hit = matchObj.group(3)
#Other stuff
except AttributeError as ae:
print str(ae), "\skipping line:", line
except IndexError as ie:
print str(ie), "\skipping line:", line
除非您在文本文件中显示示例数据,否则我无法说出您的正则表达式是否准确。
<强>更新强>: 这是一个基于您的评论和更新的工作版本。您可以根据需要随意修改:
#hard code file name for simplicity
filename = "in.txt"
#Sample item: 'players': (0, 0, 0)
playerStats = {}
with open(filename) as f:
for line in f:
line = line.strip()
#Match object should be here, after you read the line
matchObj = re.match(r"^(\w+ \w+) batted (\d+) times with (\d+) hits and (\d+) runs", line)
try:
player = matchObj.group(1)
atBat = matchObj.group(2)
hit = matchObj.group(3)
runs = matchObj.group(4)
#Bad indent - Fixed
#You should put data against player, there is no players variable.
#initialize all stats to 0
if player not in playerStats:
playerStats[player] = [0, 0, 0]
playerStats[player][0] += int(atBat)
playerStats[player][1] += int(hit)
playerStats[player][2] += int(runs)
except AttributeError as ae:
print str(ae), "skipping line:", line
except IndexError as ie:
print str(ie), "skipping line:", line
#calculates average hits
avgs = {}
for player in playerStats:
hitsOfplayer = playerStats[player][1]
atBatsOfPlayer = playerStats[player][0]
avgs[player] = round(float(hitsOfplayer)/float(atBatsOfPlayer), 3)
print "%s: %.3f" % (player, avgs[player])
in.txt的内容:
Mr X batted 10 times with 6 hits and 50 runs
Mr Y batted 12 times with 1 hits and 150 runs
Mr Z batted 10 times with 4 hits and 250 runs
Mr X batted 3 times with 0 hits and 0 runs
junk data
junk data 2
输出:
'NoneType' object has no attribute 'group' skipping line: junk data
'NoneType' object has no attribute 'group' skipping line: junk data 2
Mr Y: 0.083
Mr X: 0.462
Mr Z: 0.400