My file contains lines like this:
N1109 X62.729 Y23.764 Z231.442 A59.756 B9.231
I want to separate the letters from the numbers in this file. The output should look like this:
N 1109 X 62.729 Y 23.764 Z 231.442 A 59.756 B 9.231
It is a plain text file, and I don't know how to do this when reading from a text file.
The code I wrote for this is:
import re
from sys import argv
script, filename = argv
f = open(filename,"r")
lines = f.readlines()
print lines
r = re.compile("([a-zA-Z]+)([0-9]+)")
a = [r.match(string).group() for string in lines]
print a
When I call group(), I get this error:
`AttributeError: 'NoneType' object has no attribute 'group'`
When I remove group(), the output is:
[<_sre.SRE_Match object at 0xb72f1b18>, None, None, None, None, None, None, None, None, None, None]
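Please help, I'm new to Python.
Answer 0 (score: 0)
The problem is that match only looks for a match at the beginning of the string and then stops:
"If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance."
When nothing matches at the start of a line, match returns None, which is why calling group() raises the AttributeError. You need to use findall instead: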
>>> i
'N1109 X62.729 Y23.764 Z231.442 A59.756 B9.231'
>>> re.findall(r'(\w{1})(\d+\.?\d+)', i)
[('N', '1109'), ('X', '62.729'), ('Y', '23.764'), ('Z', '231.442'), ('A', '59.756'), ('B', '9.231')]
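To make the failure mode concrete, here is a minimal sketch showing that re.match returns None when the pattern does not match at the very start of a string. The sample strings and the helper name first_pair are made up for illustration; the asker's actual file contents beyond the one line shown are not known.

```python
import re

# The asker's original pattern: letters followed by digits.
pattern = re.compile(r"([a-zA-Z]+)([0-9]+)")

def first_pair(text):
    """Return (letters, digits) matched at the start of text, or None."""
    m = pattern.match(text)
    # match() returns None when nothing matches at position 0,
    # so test the result before calling .groups() on it.
    return m.groups() if m else None

# Hypothetical inputs: a normal line, a line with leading whitespace, a blank line.
samples = ["N1109 X62.729", " N1110 X62.7", "\n"]
results = [first_pair(s) for s in samples]
print(results)
```

Only the first sample yields a tuple; the other two yield None, and calling .group() on either of them directly would raise the same AttributeError as in the question.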
Also, consider using the with statement, which takes care of closing the file for you:
import re
import sys
exp = r'(\w{1})(\d+\.?\d+)'
with open(sys.argv[1]) as f:
    for line in f:
        for letter, number in re.findall(exp, line):
            print('{} {}'.format(letter, number))
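The loop above prints each letter/number pair on its own line. If the goal is the single-line format from the question, one possible sketch is to join the pairs back together. The function name split_line and the slightly stricter pattern (letters only, with an optional fractional part) are assumptions for illustration, not part of the original answer.

```python
import re

# Assumed pattern: one ASCII letter, then an integer or a decimal number.
TOKEN = re.compile(r"([A-Za-z])(\d+(?:\.\d+)?)")

def split_line(line):
    """Return the line with a space between each letter and its number."""
    pairs = TOKEN.findall(line)  # list of (letter, number) tuples
    return " ".join("{} {}".format(letter, number) for letter, number in pairs)

line = "N1109 X62.729 Y23.764 Z231.442 A59.756 B9.231"
formatted = split_line(line)
print(formatted)  # N 1109 X 62.729 Y 23.764 Z 231.442 A 59.756 B 9.231
```

In a script, split_line could be called on each line inside the with block in place of the inner loop.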
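In addition, your original expression "([a-zA-Z]+)([0-9]+)" does not account for the optional . part of the number. Your expression means "one or more letters, in either case, followed by one or more digits", whereas the expression you need is "one or more letters, followed by one or more digits, an optional ., and then one or more digits".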
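The difference between the two expressions can be checked on a single token. The sketch below is illustrative only; the token "B9" is a made-up edge case, and the third pattern, which makes the whole fractional part optional, is a suggested variation rather than something taken from the answer.

```python
import re

token = "X62.729"

# Original pattern: stops at the decimal point.
original = re.match(r"([a-zA-Z]+)([0-9]+)", token)
# Pattern described above: digits, optional '.', then more digits.
with_decimal = re.match(r"([a-zA-Z]+)([0-9]+\.?[0-9]+)", token)

print(original.groups())      # ('X', '62')
print(with_decimal.groups())  # ('X', '62.729')

# Edge case (hypothetical input): a single-digit integer needs at least
# two digits under the second pattern, so it does not match at all.
needs_two_digits = re.match(r"([a-zA-Z]+)([0-9]+\.?[0-9]+)", "B9")
# Variation: make the fractional part optional as a unit.
optional_fraction = re.match(r"([a-zA-Z]+)([0-9]+(?:\.[0-9]+)?)", "B9")

print(needs_two_digits)            # None
print(optional_fraction.groups())  # ('B', '9')
```

For the data shown in the question every number has at least two digits, so both decimal-aware patterns give the same result there.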
Answer 1 (score: 0)
You can use the re module for this.
Try the following; it may help.
>>> import re
>>> match = re.match(r"([a-z]+)([0-9]+)", 'N1109', re.I)
>>> if match:
...     print(match.groups())
...
Output:
('N', '1109')
Update:
>>> a = ['N1109', 'X62.729', 'Y23.764', 'Z231.442', 'A59.756', 'B9.231']
>>> answer = []
>>> for i in a:
...     match = re.match(r"([a-z]+)([0-9]*\.?[0-9]+)", i, re.I)
...     if match:
...         answer.append(match.groups())
...
>>> answer
[('N', '1109'), ('X', '62.729'), ('Y', '23.764'), ('Z', '231.442'), ('A', '59.756'), ('B', '9.231')]
>>>
>>> with open(r'd:\test1.txt') as f:
...     content = f.readlines()
...
>>> content = ' '.join(content)
>>> content = content.split()
>>> answer = []
>>> for i in content:
...     match = re.match(r"([a-z]+)([0-9]*\.?[0-9]+)", i, re.I)
...     if match:
...         answer.append(match.groups())
...
>>> answer
[('N', '1100'), ('X', '63.658'), ('Y', '21.066'), ('Z', '230.989'), ('A', '60.28'), ('B', '9.5'), ('N', '1101'), ('X', '63.424'), ('Y', '21.419'), ('Z', '231.06'), ('A', '60.269'), ('B', '9.459'), ('N', '1102'), ('X', '63.219'), ('Y', '21.805'), ('Z', '231.132'), ('A', '60.231'), ('B', '9.418'), ('N', '1103'), ('X', '63.051'), ('Y', '22.206'), ('Z', '231.202'), ('A', '60.169'), ('B', '9.377'), ('N', '1104'), ('X', '62.915'), ('Y', '22.63'), ('Z', '231.272'), ('A', '60.083'), ('B', '9.335'), ('N', '1105'), ('X', '62.863'), ('Y', '22.851'), ('Z', '231.307'), ('A', '60.027'), ('B', '9.314'), ('N', '1106'), ('X', '62.811'), ('Y', '23.073'), ('Z', '231.341'), ('A', '59.971'), ('B', '9.293'), ('N', '1111'), ('X', '62.702'), ('Y', '24.227'), ('Z', '231.506'), ('A', '59.596'), ('B', '9.191'), ('N', '1112'), ('X', '62.71'), ('Y', '24.462'), ('Z', '231.536'), ('A', '59.503'), ('B', '9.172'), ('N', '1113'), ('X', '62.718'), ('Y', '24.697'), ('Z', '231.567'), ('A', '59.41'), ('B', '9.152'), ('N', '1114'), ('X', '62.727'), ('Y', '24.932'), ('Z', '231.597'), ('A', '59.316'), ('B', '9.133'), ('N', '1115'), ('X', '62.734'), ('Y', '25.167'), ('Z', '231.627'), ('A', '59.222'), ('B', '9.114'), ('N', '1123'), ('X', '62.793'), ('Y', '27.037'), ('Z', '231.864'), ('A', '58.46'), ('B', '8.961')]
>>>
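The session above flattens the whole file into one list, so the boundaries between lines are lost. If each input line should stay separate, a possible variation is sketched below. The function name parse_lines and the shortened sample lines are assumptions for illustration; the pattern is the one used in this answer.

```python
import re

# Same pattern as in the answer: letters, then an integer or decimal number.
TOKEN = re.compile(r"([a-z]+)([0-9]*\.?[0-9]+)", re.I)

def parse_lines(lines):
    """Return one list of (letters, number) tuples per non-empty line."""
    parsed = []
    for line in lines:
        # Split on whitespace, then match each word from its start.
        matches = (TOKEN.match(word) for word in line.split())
        pairs = [m.groups() for m in matches if m]
        if pairs:  # skip blank lines and lines with no tokens
            parsed.append(pairs)
    return parsed

# Shortened sample lines standing in for the file contents.
sample = ["N1100 X63.658 Y21.066\n", "\n", "N1101 X63.424 Y21.419\n"]
rows = parse_lines(sample)
for row in rows:
    print(row)
```

Because an open file object yields its lines when iterated, parse_lines(f) inside a with open(...) as f: block should work the same way as passing a list.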