Hey, I'm new to Python and I need some help. I wrote the following code:
try:
    it = iter(cmLines)
    line = it.next()
    while (line):
        if ("INFERNAL1/a" in line) or ("HMMER3/f" in line):
            title = line
            line = it.next()
            if word2(line) in namesList:  # if second word in line is in list
                output.write(title)
                output.write(line)
                line = it.next()
                while ("//" not in line):
                    output.write(line)
                    line = it.next()
                output.write(line)
        line = it.next()
except Exception as e:
    print "Loop exited because:"
    print type(e)
    print "at " + line
finally:
    output.close()
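When the loop ends, it always throws an exception reporting that the loop stopped, even though it did not terminate prematurely. How can I fix this?
Also, is there a better way to write my code? Something more stylish. I have a large file with tons of information and I only want to grab the information I need. Every piece of information has the format: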
Infernal1/a ...
Name someSpecificName
...
...
...
...
//
Thanks
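Answer 0 (score: 2)
RocketDonkey's answer is spot on. Because of the complexity of the way you iterate, there is no simple way to do this with a for loop, so you need to handle StopIteration explicitly.
However, if you rethink the problem a bit, there are other ways to solve it. For example, a trivial state machine: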
try:
    state = 0
    for line in cmLines:
        if state == 0:
            # Looking for a header line
            if "INFERNAL1/a" in line or "HMMER3/f" in line:
                title = line
                state = 1
        elif state == 1:
            # Line after the header: keep the block only if the name matches
            if word2(line) in namesList:
                output.write(title)
                output.write(line)
                state = 2
            else:
                state = 0
        elif state == 2:
            # Inside a wanted block: copy lines through the '//' terminator
            output.write(line)
            if '//' in line:
                state = 0
except Exception as e:
    print "Loop exited because:"
    print type(e)
    print "at " + line
finally:
    output.close()
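Alternatively, you could write a generator function that delegates to a sub-generator (via yield from foo() if you are on 3.3, or via for x in foo(): yield x if not), or various other possibilities, especially if you rethink your problem at a higher level.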
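One way the sub-generator idea could look, as a minimal sketch rather than a definitive implementation. It is written for Python 3 (unlike the Python 2 snippets in this thread). The names matching_blocks and rest_of_block are hypothetical, word2 is an assumed stand-in for the helper the question does not show, and the code assumes every block ends with a // line.

```python
def word2(line):
    """Assumed helper: second whitespace-separated word, or '' if absent."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else ""


def rest_of_block(it):
    """Sub-generator: yield lines up to and including the '//' terminator."""
    for line in it:
        yield line
        if "//" in line:
            return


def matching_blocks(lines, names, headers=("INFERNAL1/a", "HMMER3/f")):
    """Yield every line of each block whose Name line is in `names`."""
    it = iter(lines)
    for line in it:
        if not any(h in line for h in headers):
            continue
        title = line
        name_line = next(it, "")  # default avoids StopIteration at end of input
        if word2(name_line) in names:
            yield title
            yield name_line
            yield from rest_of_block(it)  # delegate the body of the block
```

With something like this, the writing side might reduce to output.writelines(matching_blocks(cmLines, namesList)), since the inner and outer loops share one iterator and neither calls next() without a default.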
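This may not be what you want to do, but it is usually worth at least asking "can I turn this while loop and two explicit next calls into a for loop?", even if the answer turns out to be "no, not without making things less readable."
As a side note, you can probably simplify things by replacing the try/finally with a with statement. Instead of: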
output = open('foo', 'w')
try:
    blah blah
finally:
    output.close()
you can do this:
with open('foo', 'w') as output:
    blah blah
Or, if output is not a plain file, you can still replace the last four lines with:
with contextlib.closing(output):
    blah blah
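To make the closing case concrete, here is a small Python 3 sketch. The Sink class is hypothetical: it stands in for any output object that has a close() method but does not support the with statement itself.

```python
from contextlib import closing


class Sink:
    """Hypothetical output object: has close() but is not a context manager."""

    def __init__(self):
        self.lines = []
        self.closed = False

    def write(self, text):
        self.lines.append(text)

    def close(self):
        self.closed = True


output = Sink()
with closing(output):
    output.write("Name someSpecificName\n")
# close() has been called by this point, even if the body had raised
```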
Answer 1 (score: 1)
When you call line = it.next() and there is nothing left, a StopIteration exception is raised:
>>> l = [1, 2, 3]
>>> i = iter(l)
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<ipython-input-6-e590fe0d22f8>", line 1, in <module>
    i.next()
StopIteration
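This happens every time in your code, because you call it at the end of the block, so the exception is raised before the loop has a chance to come back around and find that line is empty. As a band-aid fix, you could do something like this, where you catch the StopIteration exception and pass on it (since it indicates that the iteration is finished):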
# Your code...
except StopIteration:
    pass
except Exception as e:
    print "Loop exited because:"
    print type(e)
    print "at " + line
finally:
    output.close()
Answer 2 (score: 0)
I like parser combinators, because they lead to a more declarative style of programming.
For example, using the Parcon library:
from string import letters, digits
from parcon import (Word, Except, Exact, OneOrMore,
                    CharNotIn, Literal, End, concat)
alphanum = letters + digits
UntilNewline = Exact(OneOrMore(CharNotIn('\n')) + '\n')[concat]
Heading1 = Word(alphanum + '/')
Heading2 = Word(alphanum + '.')
Name = 'Name' + UntilNewline
Line = Except(UntilNewline, Literal('//'))
Lines = OneOrMore(Line)
Block = Heading1['hleft'] + Heading2['hright'] + Name['name'] + Lines['lines'] + '//'
Blocks = OneOrMore(Block[dict]) + End()
Then, using Alex Martelli's Bunch class:
class Bunch(object):
    def __init__(self, **kwds):
        self.__dict__.update(kwds)

names = 'John', 'Jane'
for block in Blocks.parse_string(config):
    b = Bunch(**block)
    # Compare against a tuple of two header names, case-insensitively
    if b.name in names and b.hleft.upper() in ("INFERNAL1/A", "HMMER3/F"):
        print ' '.join((b.hleft, b.hright))
        print 'Name', b.name
        print '\n'.join(b.lines)
Given this file:
Infernal1/a ...
Name John
...
...
...
...
//
SomeHeader/a ...
Name Jane
...
...
...
...
//
HMMER3/f ...
Name Jane
...
...
...
...
//
Infernal1/a ...
Name Billy Bob
...
...
...
...
//
The result is:
Infernal1/a ...
Name John
...
...
...
...
HMMER3/f ...
Name Jane
...
...
...
...
Answer 3 (score: 0)
1 / No exception handling
To avoid handling the StopIteration exception, you should look at the Pythonic way of processing sequences (as Abarnert mentioned):
it = iter(cmLines)
for line in it:
    # do
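2 / Grabbing the information
In addition, you could try to capture your block pattern with a regular expression. You know the exact expression for the first line. Then you want to capture the name and compare it against a list of allowed names. Finally, you look for the next //. You can build a regular expression that spans newlines and use a group to capture the name you want to check:
(...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(] [)].
Here is an example from the Python docs of a regular expression that uses groups:
>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")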
>>> m.group(0) # The entire match
'Isaac Newton'
>>> m.group(1) # The first parenthesized subgroup.
'Isaac'
>>> m.group(2) # The second parenthesized subgroup.
'Newton'
>>> m.group(1, 2) # Multiple arguments give us a tuple.
('Isaac', 'Newton')
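Applied to the block format from the question, the regex idea might look like the Python 3 sketch below. This is an illustration under assumptions, not a tested solution for the real file: it assumes the whole file fits in memory as one string, that header lines start with INFERNAL1/a or HMMER3/f, and that the Name line directly follows the header. BLOCK and matching_blocks are hypothetical names. Unlike word2, the group captures everything after "Name", not just the second word.

```python
import re

# One block: a header line, a Name line, then everything up to a '//' line.
BLOCK = re.compile(
    r"^(?:INFERNAL1/a|HMMER3/f)[^\n]*\n"  # header line
    r"Name[ \t]+(?P<name>[^\n]*)\n"        # Name line; the group captures the name
    r".*?^//[^\n]*\n?",                    # body, non-greedy, through the '//' line
    re.MULTILINE | re.DOTALL,
)


def matching_blocks(text, names):
    """Return the full text of each block whose captured name is in `names`."""
    return [
        m.group(0)
        for m in BLOCK.finditer(text)
        if m.group("name").strip() in names
    ]
```

re.MULTILINE lets ^ match at the start of every line, and re.DOTALL lets the non-greedy .*? run across newlines until the first line that begins with //.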
More information on regex.
Links
Why iterator next() raises an exception in Python: https://softwareengineering.stackexchange.com/questions/112463/why-do-iterators-in-python-raise-an-exception
Answer 4 (score: 0)
You can explicitly ignore StopIteration:
try:
    # parse file
    it = iter(cmLines)
    for line in it:
        # here `line = next(it)` might raise StopIteration
        pass
except StopIteration:
    pass
except Exception as e:
    # handle exception
    pass
Or call line = next(it, None) and check for None.
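As a rough illustration of the next(it, None) option, the question's loop could be reshaped as in the Python 3 sketch below. The function name copy_matching and its write parameter are hypothetical, and word2 is an assumed stand-in for the helper the question does not show.

```python
def word2(line):
    """Assumed helper: second whitespace-separated word, or '' if absent."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else ""


def copy_matching(lines, names, write):
    """Sentinel-based version of the question's loop; `write` gets each kept line."""
    it = iter(lines)
    line = next(it, None)  # None means "input exhausted"
    while line is not None:
        if "INFERNAL1/a" in line or "HMMER3/f" in line:
            title = line
            line = next(it, None)
            if line is not None and word2(line) in names:
                write(title)
                write(line)
                line = next(it, None)
                while line is not None and "//" not in line:
                    write(line)
                    line = next(it, None)
                if line is not None:
                    write(line)  # the '//' terminator
        line = next(it, None)
```

Every next call carries a default, so reaching the end of the input simply ends the loops instead of raising, including when the file is cut off in the middle of a block.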
To separate concerns, you could split the code into two parts:
from collections import deque
from itertools import chain, dropwhile, takewhile

def getrecords(lines):
    it = iter(lines)
    headers = "INFERNAL1/a", "HMMER3/f"
    while True:
        it = chain([next(it)], it)  # force StopIteration at the end
        it = dropwhile(lambda line: not line.startswith(headers), it)
        record = takewhile(lambda line: not line.startswith("//"), it)
        yield record
        consume(record)  # make sure each record is read to the end

def consume(iterable):
    deque(iterable, maxlen=0)

from contextlib import closing

with closing(output):
    for record in getrecords(cmLines):
        title, line = next(record, ""), next(record, "")
        if word2(line) in namesList:
            for line in chain([title, line], record):
                output.write(line)
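One caveat if this is ported to Python 3.7 or later: a StopIteration that escapes a generator body is turned into a RuntimeError (PEP 479), so the bare next(it) inside getrecords would need a guard. A possible adaptation, offered as a sketch rather than as the answer author's code:

```python
from collections import deque
from itertools import chain, dropwhile, takewhile


def getrecords(lines, headers=("INFERNAL1/a", "HMMER3/f")):
    """Yield one lazy iterator per record, without its '//' terminator."""
    it = iter(lines)
    while True:
        try:
            first = next(it)
        except StopIteration:
            return  # an explicit return ends the generator cleanly
        # Skip lines until a header, then take lines up to the '//' line
        rest = dropwhile(lambda line: not line.startswith(headers),
                         chain([first], it))
        record = takewhile(lambda line: not line.startswith("//"), rest)
        yield record
        deque(record, maxlen=0)  # drain whatever the caller left unread
```

As in the original, takewhile swallows the // line, so a caller that wants the terminator in the output has to write it separately.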