I'm stuck on a small piece of logic.
Here is my code:
def find_between( string, first, last ):
    list1 = []
    try:
        start = string.index( first ) + len( first )
        end = string.index( last, start )
        list1.append(string[start:end])
        print(list1)
    except ValueError:
        return ""

with open("sample.txt") as f:
    data = f.read()
    print(data)
    find_between( data, "*CHI: " , "%mor: " )
My sample.txt contains:
*CHI: I saw a giraffe and a elephant .
%mor: pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a
n|elephant .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|7|DET 7|5|COORD 8|2|PUNCT
*CHI: <that> [/] (.) that (i)s it . [+ bch]
%mor: pro:dem|that cop|be&3S pro:per|it .
%gra: 1|2|SUBJ 2|0|ROOT 3|2|PRED 4|2|PUNCT
*CHI: I saw an elephant go swimming .
%mor: pro:sub|I v|see&PAST det:art|a n|elephant v|go part|swim-PRESP .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|5|SUBJ 5|2|COMP 6|5|OBJ 7|2|PUNCT
*CHI: <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>
[//] drop ball in the pool .
%mor: pro:sub|I v|see&PAST det:art|the n|giraffe coord|and det:art|the
n|elephant n|drop n|ball prep|in det:art|the n|pool .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|9|DET 7|9|MOD 8|9|MOD
9|5|COORD 10|9|NJCT 11|12|DET 12|10|POBJ 13|2|PUNCT
*CHI: I saw giraffe swimming in the pool to get that ball .
%mor: pro:sub|I v|see&PAST n|giraffe part|swim-PRESP prep|in det:art|the
n|pool inf|to v|get pro:dem|that n|ball .
I need to return every sentence between "*CHI:" and "%mor:", but my code only returns the first one:
I saw a giraffe and a elephant
Please help me iterate to the end of the string so that I can print every sentence between "*CHI:" and "%mor:".
Answer 0 (score: 1)
I used a regular expression and a plain string instead of a file, but the principle is the same. Here is the working code:
import re
s = """
*CHI: I saw a giraffe and a elephant .
%mor: pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a
n|elephant .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|7|DET 7|5|COORD 8|2|PUNCT
*CHI: <that> [/] (.) that (i)s it . [+ bch]
%mor: pro:dem|that cop|be&3S pro:per|it .
%gra: 1|2|SUBJ 2|0|ROOT 3|2|PRED 4|2|PUNCT
*CHI: I saw an elephant go swimming .
%mor: pro:sub|I v|see&PAST det:art|a n|elephant v|go part|swim-PRESP .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|5|SUBJ 5|2|COMP 6|5|OBJ 7|2|PUNCT
*CHI: <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>
[//] drop ball in the pool .
%mor: pro:sub|I v|see&PAST det:art|the n|giraffe coord|and det:art|the
n|elephant n|drop n|ball prep|in det:art|the n|pool .
%gra: 1|2|SUBJ 2|0|ROOT 3|4|DET 4|2|OBJ 5|4|CONJ 6|9|DET 7|9|MOD 8|9|MOD
9|5|COORD 10|9|NJCT 11|12|DET 12|10|POBJ 13|2|PUNCT
*CHI: I saw giraffe swimming in the pool to get that ball .
%mor: pro:sub|I v|see&PAST n|giraffe part|swim-PRESP prep|in det:art|the
n|pool inf|to v|get pro:dem|that n|ball .
"""
result = re.findall('(?<=CHI:)(.*?)(?=%mor)', s, flags=re.S)
print(result)
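The matches returned by that pattern still carry the leading space and the line breaks of utterances that wrap onto a second line. As a minimal sketch of one way to tidy them up, the example below collapses whitespace in each match. The helper name `extract_utterances` and the shortened `sample` string are assumptions made for illustration, not part of the original answer.

```python
import re

# Hypothetical shortened sample in the same layout as sample.txt
sample = """*CHI: I saw a giraffe and a elephant .
%mor: pro:sub|I v|see&PAST det:art|a n|giraffe coord|and det:art|a
n|elephant .
*CHI: <I saw eleph> [//] I saw the <g> [/] giraffe and the elephant <s>
[//] drop ball in the pool .
%mor: pro:sub|I v|see&PAST det:art|the n|giraffe
"""

def extract_utterances(text):
    # Non-greedy match from each "*CHI:" up to the next "%mor:";
    # re.S lets "." match newlines so wrapped utterances are captured whole.
    raw = re.findall(r"\*CHI:(.*?)%mor:", text, flags=re.S)
    # Collapse every run of whitespace (including line breaks) to one space.
    return [" ".join(chunk.split()) for chunk in raw]

utterances = extract_utterances(sample)
for utterance in utterances:
    print(utterance)
```

Using a capturing group instead of lookarounds is only a stylistic choice here; `re.findall` returns the group contents in both cases.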
Answer 1 (score: 0)
I tried to do this without a regular expression, although I'm still not sure whether the %gra lines
are allowed to be printed in between.
def find_between(string, first, last):
    # `string` is the list of lines returned by readlines()
    flag = False  # True only while inside a *CHI utterance
    buffer = ""
    op_string = ""
    for line in string:
        if first in line:
            # Start of a new utterance
            buffer += line
            flag = True
        elif last in line:
            # Reached %mor: keep what was collected, stop collecting
            op_string += buffer
            buffer = ""  # flush buffer
            flag = False
        elif flag:
            # Continuation line of the current utterance
            buffer += line
    print(op_string)

with open("sample.txt") as f:
    data = f.readlines()
#print(data)
find_between(data, "*CHI: ", "%mor: ")
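The question's original function stops after the first match because `index` is called only once. A minimal sketch of the "iterate to the end of the string" idea, still without a regular expression, is to keep searching from where the previous match ended. The name `find_all_between` and the `demo` text are assumptions made for illustration. Because each %gra line sits after its %mor line and before the next *CHI line, it never falls inside a captured span.

```python
def find_all_between(text, first, last):
    """Return every substring of text that lies between first and last."""
    results = []
    pos = 0
    while True:
        start = text.find(first, pos)
        if start == -1:
            break  # no further opening marker
        start += len(first)
        end = text.find(last, start)
        if end == -1:
            break  # opening marker without a closing marker
        # Normalise whitespace so wrapped utterances become one line
        results.append(" ".join(text[start:end].split()))
        pos = end + len(last)  # continue searching after this match
    return results

# Hypothetical shortened input in the same layout as sample.txt
demo = (
    "*CHI: I saw a giraffe .\n"
    "%mor: pro:sub|I v|see&PAST\n"
    "%gra: 1|2|SUBJ\n"
    "*CHI: that is it .\n"
    "%mor: pro:dem|that\n"
)
print(find_all_between(demo, "*CHI: ", "%mor: "))
```

With the real file, the same function could be called on the result of `f.read()` rather than `f.readlines()`, since it works on a single string.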