I have a file that looks like this:
Breve, a writ; used more frequently in the plural brevia.
Brevia magistralia, official writs framed by the clerks in
chancery to meet new injuries, to which the old forms of action
were inapplicable. Sea Trespass on the case. Brevia testata,
short attested memoranda, originally introduced to obviate the
uncertainty arisina; from parol feoffments, hence modern con-
veyances have gradually arisen.
I want to extract the words that appear before the first comma (,) on each line.
Expected output:
Breve
Brevia magistralia
chancery to meet new injuries
were inapplicable. Sea Trespass on the case. Brevia testata
short attested memoranda
uncertainty arisina; from parol feoffments
My code:
with open('test.txt', 'r') as file:
    for line in file:
        print(line[0:line.find(',')])
Output:
Breve
Any help is appreciated.
Answer 0 (score: 1)
Why do you need a regex? str.split should be good enough.
with open('test.txt', 'r') as file:
    for line in file:
        text = line.split(',', 1)[0]  # maxsplit=1: stop splitting after the first comma
        ...  # do something with text
However, if you really need a regex, you could use something like:
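import re

with open('test.txt', 'r') as file:
    for line in file:
        m = re.match(r'[^,]+', line)  # match from the start of the line up to the first comma
        if m:
            text = m.group(0)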
[^,]+ matches, from the start of the line, any run of characters that are not commas.
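One thing to watch with both variants above: a line with no comma comes back whole (trailing newline included), whereas the expected output in the question skips such lines. A minimal sketch, assuming skipping is what you want; the helper name first_field and the sample list are made up for illustration:

```python
def first_field(line):
    """Return the text before the first comma, or None if the line has no comma."""
    head, sep, _tail = line.partition(',')
    # partition() returns an empty separator when ',' is not found
    return head if sep else None


# Two sample lines taken from the question's file (in memory, not read from test.txt)
sample = [
    "Breve, a writ; used more frequently in the plural brevia.\n",
    "veyances have gradually arisen.\n",
]
fields = [f for f in map(first_field, sample) if f is not None]
print(fields)
```

str.partition is used here instead of str.split only because its separator element makes the "no comma found" case easy to detect.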
Answer 1 (score: 1)
A short re.findall() solution:
import re

with open('test.txt', 'r') as f:
    result = re.findall(r'^[^,]+(?=,)', f.read(), re.M)  # extracting the needed words
    print('\n'.join(result))
Output:
Breve
Brevia magistralia
chancery to meet new injuries
were inapplicable. Sea Trespass on the case. Brevia testata
short attested memoranda
uncertainty arisina; from parol feoffments
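A caveat on this pattern: [^,] also matches a newline, so if a line without a comma is followed by a line with one, the match can run across the line break. That does not happen with the file in the question, because only its last line lacks a comma. A small check on an in-memory string (the text value is invented for illustration), with a variant that excludes \n from the character class:

```python
import re

# Hypothetical input: the middle line has no comma
text = "Breve, a writ\nno comma here\nshort attested memoranda, originally\n"

# Pattern from the answer: the second match spans two lines
spanning = re.findall(r'^[^,]+(?=,)', text, re.M)
print(spanning)

# Variant that also excludes newlines, so each match stays on one line
per_line = re.findall(r'^[^,\n]+(?=,)', text, re.M)
print(per_line)
```

With the second pattern the comma-less line is simply skipped, which is the behaviour the expected output asks for.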
Answer 2 (score: 1)
You only need to make this modification:
Breve
Brevia magistralia
chancery to meet new injuries
were inapplicable. Sea Trespass on the case. Brevia testata
short attested memoranda
uncertainty arisina; from parol feoffments
Output:
aa bb cc
dd ee ff
gg hh ii
ll mm nn
oo pp qq
Answer 3 (score: 1)
As an additional answer, you can use re.search:
import re

with open('test.txt', 'r') as file:
    for line in file:
        # print(line)
        result = re.search(r'^[^,]+(?=,)', line)
        if result:
            text = result.group(0)
            print(text)
Output:
Breve
Brevia magistralia
chancery to meet new injuries
were inapplicable. Sea Trespass on the case. Brevia testata
short attested memoranda
uncertainty arisina; from parol feoffments
Answer 4 (score: 0)
I tested your code and got the correct output for your question.
Output:
Breve
Brevia magistralia
chancery to meet new injuries
were inapplicable. Sea Trespass on the case. Brevia testata
short attested memoranda
uncertainty arisina; from parol feoffments
veyances have gradually arisen.
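So make sure your input file itself is correct.
Your test file probably has no line breaks, i.e. the whole text is written as a single line. In that case only the text before the first comma is printed, and nothing after it.
Note: the last line has no comma, so the whole line is printed (unlike the expected output).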
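The note about the last line follows from how str.find reports a miss: it returns -1, so line[0:-1] drops only the final character (normally the newline) instead of printing nothing. A sketch of both effects, using in-memory strings in place of test.txt; the before_comma helper and the one_line value are hypothetical:

```python
def before_comma(line):
    """Mimic the question's slice: line[0:line.find(',')]."""
    return line[0:line.find(',')]


# No comma: find() returns -1, so only the last character is dropped
last = before_comma("veyances have gradually arisen.\n")
print(repr(last))

# Whole text on a single line: iterating yields one line, hence one result
one_line = "Breve, a writ; Brevia magistralia, official writs"
single = [before_comma(l) for l in one_line.splitlines()]
print(single)
```

If the last line had no trailing newline, the same slice would cut off its final visible character instead, so checking the return value of find() before slicing is the safer route.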