I received this huge TXT file in which the number of columns varies from line to line.
jero, kor@gmail.com, 44d448e4d, team, 0, 6, 5, 2, s, s, s, none, none
jader, lda@gmail.com, d44a88x, team, 0, none, 48, 95, oled
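...and so on, for 15000 lines.
I want to cut off everything after the word "team" on every line. I tried a few regular expressions but couldn't get them to work.
Thanks!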
Answer 0 (score: 3)
There is no need for a regular expression; there is a straightforward solution.
with open('file.txt') as f:
    for line in f:
        # Keep everything up to the first 'team', then put 'team' back
        i = line.split('team')[0] + "team"
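One caveat with the split approach: if a line does not contain "team" at all, the expression still appends "team" to it. A minimal sketch of a guarded variant follows; the helper name `cut_after` is hypothetical and not part of the original answer.

```python
def cut_after(line, word="team"):
    """Return line truncated right after the first occurrence of word.

    Lines that do not contain word are returned unchanged.
    (cut_after is an illustrative helper, not from the original answer.)
    """
    if word not in line:
        return line
    # maxsplit=1 stops at the first occurrence instead of splitting everywhere
    return line.split(word, 1)[0] + word


with_team = cut_after("jader, lda@gmail.com, d44a88x, team, 0, none, 48, 95, oled")
without_team = cut_after("jader, lda@gmail.com, d44a88x")
```

Here `with_team` ends at "team", while `without_team` is left as it was.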
Answer 1 (score: 2)
Well, if you are parsing a CSV file, use the dedicated module:
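import csv

# newline='' is what the csv module expects for files it reads
with open('file.txt', newline='') as your_file:
    for row in csv.reader(your_file, skipinitialspace=True):
        if 'team' in row:
            # Keep the fields up to and including the first field equal to 'team'
            row = row[:row.index('team') + 1]
        print(', '.join(row))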
This avoids problems with lines where "team" also appears inside other fields, such as:
jero_team, kor@team.com, 44d448e4d, team, 0, one_more, team, 5, 2
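To make the difference concrete, here is a small sketch that runs both approaches on that tricky line. It uses `io.StringIO` so it needs no file on disk; the function name `truncate_after_field` is hypothetical and only for illustration.

```python
import csv
import io


def truncate_after_field(line, marker="team"):
    """Keep the fields up to and including the first field equal to marker.

    (truncate_after_field is an illustrative helper, not from the answer.)
    """
    reader = csv.reader(io.StringIO(line), skipinitialspace=True)
    row = next(reader, [])  # an empty line yields an empty row
    if marker in row:
        row = row[:row.index(marker) + 1]
    return ", ".join(row)


tricky = "jero_team, kor@team.com, 44d448e4d, team, 0, one_more, team, 5, 2"

# Field-level match: only a field that is exactly 'team' counts
by_field = truncate_after_field(tricky)

# Substring match: stops at the 'team' inside 'jero_team'
by_substring = tricky.split("team")[0] + "team"
```

The field-level version keeps the first four fields, whereas the substring version cuts the line inside its first field.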
Answer 2 (score: 1)
You don't need a regular expression. Use str.partition:
>>> line = 'jero, kor@gmail.com, 44d448e4d, team, 0, 6, 5, 2, s, s, s, none, none'
>>> a, sep, _ = line.partition('team')
>>> a
'jero, kor@gmail.com, 44d448e4d, '
>>> sep
'team'
>>> a + sep
'jero, kor@gmail.com, 44d448e4d, team'
with open('file.txt') as f:
    for line in f:
        a, sep, _ = line.partition('team')
        line = a + sep
        # Do something with line
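UPDATE
To address the problem mentioned by @DSM, where other fields also contain "team", partition on the whole field including its delimiters: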
with open('file.txt') as f:
    for line in f:
        a, sep, _ = line.partition(', team,')
        # sep is ', team,' when found; drop its trailing comma
        line = a + sep.rstrip(',')
        # Do something with line
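The loops above leave "Do something with line" open. One possible way to fill it in is to write each truncated line to an output stream. This is only a sketch: the function name `truncate_lines` is hypothetical, `io.StringIO` stands in for real files so the example runs on its own, and it assumes "team" is never the first or last field on a line.

```python
import io


def truncate_lines(lines, out, sep=", team,"):
    """Write each line to out, cut right after the 'team' field.

    (truncate_lines is an illustrative helper, not from the answer.)
    """
    for line in lines:
        head, found, _ = line.rstrip("\n").partition(sep)
        # found is '' when sep is absent, so such lines pass through unchanged;
        # otherwise strip the trailing comma that belongs to the delimiter
        out.write(head + found.rstrip(",") + "\n")


sample = io.StringIO(
    "jero, kor@gmail.com, 44d448e4d, team, 0, 6, 5, 2, s, s, s, none, none\n"
    "jader, lda@gmail.com, d44a88x, team, 0, none, 48, 95, oled\n"
)
result = io.StringIO()
truncate_lines(sample, result)
```

With real data, `sample` and `result` would be file objects from `open('file.txt')` and `open('out.txt', 'w')`; the name out.txt is an assumption.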