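I have about 40 million lines of text to parse. I'd like to treat each line as a split string and then request multiple slices (or subscripts, whatever they're called) using a list of numbers that I generate in a method.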
# ...
other_file = open('output.txt','w')
list = [1, 4, 5, 7, ...]
for line in open(input_file):
    other_file.write(line.split(',')[i for i in list])
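Subscripting doesn't work with the generator I've shown, but I'd like to ask for several of the entries in the split line without having to loop over the list for every line.
I apologize, I know this has a simple answer, but I can't think of it. It's too late at night!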
Answer 0 (score: 4)
The csv module can help you:
import csv
reader = csv.reader(open(input_file, 'r'))
writer = csv.writer(open(output_file, 'w'))
fields = (1,4,5,7,...)
for row in reader:
    writer.writerow([row[i] for i in fields])
To improve this further, use context managers to open the files.
Answer 1 (score: 3)
Don't use list as a variable name; remember that there is a built-in named list.
other_file = open('output.txt','w')
lst = [1,4,5,7,...]
for line in open(input_file):
    # Strip the trailing newline so it does not stay attached to the last field
    fields = line.rstrip("\n").split(',')
    other_file.write(",".join(fields[i] for i in lst) + "\n")
To improve this further, use context managers to open and close the files for you.
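A minimal sketch of what that could look like, assuming plain comma-separated lines with no quoted fields. The function name select_fields and its parameters are made up for illustration and are not part of the original code:

```python
def select_fields(input_path, output_path, indices, sep=","):
    """Copy only the chosen columns of each line to output_path.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Both files are closed automatically when the with-block exits,
    # even if an exception is raised part-way through.
    with open(input_path) as src, open(output_path, "w") as dst:
        for line in src:
            # Drop the newline first so it does not stick to the last field.
            fields = line.rstrip("\n").split(sep)
            dst.write(sep.join(fields[i] for i in indices) + "\n")
```

It would be called as select_fields(input_file, 'output.txt', lst). Because the input is iterated line by line, only one line is held in memory at a time, which matters with 40 million lines.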
Answer 2 (score: 1)
from operator import itemgetter
from csv import reader, writer
fields = 1,4,5,7
row_filter = itemgetter(*fields)
with open('inp.txt', 'r') as inp:
    with open('out.txt', 'w') as out:
        writer(out).writerows(map(row_filter, reader(inp)))
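For anyone unfamiliar with itemgetter, here is a small illustration of what row_filter does to a single row. The sample row is made up for the demonstration:

```python
from operator import itemgetter

fields = 1, 4, 5, 7
row_filter = itemgetter(*fields)

# Made-up sample row, standing in for one parsed CSV line
row = ["a", "b", "c", "d", "e", "f", "g", "h"]

selected = row_filter(row)
print(selected)  # ('b', 'e', 'f', 'h')

# With a single index, itemgetter returns the bare item, not a tuple
single = itemgetter(1)(row)
print(single)  # b
```

One thing to watch: with only one index the result is a plain string, and passing a string to writerows would write each character as its own column, so this approach fits best when at least two fields are selected.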