I have a large text file, sample.txt, that contains more than 54,000 columns. They are ordered as follows:
1011001 1 1001164 981328 1 -9 A G G G G G C C A . . . .
1011002 1 1001164 981328 1 -9 A G G G G G A C A . . . .
I need to rearrange the columns into the following order:
1 1011001 1001164 981328 1 -9 A G G G G G C C A . . . .
1 1011002 1001164 981328 1 -9 A G G G G G A C A . . . .
In other words, I want the second column to become the first.
Can I do this with Python?
Answer 0 (score: 2)
Try this:
elements = []
with open(filename, "r") as f:
    for e in f:
        # Strip the trailing newline so it is not doubled when the lines are joined below.
        line = e.rstrip("\n").split(" ")
        # Swap the first two columns.
        line[0], line[1] = line[1], line[0]
        elements.append(" ".join(line))
with open(filename, "w") as f:
    f.write("\n".join(elements) + "\n")
Alternatively, if the code above crashes because of the file size, you can write each line to a second file as you go, like this:
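with open(filename, "r") as f:
    with open(filename2, "w") as f2:
        # Iterate over the file object directly so only one line is held in memory at a time.
        for e in f:
            # Strip the trailing newline; it is added back explicitly when writing.
            line = e.rstrip("\n").split(" ")
            # Swap the first two columns.
            line[0], line[1] = line[1], line[0]
            f2.write(" ".join(line) + "\n")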
Here filename2 is the name of a different file. After running the code, replace filename with filename2 and you are done.
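The final "replace filename with filename2" step can also be done from Python. The following is a minimal sketch, not part of the original answer: the function name swap_first_two_columns is hypothetical, and it assumes the columns are separated by single spaces as in the sample. It writes to a temporary file in the same directory and then uses os.replace to swap it in.

```python
import os
import tempfile


def swap_first_two_columns(path):
    """Rewrite `path` with its first two space-separated columns swapped.

    Hypothetical helper: assumes single-space separators, as in the sample.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so os.replace
    # does not have to move data across filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path, "r") as src:
            for row in src:  # one line in memory at a time
                cols = row.rstrip("\n").split(" ")
                if len(cols) >= 2:
                    cols[0], cols[1] = cols[1], cols[0]
                dst.write(" ".join(cols) + "\n")
        # Both files are closed here; swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file and re-raise the error.
        os.remove(tmp_path)
        raise
```

Calling swap_first_two_columns("sample.txt") would then update the file without a manual rename. The original is only overwritten after the new file has been written completely.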
Answer 1 (score: 2)
A list comprehension (this version keeps only the first three columns):
with open(filename, 'r') as f:
    # Split each line once, then swap columns 1 and 2 and keep column 3.
    l = [' '.join([c[1], c[0], c[2]]) + '\n' for c in (i.split() for i in f)]
with open(filename, 'w') as f:
    f.writelines(l)
Or, in this case, to keep all of the remaining columns:
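with open(filename, 'r') as f:
    # The slice c[2:] is a list, so it must be concatenated rather than
    # nested inside the list passed to join (nesting raises a TypeError).
    l = [' '.join([c[1], c[0]] + c[2:]) + '\n' for c in (i.split() for i in f)]
with open(filename, 'w') as f:
    f.writelines(l)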
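With tens of thousands of columns per line, splitting the whole line just to swap two fields does more work than necessary. As a sketch of one alternative (the helper name swap_first_two is made up for this example), str.split accepts a maxsplit argument, so only the first two fields need to be separated and the rest of the line can be passed through untouched:

```python
def swap_first_two(row):
    """Return `row` with its first two whitespace-separated fields swapped."""
    # At most three pieces: column 1, column 2, and the untouched remainder.
    parts = row.split(None, 2)
    if len(parts) < 2:
        return row  # fewer than two columns: nothing to swap
    swapped = parts[1] + " " + parts[0]
    if len(parts) == 3:
        # The remainder still carries the original line ending, if any.
        return swapped + " " + parts[2]
    # Exactly two columns: split() dropped the newline, so restore it.
    return swapped + ("\n" if row.endswith("\n") else "")
```

It could be used in place of the comprehension body, for example `l = [swap_first_two(i) for i in f]`.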
Answer 2 (score: 2)
For 54,000 columns, I would use a regular expression, which is fast:
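import re
with open('sample.txt', 'r') as f_in, open('sample_out.txt', 'w', newline='') as f_out:
    # Iterate over the file object directly instead of loading every line with readlines().
    for line in f_in:
        # Collect every run of non-whitespace characters, i.e. the columns.
        g = re.findall(r'[^\s]+', line)
        if g:  # skip blank lines
            # Swap the first two columns and keep the rest in order.
            f_out.write(' '.join([g[1], g[0]] + g[2:]) + '\n')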
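Whether the regular expression is actually faster than a plain split is easy to measure. The snippet below is only a rough sketch under stated assumptions: the wide row is synthetic (54,000 copies of "A"), not the real file, and the timings will vary by machine and Python version. It also checks that, for space-separated data like the sample, `re.findall(r'[^\s]+', ...)` and `str.split()` yield the same tokens.

```python
import re
import timeit

# A short row taken from the sample data in the question.
row = "1011001 1 1001164 981328 1 -9 A G G G G G C C A\n"

# Both approaches should produce the same list of columns.
tokens_regex = re.findall(r'[^\s]+', row)
tokens_split = row.split()
same_tokens = tokens_regex == tokens_split

# Synthetic wide row (an assumption for illustration, not the real file).
wide_row = " ".join(["A"] * 54000) + "\n"

# Time 20 tokenizations with each approach.
t_regex = timeit.timeit(lambda: re.findall(r'[^\s]+', wide_row), number=20)
t_split = timeit.timeit(lambda: wide_row.split(), number=20)

print(f"same tokens: {same_tokens}")
print(f"regex: {t_regex:.4f}s  split: {t_split:.4f}s")
```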