Question

I have a large txt file, sample.txt, with more than 54,000 columns. They are arranged like this:

1011001 1 1001164 981328 1 -9 A G G G G G C C A . . . .    
1011002 1 1001164 981328 1 -9 A G G G G G A C A . . . .

I need to rearrange the columns into the following order:

1 1011001 1001164 981328 1 -9 A G G G G G C C A . . . .
1 1011002 1001164 981328 1 -9 A G G G G G A C A . . . .

In other words, I want the second column to become the first.

Can I do this with Python?

Answer 1

Try this:

filename = "sample.txt"

elements = []
with open(filename, "r") as f:
    for e in f:
        # split() without arguments also drops the trailing newline
        line = e.split()
        # swap the first two columns
        line[0], line[1] = line[1], line[0]
        elements.append(" ".join(line))
with open(filename, "w") as f:
    f.write("\n".join(elements) + "\n")

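Alternatively, if the code above crashes because of the file size, you can do everything in a single pass, writing each line to a second file as you go: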

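filename = "sample.txt"
filename2 = "sample_swapped.txt"  # any other file name

with open(filename, "r") as f:
    with open(filename2, "w") as f2:
        # iterate line by line so the whole file is never held in memory
        for e in f:
            line = e.split()  # also drops the trailing newline
            line[0], line[1] = line[1], line[0]
            f2.write(" ".join(line) + "\n")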

...where filename2 is a different file name. After running the code, replace filename with filename2 and you are done.
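The final "replace filename with filename2" step can also be done from Python. The following is only a sketch, not part of the original answer: the function name swap_first_two_columns_in_place is hypothetical, and it assumes columns are separated by whitespace and that collapsing runs of spaces into one is acceptable. It writes to a temporary file in the same directory and then swaps it in with os.replace.

```python
import os
import tempfile


def swap_first_two_columns_in_place(path):
    """Rewrite `path` with its first two whitespace-separated columns swapped.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so os.replace
    # does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as dst, open(path, "r") as src:
            for line in src:
                fields = line.split()
                # Lines with fewer than two columns are written back as they are.
                if len(fields) >= 2:
                    fields[0], fields[1] = fields[1], fields[0]
                dst.write(" ".join(fields) + "\n")
        # Swap the finished file in for the original.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the partial temporary file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling swap_first_two_columns_in_place("sample.txt") would then leave a single file on disk with the columns swapped; the original is only overwritten once the new file has been written completely.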

Answer 2

With a list comprehension (this version keeps only the first three columns):

with open(filename,'r') as f:
   l=[' '.join([i.split()[1],i.split()[0],i.split()[2]])+'\n' for i in f.readlines()]
with open(filename,'w') as f:
   f.writelines(l)

Or, for this case, where every column after the second has to be kept:

with open(filename,'r') as f:
   # split each line once; the slice c[2:] is a list, so concatenate it rather than nesting it inside join()
   l=[' '.join([c[1],c[0]]+c[2:])+'\n' for c in (i.split() for i in f)]
with open(filename,'w') as f:
   f.writelines(l)

Answer 3

With 54,000 columns I would use a regular expression, which is fast:

import re

with open('sample.txt', 'r') as f_in, open('sample_out.txt', 'w', newline='') as f_out:
    for line in f_in:  # iterate lazily instead of loading every line with readlines()
        g = re.findall(r'[^\s]+', line)
        if g:
            f_out.write(' '.join([g[1], g[0]] + g[2:]) + '\n')
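Since only the first two columns move, another option is to split each line at most twice and leave the remaining columns as one untouched string. This is a sketch rather than part of the answer above: the names swap_first_two and rewrite are hypothetical, and no timing was measured, so whether it beats the regular expression on a given file is an assumption to verify.

```python
def swap_first_two(line):
    """Return `line` with its first two whitespace-separated columns swapped.

    Hypothetical helper for illustration only.
    """
    # maxsplit=2 yields at most three pieces: column 1, column 2, and the
    # rest of the line as a single string (its inner spacing is preserved).
    parts = line.split(None, 2)
    if len(parts) < 2:
        return line  # blank or single-column line: leave unchanged
    parts[0], parts[1] = parts[1], parts[0]
    # rstrip() drops the trailing newline and any trailing spaces from the remainder.
    return " ".join(part.rstrip() for part in parts) + "\n"


def rewrite(src_path, dst_path):
    """Stream src_path to dst_path line by line, swapping the first two columns."""
    with open(src_path, "r") as f_in, open(dst_path, "w") as f_out:
        for line in f_in:
            f_out.write(swap_first_two(line))
```

For example, rewrite("sample.txt", "sample_out.txt") would produce the reordered file without building a 54,000-element list for every line.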
