Removing duplicate lines from a text file

Time: 2014-03-20 08:25:37

Tags: python

I need to remove duplicate lines from a txt file, i.e.:

ATOM      1  N   MET B   1      43.567   2.228  13.359  1.00159.33           N  
ATOM      2  N   MET B   1      43.391   2.228  74.594  1.00159.33           N  
ATOM      3  CA  MET B   1      42.581   2.361  14.428  1.00160.56           C  
ATOM      4  CA  MET B   1      44.377   2.361  73.525  1.00160.56           C 

So I want to delete these lines:

ATOM      2  N   MET B   1      43.391   2.228  74.594  1.00159.33           N  
ATOM      4  CA  MET B   1      44.377   2.361  73.525  1.00160.56           C 

I tried to do this with the following code, but unfortunately it does not work.

f=open("A.pdb").readlines()
lis=[]
for line in f:
    lis.append(line)
print (lis) 
length=len(lis)
element=0
array=[]
while element<length:
    if lis[element][13:16] == lis[element+1][13:16]:
        array.append(element)


for elements in array:
    lis.pop(array[elements])

1 answer:

Answer 0 (score: 1)

This version turns 'N N N CA CA N' into 'N CA N'. Is that what you are asking for?

result = []
previous_keyword = None
with open('A.pdb') as f:
    for line in f:
        # use these five lines if keyword is fixed at 3rd column, and columns are separated by whitespace
        try:
            keyword = line.split()[2]
        except IndexError:  # fewer than three columns, e.g. a blank line
            print('Line with unknown format: ' + line)
            continue

        # use this one if the keyword is fixed at position[13:16]
        #keyword = line[13:16]

        if keyword != previous_keyword:
            result.append(line)
            #result.append(line.rstrip())     use this one if you don't want trailing 'newline'
            previous_keyword = keyword

for x in result:
    print(x.rstrip('\n'))  # Python 3 print; strip the newline so it is not doubled
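
The same "keep the first line of each run" idea can also be sketched with itertools.groupby from the standard library. This is only an illustration, not part of the original answer: the function name drop_consecutive_duplicates and the in-memory sample list are hypothetical, and the key is assumed to be the atom name at position [13:16], as in the question.

```python
from itertools import groupby


def drop_consecutive_duplicates(lines, key=lambda line: line[13:16]):
    """Keep only the first line of each run of consecutive lines sharing a key.

    Hypothetical helper; the default key assumes the atom name sits at [13:16].
    """
    # groupby yields one (key, group) pair per run of equal consecutive keys;
    # next(group) takes the first line of that run.
    return [next(group) for _, group in groupby(lines, key=key)]


# Sample data copied from the question (normally read with open('A.pdb')).
sample = [
    "ATOM      1  N   MET B   1      43.567   2.228  13.359  1.00159.33           N\n",
    "ATOM      2  N   MET B   1      43.391   2.228  74.594  1.00159.33           N\n",
    "ATOM      3  CA  MET B   1      42.581   2.361  14.428  1.00160.56           C\n",
    "ATOM      4  CA  MET B   1      44.377   2.361  73.525  1.00160.56           C\n",
]

kept = drop_consecutive_duplicates(sample)
for line in kept:
    print(line, end="")
```

Like the loop above, this only collapses adjacent lines with the same key; a later 'N' line that follows a 'CA' line is kept.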

The reason your program "hangs and never finishes": in this loop you never increment 'element', so the while condition stays true forever.

while element<length:
    if lis[element][13:16] == lis[element+1][13:16]:
        array.append(element)
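
A minimal corrected sketch of that loop, keeping the question's variable names, might look like the following. It rests on a few assumptions beyond the missing increment: the loop stops one element early so that lis[element + 1] always exists, it records the index of the later line of each matching pair (the question wants lines 2 and 4 removed), and it pops from the back with the index itself rather than array[elements]. The function name duplicate_indices is hypothetical.

```python
def duplicate_indices(lis):
    """Return indices of lines whose [13:16] field equals that of the previous line."""
    array = []
    element = 0
    length = len(lis)
    # Stop one short of the end so lis[element + 1] never goes out of range.
    while element < length - 1:
        if lis[element][13:16] == lis[element + 1][13:16]:
            array.append(element + 1)  # mark the later line of the pair
        element += 1                   # the increment the original loop lacked
    return array


# Sample data copied from the question (normally open("A.pdb").readlines()).
lis = [
    "ATOM      1  N   MET B   1      43.567   2.228  13.359  1.00159.33           N\n",
    "ATOM      2  N   MET B   1      43.391   2.228  74.594  1.00159.33           N\n",
    "ATOM      3  CA  MET B   1      42.581   2.361  14.428  1.00160.56           C\n",
    "ATOM      4  CA  MET B   1      44.377   2.361  73.525  1.00160.56           C\n",
]

array = duplicate_indices(lis)
# Pop from the highest index down so earlier removals do not shift later ones.
for index in reversed(array):
    lis.pop(index)
```

Removing in reverse order matters: popping index 1 first would move the line at index 3 down to index 2, and the second pop would then hit the wrong line or raise IndexError.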