Question

#!/usr/bin/python
import os
path=os.getcwd()
print(path)
list_of_filenames=os.listdir(path+'//newfiles')
print(list_of_filenames)
residue=[]
for f in list_of_filenames:
        f1=open(path+'//newfiles//'+f).readlines()
        for line in f1:
                if line.startswith('HETATM'):
                        res_number=line[22:26]
                        if res_number not in residue and line[17:20]=='HOH':
                                residue.append(res_number)
                        else:
                                continue
                else:
                        continue
print(len(residue))

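With the script above I get the total number of 'HOH' molecules across all files as a single value. What I need is the number of 'HOH' molecules in each file.

Please explain how to change the script to do that.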

Answer 1

A minimal modification that gets the count for each file.

for f in list_of_filenames:
        residue=[]  # changed: start a fresh list for every file
        f1=open(path+'//newfiles//'+f).readlines()
        for line in f1:
                if line.startswith('HETATM'):
                        res_number=line[22:26]
                        if res_number not in residue and line[17:20]=='HOH':
                                residue.append(res_number)
        print('%s: %d' % (f, len(residue)))  # changed: print the count for this file

Answer 2

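The answer below stores every file name in a dictionary and keeps a count of occurrences per file. Go ahead and test it, and leave a comment if you need more input.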

#!/usr/bin/python
import os

di = {}  # file name -> number of distinct HOH residue numbers in that file

path=os.getcwd()
print(path)
list_of_filenames=os.listdir(path+'//newfiles')
print(list_of_filenames)
for f in list_of_filenames:
        f1=open(path+'//newfiles//'+f).readlines()
        residue=[]  # reset per file, so equal residue numbers in different files are both counted
        di[f] = 0
        for line in f1:
                if line.startswith('HETATM'):
                        res_number=line[22:26]
                        if res_number not in residue and line[17:20]=='HOH':
                                residue.append(res_number)
                                di[f] += 1
print(sum(di.values()))  # total over all files
print(di)  # count per file
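
For reference, here is a minimal Python 3 sketch of the same dictionary idea wrapped in a function. The name count_hoh_per_file and its directory argument are assumptions made for illustration; they do not appear in the original answer. It keeps the column slices from the question (line[17:20] for the residue name, line[22:26] for the residue number) and tracks seen residue numbers per file, so the same number in two different files is counted in both.

```python
import os


def count_hoh_per_file(directory):
    """Return {file name: number of distinct HOH residue numbers in that file}."""
    counts = {}
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if not os.path.isfile(full_path):
            continue  # skip sub-directories
        seen = set()  # residue numbers already counted in this file only
        with open(full_path) as handle:  # the with block closes the file
            for line in handle:
                if line.startswith("HETATM") and line[17:20] == "HOH":
                    seen.add(line[22:26])
        counts[name] = len(seen)
    return counts
```

Calling count_hoh_per_file(os.path.join(os.getcwd(), "newfiles")) would give one count per file in the question's folder, and sum(...) over the returned values gives the overall total.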

Answer 3

Hint: defaultdict.

Each file should be tracked separately. To cut down on boilerplate, use defaultdict(set), which essentially stores one set per file.

#!/usr/bin/python
import collections
import os
path=os.getcwd()
list_of_filenames=os.listdir(path+'//newfiles')
residue = collections.defaultdict(set)  # file name -> set of HOH residue numbers
for f in list_of_filenames:
    with open(path+'//newfiles//'+f) as f1:  # the with block closes the file
        for line in f1:
            if line.startswith('HETATM') and line[17:20]=='HOH':
                # a set ignores duplicates, so no membership test is needed
                residue[f].add(line[22:26])
print(residue)
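
To show how the per-file sets turn into the counts the question asks for, here is a small Python 3 sketch that works on lines already held in memory. The function name water_residues_by_file and its lines_by_file argument (a dict mapping a file name to its lines) are hypothetical, chosen only for this illustration.

```python
import collections


def water_residues_by_file(lines_by_file):
    """Map each file name to the set of HOH residue numbers in its lines."""
    residues = collections.defaultdict(set)
    for name, lines in lines_by_file.items():
        found = residues[name]  # touching the key creates an empty set
        for line in lines:
            if line.startswith("HETATM") and line[17:20] == "HOH":
                found.add(line[22:26].strip())
    return residues


def counts_from_sets(residues):
    """Turn {file name: set} into {file name: number of distinct residues}."""
    return {name: len(numbers) for name, numbers in residues.items()}
```

Because every file name is touched once, a file without any water still shows up with a count of 0 instead of being missing from the result.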

How to count the total number of 'HOH' molecules in each file using Python
