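I am trying to print a predefined sequence of lines from a PDB input file in Python, but I am not getting the expected result. I am new to Python. I also change into the input directory, but that does not help: nothing is displayed and I cannot find the error. The script just runs without producing any output.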
import os
os.chdir('C:\Users\Vishnu\Desktop\Test_folder\Input')
for path, dirs, pdbfile in os.walk('/C:\Users\Vishnu\Desktop\Test_folder\Input'):
    for line in pdbfile:
        if line[:6] != "HETATM":
            continue
        chainID = line[21:22]
        atomID = line[13:16].strip()
        if chainID not in ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'):
            continue
        if atomID not in ('C4B', 'O4B', 'C1B', 'C2B', 'C3B'):
            continue
with open('C:\Users\Vishnu\Desktop\Test_folder\Input', 'r') as fh:
    new = [line.rstrip() for line in fh]
with open('C:\Users\Vishnu\Desktop\Test_folder\Output', 'w') as fh:
    [fh.write('%s\n' % line) for line in new]
    fh.write((line.rstrip()))
Expected output:
HETATM 3788 C4B NAI A 302 52.695 15.486 8.535 1.00 57.28 C
HETATM 3789 O4B NAI A 302 52.258 14.631 7.456 1.00 56.26 O
HETATM 3794 C1B NAI A 302 53.348 13.816 7.022 1.00 53.44 C
HETATM 3792 C2B NAI A 302 54.537 14.748 7.190 1.00 50.93 C
HETATM 3789 O4B NAI A 302 52.258 14.631 7.456 1.00 56.26 O
HETATM 3794 C1B NAI A 302 53.348 13.816 7.022 1.00 53.44 C
HETATM 3792 C2B NAI A 302 54.537 14.748 7.190 1.00 50.93 C
HETATM 3790 C3B NAI A 302 54.225 15.525 8.465 1.00 52.99 C
HETATM 3794 C1B NAI A 302 53.348 13.816 7.022 1.00 53.44 C
HETATM 3792 C2B NAI A 302 54.537 14.748 7.190 1.00 50.93 C
HETATM 3790 C3B NAI A 302 54.225 15.525 8.465 1.00 52.99 C
HETATM 3788 C4B NAI A 302 52.695 15.486 8.535 1.00 57.28 C
HETATM 3792 C2B NAI A 302 54.537 14.748 7.190 1.00 50.93 C
HETATM 3790 C3B NAI A 302 54.225 15.525 8.465 1.00 52.99 C
HETATM 3788 C4B NAI A 302 52.695 15.486 8.535 1.00 57.28 C
HETATM 3789 O4B NAI A 302 52.258 14.631 7.456 1.00 56.26 O
HETATM 3790 C3B NAI A 302 54.225 15.525 8.465 1.00 52.99 C
HETATM 3788 C4B NAI A 302 52.695 15.486 8.535 1.00 57.28 C
HETATM 3789 O4B NAI A 302 52.258 14.631 7.456 1.00 56.26 O
HETATM 3794 C1B NAI A 302 53.348 13.816 7.022 1.00 53.44 C
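Chain B should be printed in the same format.
How can I print a predefined sequence? Does line[21:22] hold the chain ID? The chain ID can be anything from A to H. How do I define the chain IDs A to H?
I cannot get the lines printed in order. Can anyone tell me how to print a predefined sequence in Python?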
After the answer:
I have updated the code above with the following:
n = 4
for chain, atoms in d.items():
    for atom, line in atoms.items():
        for i in range(len(atom)-n+1):
            for j in range(n):
                print d[chain][atomIDs[i+j]]
            print
I want to extend the output by two more segments, but I am not getting the expected output.
Answer 0 (score: 1)
Here are all of my comments merged into a single answer:
with open('1AHI.pdb') as pdbfile:
    for line in pdbfile:
        # Keep only HETATM records
        if line[:6] != "HETATM":
            continue
        chainID = line[21:22]
        atomID = line[13:16].strip()
        if chainID not in ('A', 'B'):
            continue
        if atomID not in ('C4B', 'O4B', 'C1B', 'C2B', 'C3B'):
            continue
        ## Either (the line still carries its own newline):
        print(line, end='')
        ## Or:
        # print(line.rstrip(), end='\n')
        ## Or if Python2.x:
        # print line.rstrip()
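As for why the code in the question prints nothing: os.walk yields (dirpath, dirnames, filenames) tuples, so the name pdbfile there is a list of file names, and "for line in pdbfile" loops over names rather than over the contents of any file. No file name starts with "HETATM", so every iteration hits continue. Below is a minimal sketch of one way to walk a folder and apply the same filter to each file. The helper names iter_pdb_paths and filter_hetatm are made up for this example, and the ".pdb" extension check is an assumption about how the input files are named.

```python
import os

CHAIN_IDS = set("ABCDEFGH")
ATOM_IDS = {"C4B", "O4B", "C1B", "C2B", "C3B"}


def iter_pdb_paths(root):
    """Yield the full path of every .pdb file below root (hypothetical helper)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Assumption: input files carry a .pdb extension.
            if name.lower().endswith(".pdb"):
                yield os.path.join(dirpath, name)


def filter_hetatm(lines, chain_ids=CHAIN_IDS, atom_ids=ATOM_IDS):
    """Return the HETATM lines whose chain ID and atom name are wanted."""
    kept = []
    for line in lines:
        if line[:6] != "HETATM":
            continue
        if line[21:22] not in chain_ids:
            continue
        if line[13:16].strip() not in atom_ids:
            continue
        kept.append(line.rstrip())
    return kept


if __name__ == "__main__":
    # Folder taken from the question; the raw string keeps the backslashes literal.
    root = r"C:\Users\Vishnu\Desktop\Test_folder\Input"
    for path in iter_pdb_paths(root):
        with open(path) as pdbfile:
            for line in filter_hetatm(pdbfile):
                print(line)
```

Note that open() has to be given the path of a file, not of the folder, which is why the path is joined from dirpath and the file name.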
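I wrote my first lines of code more than 10 years ago, parsing PDB files. Don't despair. You have a long and wonderful journey ahead of you.
P.S. I believe mmCIF is preferred over PDB these days... Make sure you read the specifications of both file formats.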
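On the chain ID question: in the fixed-column PDB layout the chain ID sits in column 22, which is line[21:22] in Python's zero-based slicing, and the atom name occupies columns 13 to 16. Here is a small sketch of reading those leading fields. The HetAtom tuple and parse_hetatm are names invented for this example, and the sample record is the first line of the expected output re-padded by hand to the fixed columns, so its exact spacing is an assumption.

```python
from collections import namedtuple
import string

HetAtom = namedtuple("HetAtom", "serial name res_name chain_id res_seq")

# Chains A to H, taken from the alphabet instead of being typed out.
CHAIN_IDS = frozenset(string.ascii_uppercase[:8])


def parse_hetatm(line):
    """Split the leading fixed-width fields of a HETATM record.

    Returns None for any other record type.
    """
    if line[:6] != "HETATM":
        return None
    return HetAtom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21:22],          # column 22
        res_seq=int(line[22:26]),      # columns 23-26
    )


# Hypothetical record, padded by hand to the fixed columns.
record = ("HETATM 3788  C4B NAI A 302      "
          "52.695  15.486   8.535  1.00 57.28           C")
atom = parse_hetatm(record)
print(atom.chain_id, atom.name, atom.chain_id in CHAIN_IDS)
```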
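I have updated the answer, but please note that this site is for solving specific problems, not for having other people do your work for you. Such requests are generally not well received here.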
d = {}  # d[chainID][atomID] -> matching HETATM line
chainIDs = ('A', 'B',)
atomIDs = ('C4B', 'O4B', 'C1B', 'C2B', 'C3B', 'C4B')
with open('1AHI.pdb') as pdbfile:
    for line in map(str.rstrip, pdbfile):
        if line[:6] != "HETATM":
            continue
        chainID = line[21:22]
        atomID = line[13:16].strip()
        if chainID not in chainIDs:
            continue
        if atomID not in atomIDs:
            continue
        try:
            d[chainID][atomID] = line
        except KeyError:
            d[chainID] = {atomID: line}
# Print every window of n consecutive atoms, one block per window.
# Python 3 calls; on Python 2.x use the print statement instead.
n = 4
for chainID in chainIDs:
    for i in range(len(atomIDs)-n+1):
        for j in range(n):
            print(d[chainID][atomIDs[i+j]])
        print()
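With atomIDs holding six entries and n = 4, the loop above produces three windows per chain, while the expected output in the question shows five. One possible way to get all five is to list each ring atom once and wrap the index around with the modulo operator. This is only a sketch: ring_windows and print_windows are made-up helper names, and lines_by_atom stands for one chain's inner dictionary, such as d['A'] above.

```python
def ring_windows(atom_ids, n):
    """Return one window of n consecutive names per starting atom,
    wrapping around the end of the ring."""
    size = len(atom_ids)
    return [
        [atom_ids[(start + offset) % size] for offset in range(n)]
        for start in range(size)
    ]


def print_windows(lines_by_atom, atom_ids, n):
    """Print the stored line for every atom of every window,
    with a blank line after each window."""
    for window in ring_windows(atom_ids, n):
        for name in window:
            print(lines_by_atom[name])
        print()


ring = ("C4B", "O4B", "C1B", "C2B", "C3B")  # each ring atom listed once
for window in ring_windows(ring, 4):
    print(" ".join(window))
# C4B O4B C1B C2B
# O4B C1B C2B C3B
# C1B C2B C3B C4B
# C2B C3B C4B O4B
# C3B C4B O4B C1B
```

Calling print_windows(d[chainID], ring, 4) for each chain ID would then print the stored lines in that order.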