I have the following function (which belongs to a class):
    import Bio
    from bioservices import KEGGParser, UniProt, QuickGO

    def locate_common_GO(self,list_of_genes,GO_term):
        #initialize variables and classes
        q = QuickGO()
        a = Retrieve_Data()
        b=[]
        #get the uniprot IDS using hugo2uniprot. hugo2uniprot is a custom method of my Retrieve_Data class (which uses bioservices module) simply for getting a uniprot ID from a gene symbol.
        for i in range(0,len(list_of_genes)):
            b.append(a.hugo2uniprot(list_of_genes[i],'hsa'))
            print 'Gene: {} \t UniProtID: {}'.format(list_of_genes[i],b[i])
        #search for GO terms and store as dictionary. Keys are the gene name and a list of GO terms are values.
        GO_dict = {}
        for i in range(0,len(b)):
            q = QuickGO()
            GO_dict[list_of_genes[i]]= q.Annotation(protein=b[i], frmt="tsv", _with=True,tax=9606, source="UniProt", col="goName")
        keys = GO_dict.keys()
        #This bit should search the dictionary values for a term supplied by the user (stored in the variable 'GO_Term').
        #If the user supplied term is present in the retrieved list of GO terms I want it to add the dictionary key (i.e. the gene name) to a list named 'matches'.
        matches = []
        for gene in range(0,len(keys)):
            if GO_term in GO_dict[keys[gene]].splitlines():
                matches.append(keys[i])
        return matches
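The problem is that, even when I supply a list of genes known to share a GO term, the function's output is always the same gene name repeated. For example, 'TGFB1' and 'COL9A2' both have the GO term 'proteinaceous extracellular matrix', yet the output is the list ['COL9A2', 'COL9A2'] when it should be ['COL9A2', 'TGFB1']. Does anyone have any suggestions for how to fix this program? I think I'm close, but I can't find the solution.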
Answer 0 (score: 1)
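You always append keys[i] to matches, but i does not change inside that loop: it still holds the last value it was given by the earlier for loop, so you append the same item every time. You probably want to append keys[gene] instead.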
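As a minimal sketch of that fix, the matching step can iterate over the dictionary's items directly, so there is no index variable left over from an earlier loop to reuse by mistake. This also avoids indexing the result of keys(), which is not possible in Python 3. The helper name find_matches and the sample annotation strings are made up for illustration; they stand in for the text returned by QuickGO and are not real query results.

```python
def find_matches(go_dict, go_term):
    """Return the genes whose annotation text contains a line equal to go_term."""
    matches = []
    # Iterate over (gene, annotation text) pairs instead of indexing keys.
    for gene, annotation_text in go_dict.items():
        if go_term in annotation_text.splitlines():
            matches.append(gene)
    return matches


# Invented placeholder data, assumed to have one GO term name per line.
sample = {
    'TGFB1': 'extracellular region\nproteinaceous extracellular matrix',
    'COL9A2': 'proteinaceous extracellular matrix\ncollagen trimer',
    'GENE_X': 'nucleus',  # hypothetical gene without the searched term
}

# Sort the result because dictionary order should not be relied on here.
result = sorted(find_matches(sample, 'proteinaceous extracellular matrix'))
print(result)
```

In the original function, the same change amounts to replacing matches.append(keys[i]) with matches.append(keys[gene]).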