I've run into a problem that I can't track down and fix.
FASTA =
>header1
ATCGATCGATCCCGATCGACATCAGCATCGACTAC
ATCGACTCAAGCATCAGCTACGACTCGACTGACTACGACTCGCT
>header2
ATCGATCGCATCGACTACGACTACGACTACGCTTCGTATCAGCATCAGCT
ATCAGCATCGACGACGACTAGCACTACGACTACGACGATCCCGATCGATCAGCT
def dnaSequence():
    '''
    This function makes a dict called DNAseq by reading the fasta file
    given as first argument on the command line
    INPUT: Fasta file containing strings
    OUTPUT: key is header and value is sequence
    '''
    DNAseq = {}
    for line in FASTA:
        line = line.strip()
        if line.startswith('>'):
            header = line
            DNAseq[header] = ""
        else:
            seq = line
            DNAseq[header] = seq
    return DNAseq
def digestFragmentsWithOneEnzyme(dnaSequence):
    '''
    This function digests the sequence from DNAseq into smaller parts
    by using the enzymes listed in the MODES.
    INPUT: DNAseq and the enzymes from sys.argv[2:]
    OUTPUT: The DNAseq is updated with the segments gained from the
    digesting
    '''
    enzymes = sys.argv[2:]
    updated_list = []
    for enzyme in enzymes:
        pattern = MODES(enzyme)
        p = re.compile(pattern)
        for dna in DNAseq.keys():
            matchlist = re.findall(p,dna)
            updated_list = re.split(MODES, DNAseq)
            DNAseq.update((key, updated_list.index(k)) for key in
                          d.iterkeys())
    return DNAseq
def getMolecularWeight(dnaSequence):
    '''
    This function calculates the molWeight of the sequence in DNAseq
    INPUT: the updated DNAseq from the previous function as a dict
    OUTPUT: The DNAseq is updated with the molweight of the digested fragments
    '''
    results = []
    for seq in DNAseq.keys():
        results = sum((dnaMass[base]) for base in DNAseq[seq])
        DNAseq.update((key, results.index(k)) for key in
                      d.iterkeys())
    return DNAseq
def main(argv=None):
    '''
    This function prints the results of the digested DNA sequence in the terminal.
    INPUT: The DNAseq from the previous function as a dict
    OUTPUT: name weight weight weight
            name2 weight weight weight
    '''
    if argv is None:
        argv = sys.argv
    if len(argv) < 2:
        usage()
        return 1
    digestFragmentsWithOneEnzyme(dnaSequence())
    Genes = getMolecularWeight(digestFragmentsWithOneEnzyme())
    print ({header},{seq}).format(**DNAseq)
    return 0

if __name__ == '__main__':
    sys.exit(main())
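In the first function I'm trying to build a dict from the FASTA file. The second function then uses that same dict, splitting the sequences with a regular expression, and finally the molecular weight is calculated.
My problem is that, for some reason, Python doesn't recognize my dict and I get this error:
NameError: name 'DNAseq' is not defined
If I create the dict outside the functions, then I do have a dict.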
Answer 0 (score: 1)
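You are passing the dict into both functions as dnaSequence, not as DNAseq. Inside those functions the only name bound to the dict is the parameter dnaSequence, so every reference to DNAseq in the body raises a NameError.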
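The mismatch can be reproduced in a few lines. This is a minimal sketch, not the asker's code: the functions `broken` and `fixed` and the sample dict `records` are hypothetical names made up for illustration, and it is written for Python 3.

```python
def broken(dnaSequence):
    # The parameter is named dnaSequence, so DNAseq does not exist
    # in this scope (and there is no global with that name either).
    return len(DNAseq)


def fixed(DNAseq):
    # The body now uses the same name as the parameter.
    return len(DNAseq)


records = {">header1": "ATCG"}  # hypothetical sample data

try:
    broken(records)
    error_message = None
except NameError as exc:
    error_message = str(exc)

fixed_result = fixed(records)
print(error_message)
print(fixed_result)
```

Either renaming the parameter or renaming the references in the body works, as long as the two agree.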
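Note that this is a very odd way to call the functions. You completely ignore the result of the first call to digestFragmentsWithOneEnzyme, the one where you do pass the sequence in. Then you call it a second time to pass its result to getMolecularWeight, but that second call doesn't pass the sequence at all, so it would raise an error if you ever got that far.
I think what you are trying to do is: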
    sequence = dnaSequence()
    fragments = digestFragmentsWithOneEnzyme(sequence)
    genes = getMolecularWeight(fragments)
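To show that chaining end to end, here is a rough Python 3 sketch of the whole pipeline. Much of it is assumed rather than taken from the question: `MODES` and `DNA_MASS` are hypothetical stand-ins (the enzyme name "demo", its pattern and the rounded masses are illustrative only), the function names are made up, and the parser appends each sequence line instead of overwriting it, which is a guess at what was intended.

```python
import re

# Hypothetical stand-ins for names the question uses but never defines.
MODES = {"demo": "CCCG"}  # enzyme name -> recognition pattern (made up)
DNA_MASS = {"A": 313, "C": 289, "G": 329, "T": 304}  # rounded, illustrative

FASTA_LINES = [
    ">header1",
    "ATCGATCGATCCCGATCGACATCAGCATCGACTAC",
    "ATCGACTCAAGCATCAGCTACGACTCGACTGACTACGACTCGCT",
]


def read_sequences(lines):
    """Build {header: sequence}, joining multi-line sequences."""
    sequences = {}
    header = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            header = line
            sequences[header] = ""
        elif header is not None:
            sequences[header] += line  # append, so earlier lines are kept
    return sequences


def digest(sequences, enzyme):
    """Split each sequence on the enzyme's pattern (the matched site is dropped)."""
    pattern = re.compile(MODES[enzyme])
    return {header: pattern.split(seq) for header, seq in sequences.items()}


def molecular_weights(fragments):
    """Sum the per-base masses of every fragment."""
    return {
        header: [sum(DNA_MASS[base] for base in frag) for frag in frags]
        for header, frags in fragments.items()
    }


# Each step receives the previous step's return value explicitly.
sequence = read_sequences(FASTA_LINES)
fragments = digest(sequence, "demo")
genes = molecular_weights(fragments)

for header, weights in genes.items():
    print(header, *weights)
```

Each function returns a new dict rather than updating a shared one, so nothing depends on a name like DNAseq being visible outside the function that created it.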
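You should also avoid giving the parameter of both functions the same name as a separate function, because that hides the function name inside them. Pick a new name instead: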
    def digestFragmentsWithOneEnzyme(sequence):
        ...
        for dna in sequence:
(You don't need to call keys(); iterating over a dict always goes over its keys.)
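Both points can be checked with a short Python 3 sketch. The helper functions `shadowed` and `list_headers` and the sample data are hypothetical, added only to illustrate the behaviour.

```python
def dnaSequence():
    # Hypothetical sample data standing in for the parsed FASTA file.
    return {">header1": "ATCG", ">header2": "GGCC"}


def shadowed(dnaSequence):
    # Inside this function the parameter hides the function of the same
    # name, so dnaSequence is the dict and calling it raises TypeError.
    try:
        dnaSequence()
    except TypeError:
        return "parameter hides the function"
    return "function still reachable"


def list_headers(sequence):
    # Iterating a dict yields its keys, so .keys() is not needed.
    return [header for header in sequence]


shadow_result = shadowed(dnaSequence())
headers = list_headers(dnaSequence())
same_as_keys = headers == list(dnaSequence().keys())
print(shadow_result)
print(headers, same_as_keys)
```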