Question

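So, I'm very new to Python and I'm not sure my code is the most efficient, but I would still be very grateful if someone could explain why my script raises a "name is not defined" error when it runs. I have a list of 300 gene names in a separate file, one name per line, which I want to read in, storing each line as a string variable.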

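In the script I have a list of 600 variables: 300 labelled name_bitscore and 300 labelled name_length, one of each for every one of the 300 names. I want to filter the list based on a condition. My script looks like this: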

#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        exec("b="+line+"_bitscore")
        exec("l="+line+"_length")
        if 0.5*b <= 2*1.05*l and 0.5*b >= 2*0.95*l:
            print line
ham_pb_length=2973
ham_pb_bitscore=2165
cg2225_ph_length=3303
cg2225_ph_bitscore=2278

and so on for the rest of the length and bitscore variables.

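Basically, what I am trying to do here is read line 1 of the file "seqnames-test1-iso-legal-temp.txt", which is ham_pb. Then I want to use the exec function to create the variables b = ham_pb_bitscore and l = ham_pb_length, so that I can test whether half the gene's bitscore is within 5% of twice its length. Then I want to repeat this for every gene, i.e. for every line of the file "seqnames-test1-iso-legal-temp.txt".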

When I execute the script, I get this error message:

Traceback (most recent call last):
  File "duplicatebittest.py", line 4, in <module>
    exec("b="+line+"_bitscore")
  File "<string>", line 1, in <module>
NameError: name 'ham_pb' is not defined

I wrote another short script to make sure I was using the exec function correctly, as follows:

#!/usr/bin/python
name="string"
string_value=4
exec("b="+name+"_value")
print(name)
print(b)

which returns:

string
4

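So I know I can use exec to include a string variable in a variable declaration, because b returns 4 as expected. That is why I don't understand the error in my first script.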

I made sure that the variable line is a string by running

#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        print type(line)

which returned the line

<type 'str'>

300 times, so I know that each line variable is a string, which is why I don't understand why my test script works but this one doesn't.

Any help would be super appreciated!

Answer 1

line is produced by the text file iterator, which yields each line with its trailing newline character still attached.

So your expression:

exec("b="+line+"_bitscore")

passes this to exec:

b=ham_pb
_bitscore

Strip the line and this will work:

exec("b="+line.rstrip()+"_bitscore")

provided the variables are already declared, which means moving the following lines before the loop:

ham_pb_length=2973
ham_pb_bitscore=2165
cg2225_ph_length=3303
cg2225_ph_bitscore=2278
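If you want to see the newline for yourself, here is a minimal sketch. It assumes Python 3 and uses io.StringIO as a stand-in for the real file, so the two names in it are just sample data; repr makes the otherwise invisible "\n" show up.

```python
import io

# io.StringIO stands in for the real names file (assumption: one name per line)
fake_file = io.StringIO("ham_pb\ncg2225_ph\n")

raw_lines = [line for line in fake_file]          # lines exactly as the iterator yields them
stripped = [line.rstrip() for line in raw_lines]  # same lines without trailing whitespace

print(repr(raw_lines[0]))  # repr shows the trailing newline
print(repr(stripped[0]))

# The source string that exec receives, before and after stripping
broken_source = "b=" + raw_lines[0] + "_bitscore"
fixed_source = "b=" + stripped[0] + "_bitscore"
print(repr(broken_source))
print(repr(fixed_source))
```

The broken string contains two statements, b=ham_pb and then _bitscore on its own line, which is why Python complains that ham_pb is not defined.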

Better: stop using exec and use a dictionary, so that you don't have to define variables dynamically.
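A rough sketch of what that could look like, assuming Python 3. The nested layout (one entry per gene name), the helper within_tolerance, and the io.StringIO stand-in for the names file are all my own choices for illustration, and I am assuming every name in the file has both a length and a bitscore.

```python
import io

# Hypothetical layout: one dictionary keyed by gene name instead of 600 variables
genes = {
    "ham_pb": {"length": 2973, "bitscore": 2165},
    "cg2225_ph": {"length": 3303, "bitscore": 2278},
}


def within_tolerance(length, bitscore, tolerance=0.05):
    """The question's test: half the bitscore lies within 5% of twice the length."""
    half_bitscore = 0.5 * bitscore
    lower = 2 * (1 - tolerance) * length
    upper = 2 * (1 + tolerance) * length
    return lower <= half_bitscore <= upper


# io.StringIO stands in for the real names file; note the blank last line
names_file = io.StringIO("ham_pb\ncg2225_ph\n\n")

matches = []
for line in names_file:
    name = line.strip()  # drop the trailing newline
    if not name:         # skip blank lines
        continue
    record = genes[name]
    if within_tolerance(record["length"], record["bitscore"]):
        matches.append(name)

print(matches)
```

A name that is missing from the dictionary raises a KeyError that names the offending key, which is usually easier to debug than a NameError coming out of exec.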

Answer 2

Put #!/usr/bin/env python as the first line. See this question for a detailed explanation.

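As Jean pointed out, exec is not the right tool for this job. You should use a dictionary instead, because dictionaries are less dangerous (search for "code injection") and easier to read. Here is an example of how to use dictionaries, taken from the Python documentation: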

>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> list(tel.keys())
['irv', 'guido', 'jack']
>>> sorted(tel.keys())
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False

Here is the way I can think of to accomplish your goal:

gene_data = {'ham_pb_length': 2973, 'ham_pb_bitscore': 2165,
             'cg2225_ph_length': 3303, 'cg2225_ph_bitscore': 2278}
# Maybe you have more of these gene data entries. If so,
# just append them to the end of the dictionary literal above.
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        name = line.strip()
        if name:  # skip blank lines
            bitscore = gene_data[name + '_bitscore']
            length = gene_data[name + '_length']
            # Same test as in the question, with both sides divided by 2
            if 0.95 * length <= bitscore / 4.0 <= 1.05 * length:
                print(name)

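I am taking advantage of a couple of useful Python features here. In Python 3, 5/7 evaluates to 0.7142857142857143 rather than the 0 that is typical of many programming languages. If you want integer division in Python 3, use 5//7. Also, in Python 1<2<3 evaluates to True and 1<3<2 evaluates to False, whereas in many programming languages 1<2<3 is evaluated as True<3, which either raises an error or evaluates to True depending on the language.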
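A quick way to check both behaviours yourself, assuming Python 3 (the variable names are only for illustration):

```python
# True division vs. floor division in Python 3
true_div = 5 / 7
floor_div = 5 // 7

# A chained comparison a < b < c means (a < b) and (b < c)
chained_ok = 1 < 2 < 3
chained_bad = 1 < 3 < 2

# What strict left-to-right evaluation, as in C, would compute instead:
# (1 < 3) is True, and True < 2 compares 1 < 2
c_style = (1 < 3) < 2

print(true_div, floor_div)
print(chained_ok, chained_bad, c_style)
```

The last value shows why the chained form matters: grouped the C way, 1<3<2 would come out as True.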

"name is not defined" error in Python when reading a file line by line
