Question

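Suppose I want to use a for loop to automatically generate indexes for a large header row, so that I don't have to write out an index for each header by hand.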

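In a file, I have a header with lots of fruit names. Each column holds data that I have to access by index for downstream parsing. Instead of preparing an index for each fruit name by hand, I want to run a for loop that creates the index values dynamically to save time.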

data = 

      apple                     banana              orange
      genus:x,species:b    genus:x,species:b     genus:x,species:b
      genus:x,species:b    genus:x,species:b     genus:x,species:b
      variety:gala,pinklady,...  variety:wild,hybrid...   variety:florida,venz,
      flavors:tangy,tart,sweet..
      global_consumption:....
      pricePerUnit:...
      seedstocks:.....
      insect_resistance:.....
      producer:....


# first I convert the header into list like this:

for lines in data:
    if 'apple' in lines:
        fruits = lines.split('\t')
        # this will give me header as list:
        # ['apple', 'banana', 'orange']

        # then create the index as:           
        for x in fruits:
            str(x) + '_idx' = fruits.index(x)  
            # this is where the problem is for me .. !??   
            # .. because this is not valid python method
            print(x)

            # if made possible, new variable are created as
            apple_idx = 0, banana_idx = 1 ... so on

# Now, start mining your data for interested fruits
     data = lines.split('\t')
     apple_values = data[apple_idx]
     for values in apple_values:
          do something ......

     same for others. I also need to do several other things.

Make sense??

How can I achieve this, in a very simple way?

Post edit: after a lot of reading, I realized that in bash it is possible to create a variable_name from the value (string) of another variable:

how to use a variable's value as other variable's name in bash

https://unix.stackexchange.com/questions/98419/creating-variable-using-variable-value-as-part-of-new-variable-name

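However, as I originally suspected, this is not possible in Python. My intuition is that such a method could be made available in the Python language (if hacked in, or if the authors decided to add it), but the authors of Python may well have thought about it and known the possible dangers of using it.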

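The danger is that you always want to see the variable_name in the Python script you write. Creating dynamic variable_names would have been nice, but if anything goes wrong it may cause trouble when tracing back.
Because the variable name was never typed out, tracking and debugging would be a nightmare if anything went wrong (especially in a big program), for example when the variable_value is something like 2BetaTheta or *ping^pong, which is not a valid variable_name. That's my thinking. Could others who know explain why this feature was not introduced in Python?
The dict method settles this issue, because we have a record of where the variable_name came from, but the problem of valid vs. invalid variable_names still doesn't go away.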
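
To illustrate the point about valid and invalid names, here is a minimal sketch (the sample names and the variables index_by_name and is_valid_name are made up for illustration). Any string can be a dict key, while str.isidentifier() reports whether it could also serve as a variable name:

```python
# Hypothetical header names, two of which are not valid Python identifiers
names = ["apple", "2BetaTheta", "*ping^pong"]

# Any string works as a dict key, so every name gets an index
index_by_name = {name: idx for idx, name in enumerate(names)}

# str.isidentifier() tells us which names could be real variable names
is_valid_name = {name: name.isidentifier() for name in names}

print(index_by_name["*ping^pong"])
print(is_valid_name)
```

With a dict, the lookup works even for the names that could never be typed as variables.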

I will take some of the provided answers that use the dict method and see if I can work out a very simple, comprehensive way to achieve this.

Thanks everyone!

Answer 1

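Hopefully the code below gives you some ideas about how to proceed. There are actually better ways to do these things, but for a beginner it's best to learn the basics first. Please note: there is nothing wrong with the code below, but it could be shorter, and even more useful, if we used more advanced concepts.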

# get the headers from the first line out of the data
# this won't work if the headers are not on the first line
fruits = data[0].split('\t')

# now you have this list, as before
>>> ['apple', 'banana', 'orange']

# make a dictionary that will hold a data list
# for each fruit; these lists will be empty to start
# each fruit's list will hold the data appearing on 
# each line in the data file under each header
data_dict = dict()
for fruit in fruits:
    data_dict[fruit] = [] # an empty list

# now you have a dictionary that looks like this
>>> {'apple': [], 'banana': [], 'orange': []}

# you can access the (now empty) lists this way
>>> data_dict['apple']
[]

# now use a for loop to go through the data, but skip the 
# first line which you already handled
for lines in data[1:]:
    values = lines.split('\t')
    # append the values to the end of the list for each 
    # fruit. use enumerate so you know the index number
    for idx,fruit in enumerate(fruits):
        data_dict[fruit].append(values[idx])

# now you have the data dictionary that looks like this
>>> {'apple': ['genus:x,species:b', 'genus:x,species:b'], 
     'banana': ['genus:x,species:b', 'genus:x,species:b'], 
     'orange': ['genus:x,species:b', 'genus:x,species:b']}

print("<<here's some interesting data about apples>>")
# Mine the data_dict for interesting fruits this way
data_list = data_dict['apple']
for data_line in data_list:
    genus_and_species = data_line.split(',')
    genus = genus_and_species[0].split(':')[1] 
    species = genus_and_species[1].split(':')[1] 
    print("\tGenus: ",genus,"\tSpecies: ",species)

If you want to look at all of the fruits (in the same original order as before), you can do this:

for fruit in fruits:
    data_list = data_dict[fruit]
    for data_line in data_list:
        print(data_line)

If you don't care about the order (dicts have no order*), you can forget about your fruits list and loop over the data dictionary itself:

for fruit in data_dict:
    print(fruit)

To get the values (the data lists), use values (viewvalues in Python 2.7):

for data_list in data_dict.values():
    print(data_list)

To get both the keys (the fruits) and the values, use items (viewitems in Python 2.7):

for fruit,data_list in data_dict.items():
    print(fruit, data_list)

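Tip: if you want to add or remove keys while looping, don't use for fruit in data_dict: directly. In Python 3, keys, values, and items return live views of the dictionary (like viewkeys, viewvalues, and viewitems in Python 2.7), so changing the dictionary's size while iterating over any of them raises a RuntimeError. Loop over a snapshot of the keys instead:

for fruit in list(data_dict.keys()):
    # remove it
    data_dict.pop(fruit)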
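
A small sketch, assuming Python 3, of what happens when keys are removed during iteration and of looping over a snapshot of the keys instead (the sample dictionary and the error_message variable are made up for illustration):

```python
data_dict = {"apple": [], "banana": [], "orange": []}

# Removing keys while iterating over the dict itself fails in Python 3
error_message = None
try:
    for fruit in data_dict:
        data_dict.pop(fruit)
except RuntimeError as err:
    error_message = str(err)

print(error_message)

# list(data_dict) copies the remaining keys first, so popping is safe
for fruit in list(data_dict):
    data_dict.pop(fruit)

print(data_dict)
```

The snapshot costs one extra list of keys, which is usually negligible for a header-sized dictionary.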

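*Quick note: dict has undergone some changes that will most likely let you assume dicts remember their insertion order as of the upcoming next version of Python (3.7).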
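
As one possible sketch of the "more advanced concepts" mentioned at the top of this answer, the columns can be collected with zip instead of nested loops. The sample lines are hypothetical, and itertools.zip_longest is used here on the assumption that some rows may have fewer tab-separated fields than the header, as in the question's sample:

```python
from itertools import zip_longest

# Hypothetical tab-separated data; the last row has only one field
lines = [
    "apple\tbanana\torange",
    "genus:x,species:b\tgenus:x,species:b\tgenus:x,species:b",
    "flavors:tangy,tart,sweet",
]

rows = [line.split("\t") for line in lines]
header, *body = rows

# Transpose rows into columns, padding short rows with empty strings
columns = zip_longest(*body, fillvalue="")

# Pair each header name with its column of values
data_dict = {fruit: list(column) for fruit, column in zip(header, columns)}

print(data_dict["apple"])
print(data_dict["orange"])
```

The padding value is a design choice; None may be preferable if an empty string could be real data.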

Answer 2

Edit: since the question has been edited, I will provide a more useful answer later if I have time.

I don't entirely understand what you are actually trying to do, but here are a few things that might help.

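The thing to recognize is that you already have an object containing all of the information you need: the list containing all of the names. By its very nature, your list of names already has the indexes in it. The data exists; it's right there. What you need to do is learn how to access this information the right way.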

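What you probably need is the enumerate function. This function yields two-tuples (pairs of objects), each containing a list index and the list item at that index: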

for idx,fruit in enumerate(fruits): 
    print(fruit+'_idx: ', idx)

There's no reason to store these indexes in some other data structure; they are already in your list.

If you insist on accessing some arbitrary value by a name (a string), you should use a dictionary, or dict:

fruit_dict = dict()
fruit_dict['apple'] = 1

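However, since you are after the index values, this seems a bit odd, because a dict is inherently unordered. As I said, you already know the indexes from your list. There may be cases where you would want to do this, but storing the indexes a second time under names most likely doesn't make sense.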
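
If a name-to-index lookup is still wanted, one possible sketch builds it in a single pass with enumerate. The header line, the sample row, and the names fruit_idx and banana_value are assumptions made for illustration:

```python
# Hypothetical header line and data row, both tab-separated
header_line = "apple\tbanana\torange"
row_line = "genus:x,species:a\tgenus:y,species:b\tgenus:z,species:c"

fruits = header_line.split("\t")

# One dict replaces apple_idx, banana_idx, orange_idx, ...
fruit_idx = {fruit: idx for idx, fruit in enumerate(fruits)}

# Look up a column by fruit name instead of by a hand-written index
row = row_line.split("\t")
banana_value = row[fruit_idx["banana"]]

print(fruit_idx)
print(banana_value)
```

This keeps every index reachable by name without creating any variables dynamically.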

Answer 3

The built-in functions exec and eval are relevant here.

From the Python documentation:

eval: "The expression argument is parsed and evaluated as a Python expression"
exec: "This function supports dynamic execution of Python code"

Really, you only need exec to solve your problem, like so:

for fruit in fruits: exec('{0}_idx = fruits.index("{0}")'.format(fruit))

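(Note that we need quotes around the second {0}; otherwise Python would think you were trying to get the index of some variable named apple, rather than passing it the string 'apple'.)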

If you now type apple_idx (for example) into the console, it should return the index of 'apple' in fruits (0 for the list above).
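
A hedged variation on the exec approach: passing an explicit dictionary as the namespace keeps the generated names in one place instead of scattering them through the module, which also matters because exec called inside a function may not create usable local variables in Python 3. The namespace variable and the invalid sample name are assumptions made for illustration:

```python
fruits = ["apple", "banana", "orange"]

# The generated *_idx names are stored in this dict, not as real variables
namespace = {"fruits": fruits}
for fruit in fruits:
    exec('{0}_idx = fruits.index("{0}")'.format(fruit), namespace)

apple_idx = namespace["apple_idx"]
banana_idx = namespace["banana_idx"]
print(apple_idx, banana_idx)

# A header that is not a valid identifier makes the generated code fail
invalid_name_failed = False
try:
    exec("2BetaTheta_idx = 0", {})
except SyntaxError:
    invalid_name_failed = True
print(invalid_name_failed)
```

Since the names end up in a dictionary anyway, this is close to the plain dict approach from the other answers, only with the extra risk of executing generated code.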

How can I use a for loop to automatically generate variables from the values of list elements?
