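Values of a dict change when I pass it to a function, even though they shouldn't

Time: 2014-03-16 17:01:30

Tags: python dictionary python-2.x

I've recently started learning Python. I have a program that takes a list of dicts (each dict in the list has keys that are repeated in the other dicts).

I then pass the list to a function whose job is to aggregate the data in the values and return a single dict. However, when I access the original dicts again after the function call, they have been updated with values from the aggregated dict. I can't see any part of my function that would do this, and I've been stuck on it for hours.

Here is my code: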

#!/usr/bin/env python
import ast

def process_visitor_stats_list(original_list): 
    temp_original_list = original_list[:]  # attempt to copy original list so it doesnt get changed

    new_dict = {}   # this will store each unique key in dict along with the sum of its values
    for line in temp_original_list:  
        for key in line:
            if(key not in new_dict):  # checks if key is in new_dict, adds it if not and adds a value which tracks how often the key occurs
                new_dict[key] = line[key]
                new_dict[key].append(1)    # it also adds another number to the value, which stores the amount of times it was in the original list of dicts

            else:
                new_dict[key][0] += float(line[key][0])  # if key is already in dict, it sums its values 
                new_dict[key][1] += float(line[key][1])   
                new_dict[key][2] += 1 

    return new_dict


if __name__ == "__main__":

    original_list_of_dicts = []  # this will store my list of dicts

    line1 = "{'entry1': [4.0, 2.0], 'entry2': [592.0, 40.0], 'entry3': [5247044.0, 1093776.0], 'entry4': [1235.0, 82.0]}"
    line2 = "{'entry1': [26260.0, 8262.0], 'entry2': [2.0, 0.0], 'entry3': [1207.0, 142.0], 'entry4': [382992.0, 67362.0]}"
    line3 = "{'entry1': [57486.0, 16199.0], 'entry2': [6.0, 3.0], 'entry3': [280.0, 16.0]}"

    original_list_of_dicts.append(ast.literal_eval(line1))  # adds each line to the list and casts them as dicts
    original_list_of_dicts.append(ast.literal_eval(line2))
    original_list_of_dicts.append(ast.literal_eval(line3))

    print "original list of dicts before method call"
    for line in original_list_of_dicts:    # prints out each entry in the list of dicts
        for key in line:
            print key + str(line[key])

    print '\n'
    new_dict = process_visitor_stats_list(original_list_of_dicts)    # calls the method to process the original list of dicts
    print '\n'                                                      # this should return a single dict list with aggregate data


    print "original list of dicts after method call"
    for line in original_list_of_dicts:   # however when i go to access the original dict, its values have been changed
        for key in line:
            print key + str(line[key])

1 answer:

Answer 0 (score: 2)

When you copy the list:

temp_original_list = original_list[:]

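you only perform a shallow copy, i.e. the new list contains references to the same objects as the original list. Since the objects in the list are mutable dictionaries, you need to perform a deep copy of the list: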

import copy

temp_original_list = copy.deepcopy(original_list)

This recursively copies the objects in the container, creating new versions of them.

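From the documentation:

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.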
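
The difference is easy to check with `is`. The following is a small sketch, not code from the question: the variable names and the shortened sample data are made up for illustration, and it is written to run on both Python 2 and Python 3.

```python
import copy

# Hypothetical sample data: a list of dicts whose values are lists.
original = [{'entry1': [4.0, 2.0]}, {'entry1': [26260.0, 8262.0]}]

shallow = original[:]            # new outer list, same dict objects inside
deep = copy.deepcopy(original)   # new outer list, new dicts, new inner lists

# The outer lists are distinct objects in both cases.
outer_is_new = (shallow is not original, deep is not original)

# A slice still shares the dicts (and their lists) with the original.
shares_dict = shallow[0] is original[0]
shares_list = shallow[0]['entry1'] is original[0]['entry1']

# A deep copy shares neither.
deep_shares_dict = deep[0] is original[0]
deep_shares_list = deep[0]['entry1'] is original[0]['entry1']

# Mutating through the shallow copy is visible in the original ...
shallow[0]['entry1'].append(1)
# ... while mutating the deep copy is not.
deep[1]['entry1'].append(1)

print(original)
```

Only the first inner list of `original` gains the extra `1`, because it was reached through the slice; the list appended to through `deep` belongs to the copy alone.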

Strictly speaking, your problem is not with the dictionaries themselves but with the lists they in turn hold (e.g. original_list[0]['entry1']). In this line:

new_dict[key] = line[key]

you make new_dict refer to the very same list object that original_list holds. So when you mutate it, e.g.:

new_dict[key].append(1)

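the change also shows up in the original dictionary. You can therefore also fix this by making the inner list a copy (only a shallow copy is needed here, since the list contains immutable values rather than mutable containers):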

new_dict[key] = line[key][:]
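
Applied to the function from the question, that one-line change might look like the sketch below. It is an illustration rather than the asker's final code: the body is the question's loop with the inner list copied, the outer-list slice is dropped because it is no longer needed, and the input is a shortened sample using values from the question. It runs on both Python 2 and Python 3.

```python
def process_visitor_stats_list(original_list):
    """Sum the two values per key and count how often each key occurs."""
    new_dict = {}
    for line in original_list:
        for key in line:
            if key not in new_dict:
                # Copy the inner list so the caller's list is never mutated.
                new_dict[key] = line[key][:]
                new_dict[key].append(1)   # third slot counts occurrences
            else:
                new_dict[key][0] += float(line[key][0])
                new_dict[key][1] += float(line[key][1])
                new_dict[key][2] += 1
    return new_dict


# Shortened sample input (illustrative, values taken from the question).
data = [
    {'entry1': [4.0, 2.0], 'entry2': [592.0, 40.0]},
    {'entry1': [26260.0, 8262.0], 'entry2': [2.0, 0.0]},
]

totals = process_visitor_stats_list(data)
```

After the call, `data` still holds two-element lists, while `totals` holds the sums plus the occurrence count. Compared with `copy.deepcopy` on the whole input, this copies only the lists that actually get mutated.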