Python dictionary storing incorrect values?

Asked: 2019-07-09 22:26:27

Tags: python excel python-3.x dictionary openpyxl

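I am using Python's openpyxl package to read an Excel file and store each cell's value, together with its parent values, in a dictionary. Non-bold cells are treated as "tasks" and bold cells are treated as "summaries".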

Here is an example of the Excel file I am trying to read: example excel file

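For each task, I want to store the task name and its summaries (as a list) in a dictionary. For example, in the example Excel file, Task 4 would be stored under the name "Task 4" with the summaries ['First Summary', 'Nested Summary 2']. I work out the nested parent summaries from the number of leading spaces.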

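My problem is that the summaries list is computed correctly inside the while loop, but when I print all the task names and summaries from the dictionary afterwards, the summaries are wrong.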

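from openpyxl import load_workbook


def num_left_spaces(text):
    # Not shown in the original post; assumed here to count leading spaces.
    return len(text) - len(text.lstrip(' '))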

wb = load_workbook(filename='example.xlsx')
sheet = wb['Sheet1']

tasks = {}

task_summaries = []
curr_left_spaces = -1

i = 2
current_cell = sheet[f'A{i}']

while current_cell.value:
    if current_cell.font.bold:
        # calculate number of leading spaces to determine nesting level
        left_spaces = num_left_spaces(current_cell.value) 
        curr_summary = current_cell.value.strip()

        if left_spaces > curr_left_spaces:
            task_summaries.append(curr_summary)
            curr_left_spaces = left_spaces
        elif left_spaces < curr_left_spaces:
            task_summaries = [curr_summary]
            curr_left_spaces = left_spaces
        else:
            assert (left_spaces == curr_left_spaces)
            task_summaries.pop()
            task_summaries.append(curr_summary)

    else:
        task_name = current_cell.value.strip() 

        # prints correct task_summaries list here
        print(task_name, task_summaries) 

        tasks[task_name] = task_summaries

    i += 1
    current_cell = sheet[f'A{i}']


for name, summary in tasks.items():
    print(name, summary) # summary is incorrect here

Expected result:

Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']

Actual result:

Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary', 'Nested Summary 2']
Task 2 ['First Summary', 'Nested Summary 2']
Task 3 ['First Summary', 'Nested Summary 2']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']

1 answer:

Answer 0 (score: 2)

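Your problem is that you use the same task_summaries list for every entry: each new task you add to the dictionary gets a value that references that one list, not a copy of it.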

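So in the end, the values of the first four entries are all the same list, ['First Summary', 'Nested Summary 2']. Only when task_summaries = [curr_summary] runs, just before Task 5, is task_summaries bound to a new object, and the last two tasks then both reference that new list.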
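To make the sharing visible without an Excel file, here is a minimal sketch that replays the same sequence of list operations by hand. The literal values come from the question's sample output; the sequence of steps is a simplification, not the original loop.

```python
tasks = {}

task_summaries = ["First Summary"]
tasks["Task 1"] = task_summaries            # stores a reference, not a copy

task_summaries.append("Nested Summary 1")   # mutates the shared list in place
tasks["Task 2"] = task_summaries

task_summaries.pop()
task_summaries.append("Nested Summary 2")   # still the same list object
tasks["Task 4"] = task_summaries

task_summaries = ["Second Summary"]         # rebinding creates a new list object
tasks["Task 5"] = task_summaries

print(tasks["Task 1"] is tasks["Task 4"])   # True: one list, several keys
print(tasks["Task 1"] is tasks["Task 5"])   # False: Task 5 holds the new list
print(tasks["Task 1"])                      # ['First Summary', 'Nested Summary 2']
```

The `is` operator compares object identity, so it shows directly which dictionary values are the same list.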

What you need to do is give each entry its own new list, so change this line:

tasks[task_name] = task_summaries

to:

tasks[task_name] = list(task_summaries)
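As a side note, list(...) is one of several ways to make a shallow copy of a list. The sketch below, using made-up sample values, checks that three common spellings each produce an independent list. A shallow copy should be enough here because the elements are strings, which are immutable.

```python
task_summaries = ["First Summary", "Nested Summary 1"]

copies = [
    list(task_summaries),    # constructor copy, as used in the answer
    task_summaries.copy(),   # list method
    task_summaries[:],       # full slice
]

task_summaries.append("later change")  # mutate the original afterwards

for c in copies:
    assert c is not task_summaries                      # distinct objects
    assert c == ["First Summary", "Nested Summary 1"]   # unaffected by the append
```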

A simple example to demonstrate:

>>> l = [1, 2]
>>> d = {}
>>> d['a'] = l   #  'a' gets a reference to l
>>> l[0] = 3     # so that changes 'a's value too
>>> print(l)
[3, 2]
>>> print(d)
{'a': [3, 2]}

>>> d['a'] = list(l)  # now 'a' gets a new copy of l
>>> l[0] = 4          # so this should not affect 'a'
>>> print(l)
[4, 2]
>>> print(d)
{'a': [3, 2]}
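Applied to the question's loop, the fix might look like the sketch below. To keep it runnable without openpyxl, the worksheet is replaced by a hypothetical list of (cell text, is_bold) pairs. The indentation of those rows, the collect_tasks wrapper, and the body of num_left_spaces are all assumptions, since the original file and helper are not shown. The nesting logic is otherwise kept as in the question.

```python
def num_left_spaces(text):
    """Count leading spaces (assumed behaviour of the question's helper)."""
    return len(text) - len(text.lstrip(" "))


# Hypothetical stand-in for column A of the sheet: (cell text, is_bold).
rows = [
    ("First Summary", True),
    ("Task 1", False),
    ("  Nested Summary 1", True),
    ("Task 2", False),
    ("Task 3", False),
    ("  Nested Summary 2", True),
    ("Task 4", False),
    ("Second Summary", True),
    ("Task 5", False),
    ("Task 6", False),
]


def collect_tasks(rows):
    tasks = {}
    task_summaries = []
    curr_left_spaces = -1
    for text, bold in rows:
        if bold:
            left_spaces = num_left_spaces(text)
            curr_summary = text.strip()
            if left_spaces > curr_left_spaces:
                task_summaries.append(curr_summary)   # deeper level
            elif left_spaces < curr_left_spaces:
                task_summaries = [curr_summary]       # shallower level: start over
            else:
                task_summaries[-1] = curr_summary     # sibling: replace last entry
            curr_left_spaces = left_spaces
        else:
            # Store a snapshot so later mutations cannot change this entry.
            tasks[text.strip()] = list(task_summaries)
    return tasks


tasks = collect_tasks(rows)
for name, summary in tasks.items():
    print(name, summary)
```

With the copy in place, each dictionary value is its own list, so the final loop prints the same summaries that were current when each task was read.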