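I'm using Python's openpyxl package to read an Excel file and store each cell's value together with its parent values in a dictionary. Non-bold cells are treated as "tasks" and bold cells as "summaries".
Here is an example of the Excel file I'm trying to read:
For each task, I want to store the task name and its summaries (as a list) in a dictionary. For example, in the sample Excel file, Task 4 would be stored under the name "Task 4" with the summaries ['First Summary', 'Nested Summary 2']. I work out the nested parent summaries from the number of leading spaces.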
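The num_left_spaces helper called in the code below is not shown in this post. A minimal sketch, assuming it simply counts the space characters at the start of the cell text:

```python
def num_left_spaces(text):
    """Return the number of leading space characters in text.

    Assumed implementation: the original helper is not shown in the post.
    """
    return len(text) - len(text.lstrip(' '))


print(num_left_spaces('    Nested Summary 1'))  # 4
print(num_left_spaces('First Summary'))         # 0
```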
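My problem is that the summaries list is computed correctly inside the while loop, but when I print all the task names and summaries from the dictionary afterwards, the summaries are wrong.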
from openpyxl import load_workbook

wb = load_workbook(filename='example.xlsx')
sheet = wb['Sheet1']
tasks = {}
task_summaries = []
curr_left_spaces = -1
i = 2
current_cell = sheet[f'A{i}']
while current_cell.value:
    if current_cell.font.bold:
        # calculate number of leading spaces to determine nesting level
        # (num_left_spaces is a helper defined elsewhere)
        left_spaces = num_left_spaces(current_cell.value)
        curr_summary = current_cell.value.strip()
        if left_spaces > curr_left_spaces:
            task_summaries.append(curr_summary)
            curr_left_spaces = left_spaces
        elif left_spaces < curr_left_spaces:
            task_summaries = [curr_summary]
            curr_left_spaces = left_spaces
        else:
            assert (left_spaces == curr_left_spaces)
            task_summaries.pop()
            task_summaries.append(curr_summary)
    else:
        task_name = current_cell.value.strip()
        # prints correct task_summaries list here
        print(task_name, task_summaries)
        tasks[task_name] = task_summaries
    i += 1
    current_cell = sheet[f'A{i}']

for name, summary in tasks.items():
    print(name, summary)  # summary is incorrect here
Expected result:
Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Actual result:
Task 1 ['First Summary']
Task 2 ['First Summary', 'Nested Summary 1']
Task 3 ['First Summary', 'Nested Summary 1']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Task 1 ['First Summary', 'Nested Summary 2']
Task 2 ['First Summary', 'Nested Summary 2']
Task 3 ['First Summary', 'Nested Summary 2']
Task 4 ['First Summary', 'Nested Summary 2']
Task 5 ['Second Summary']
Task 6 ['Second Summary']
Answer 0 (score: 2)
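Your problem is that you use the same task_summaries list for all entries: each new task you add to the dictionary gets a value that references that one list, not a copy of it. So Task 1 through Task 4 all end up showing the list's final contents, ['First Summary', 'Nested Summary 2']. Only when task_summaries = [curr_summary] runs for the summary before Task 5 is task_summaries bound to a new object, and the last two tasks then share that new list.
What you need to do is give each entry its own new list, so change this line: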
tasks[task_name] = task_summaries
to:
tasks[task_name] = list(task_summaries)
A simple example to demonstrate:
>>> l = [1, 2]
>>> d = {}
>>> d['a'] = l # 'a' gets a reference to l
>>> l[0] = 3 # so that changes 'a's value too
>>> print(l)
[3, 2]
>>> print(d)
{'a': [3, 2]}
>>> d['a'] = list(l) # now 'a' gets a new copy of l
>>> l[0] = 4 # so this change doesn't affect 'a'
>>> print(l)
[4, 2]
>>> print(d)
{'a': [3, 2]}
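Applied to the loop from the question, the same idea can be sketched without an Excel file. The rows list of (text, is_bold) pairs and the collect_tasks function are hypothetical stand-ins for the worksheet cells, and the four-space indent on nested summaries is an assumption; only the list(...) copy on assignment is the actual fix.

```python
def collect_tasks(rows, copy_list=True):
    """Map each task name to the list of its parent summaries.

    rows is a hypothetical stand-in for column A: (text, is_bold) pairs.
    """
    tasks = {}
    task_summaries = []
    curr_left_spaces = -1
    for text, is_bold in rows:
        if is_bold:
            # Nesting level = number of leading spaces (assumed convention)
            left_spaces = len(text) - len(text.lstrip(' '))
            curr_summary = text.strip()
            if left_spaces > curr_left_spaces:
                task_summaries.append(curr_summary)
            elif left_spaces < curr_left_spaces:
                task_summaries = [curr_summary]
            else:
                task_summaries.pop()
                task_summaries.append(curr_summary)
            curr_left_spaces = left_spaces
        else:
            # Store a snapshot so later append/pop calls cannot change it
            tasks[text.strip()] = list(task_summaries) if copy_list else task_summaries
    return tasks


rows = [
    ('First Summary', True),
    ('Task 1', False),
    ('    Nested Summary 1', True),
    ('Task 2', False),
    ('Task 3', False),
    ('    Nested Summary 2', True),
    ('Task 4', False),
    ('Second Summary', True),
    ('Task 5', False),
    ('Task 6', False),
]

fixed = collect_tasks(rows)
buggy = collect_tasks(rows, copy_list=False)
print(fixed['Task 1'])                     # ['First Summary']
print(buggy['Task 1'])                     # ['First Summary', 'Nested Summary 2']
print(buggy['Task 1'] is buggy['Task 4'])  # True: one shared list object
```

list(task_summaries) makes a shallow copy, which is enough here because the elements are immutable strings.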