Question

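You can find the product IDs and locations in an Excel spreadsheet, as shown below.

I want to list all the locations for each product ID, in order, but without duplicates.

For example:

53424 has Phoenix, Matsuyama, Phoenix, Matsuyama, Phoenix, Matsuyama, Phoenix.

56224 has Samarinda, Boise, Seoul, etc.

What is the best way to do this in Python?

So far I can only read the cells in the spreadsheet, and I don't know a good way to continue from there.

Thanks.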

import xlrd

the_file = xlrd.open_workbook("C:\\excel file.xlsx")
the_sheet = the_file.sheet_by_name("Sheet1")

for row_index in range(0, the_sheet.nrows):
    product_id = the_sheet.cell(row_index, 0).value
    location = the_sheet.cell(row_index, 1).value

Answer 1

You can use Python's itertools.groupby() function to remove the consecutive duplicates, as follows:

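from collections import defaultdict
from itertools import groupby
import xlrd

# Note: xlrd 2.0 and later read only .xls files; an .xlsx file needs
# an older xlrd release or another library such as openpyxl.
the_file = xlrd.open_workbook(r"excel file.xlsx")
the_sheet = the_file.sheet_by_name("Sheet1")

# Map each product ID to the list of its locations, in row order.
products = defaultdict(list)

# Start at row 1, which skips row 0 (assumed to be a header row).
for row_index in range(1, the_sheet.nrows):
    product_id = int(the_sheet.cell(row_index, 0).value)
    products[product_id].append(the_sheet.cell(row_index, 1).value)

# groupby() yields one key per run of consecutive identical locations.
for product, v in sorted(products.items()):
    print("{} has {}.".format(product, ', '.join(k for k, g in groupby(v))))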

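This uses a defaultdict(list) to build up your products. Each key in the dictionary is a product ID, and its value is automatically a list of the matching entries. groupby() is then used to read the values back in their original order, giving only one entry for each run of consecutive identical values. Finally, the resulting list is joined together with commas.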
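To see the two steps in isolation, here is a minimal sketch that runs without a spreadsheet. The `rows` list and the `collapse_runs` helper are hypothetical stand-ins made up for illustration; only `defaultdict` and `groupby` come from the answer above.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical (product_id, location) rows standing in for the spreadsheet.
rows = [
    (53424, "Phoenix"),
    (53424, "Phoenix"),
    (53424, "Matsuyama"),
    (56224, "Samarinda"),
    (53424, "Phoenix"),
    (56224, "Samarinda"),
    (56224, "Boise"),
]

# Step 1: collect every location per product, preserving row order.
products = defaultdict(list)
for product_id, location in rows:
    products[product_id].append(location)


def collapse_runs(values):
    """Keep one entry per run of consecutive identical values."""
    return [key for key, _group in groupby(values)]


# Step 2: collapse consecutive repeats for each product.
summary = {pid: collapse_runs(locs) for pid, locs in products.items()}

for pid in sorted(summary):
    print("{} has {}.".format(pid, ", ".join(summary[pid])))
```

Note that groupby() only merges neighbours, so a location that comes back later (Phoenix after Matsuyama) is listed again, which matches the output asked for in the question.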

Answer 2

You should use a dictionary to store the data from Excel, and then iterate over it by product ID.

So the following code should help you:

import xlrd

the_file = xlrd.open_workbook("C:\\excel file.xlsx")
the_sheet = the_file.sheet_by_name("Sheet1")

# Map each product ID to the list of its locations, in row order.
dataset = dict()

for row_index in range(0, the_sheet.nrows):
    product_id = the_sheet.cell(row_index, 0).value
    location = the_sheet.cell(row_index, 1).value
    if product_id in dataset:
        dataset[product_id].append(location)
    else:
        dataset[product_id] = [location]

# Print the products in sorted order of product ID.
for product_id in sorted(dataset.keys()):
    print("{0} has {1}.".format(product_id, ", ".join(dataset[product_id])))

The above keeps the locations in their original order for each product_id, and prints the product IDs in sorted order.
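As written, this approach stores every location, including consecutive repeats. One possible way to skip them while building the dictionary is sketched below; the `rows` list and the `build_dataset` function are hypothetical names used for illustration, with in-memory tuples standing in for the spreadsheet cells.

```python
# Hypothetical (product_id, location) rows standing in for the spreadsheet.
rows = [
    (56224, "Samarinda"),
    (53424, "Phoenix"),
    (53424, "Phoenix"),
    (53424, "Matsuyama"),
    (56224, "Samarinda"),
    (53424, "Phoenix"),
    (56224, "Boise"),
]


def build_dataset(rows):
    """Group locations by product ID, skipping consecutive repeats."""
    dataset = {}
    for product_id, location in rows:
        locations = dataset.setdefault(product_id, [])
        # Append only when this location differs from the previous one
        # recorded for the same product.
        if not locations or locations[-1] != location:
            locations.append(location)
    return dataset


dataset = build_dataset(rows)
for product_id in sorted(dataset):
    print("{0} has {1}.".format(product_id, ", ".join(dataset[product_id])))
```

Comparing against only the last stored location keeps a single pass over the rows and still lets a location reappear later in the list.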

Python: read an Excel spreadsheet and create multiple lists based on variables and conditions