Question

I am new to Python and need some help. I have three lists whose values correspond to one another, like the Excel sheet below:

ID           Name         Height 
1             u              5
2             s              7
3             d              9
4             u              7
5             k              7
6             z              5

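And so on.

From this table, I want to group together the IDs that share the same height. The names are not that important. How can I do this in Python?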

Answer 1

To get better answers, look into the formatting tools so that your post is easier to read:

ID Name Height 
1  u    5 
2  s    7
3  d    9
...

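The simplest way to do what you want is to use "groupby" from the "itertools" module: https://docs.python.org/3/library/itertools.html

First, assuming all three lists have the same length, combine them into a single list of tuples.

newList = list(zip(list1, list2, list3))

newList will look like this:

[(1,'u',5),(2,'s',7),(3,'d',9),(4,'u',7),...]

Now you can use groupby to group everything by height.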

from itertools import groupby

data = sorted(newList, key=lambda x: x[2])  # groupby only groups adjacent items, so sort by height first
for k, g in groupby(data, key=lambda x: x[2]):
    # k is the height value
    group = list(g)  # all tuples that share the height k
    ids = [x[0] for x in group]  # the IDs (first element of each tuple) with height k
    print(k, ids)
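Putting the steps above together, here is a minimal end-to-end sketch that uses the sample table from the question. The names ids, names, heights, rows and ids_by_height are assumptions chosen for illustration; they do not come from the original post.

```python
from itertools import groupby

# Sample data from the question (variable names are assumed for illustration)
ids = [1, 2, 3, 4, 5, 6]
names = ['u', 's', 'd', 'u', 'k', 'z']
heights = [5, 7, 9, 7, 7, 5]

rows = list(zip(ids, names, heights))

# groupby only merges adjacent items, so sort by height first
rows.sort(key=lambda row: row[2])

# Map each height to the list of IDs that have it
ids_by_height = {
    height: [row[0] for row in group]
    for height, group in groupby(rows, key=lambda row: row[2])
}

print(ids_by_height)  # {5: [1, 6], 7: [2, 4, 5], 9: [3]}
```

Collecting the groups into a dictionary keeps the result available after the loop, instead of only inside each iteration.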

Answer 2

You can try the pandas module and its groupby function. See the example below.

import pandas as pd

id1 = [1203,1204,1205,1206,1207,1208]    #list of id's
name = ['john','mike','henry','cart','rob','sam']    #list of names
height = [5,4,5,7,2,4]    #list of heights

df = pd.DataFrame({'id':id1, 'name':name, 'height':height})    #creating a dataframe from id, name and height lists

df2 = df.groupby('height').apply(lambda x: x['id'].unique())    #grouping the id's having same height

print(df2)

Output:

height
2          [1207]
4    [1204, 1208]
5    [1203, 1205]
7          [1206]
dtype: object

Also, if you do not care about the name column, you can do this with a defaultdict using only the id and height columns. See the example below.

from collections import defaultdict

id1 = [1203,1204,1205,1206,1207,1208]  #list of id's
height = [5,4,5,7,2,4]  #list of heights

data = dict(zip(id1,height))  #creating a normal dictionary with id's and height

result = defaultdict(list)  #heights not seen yet start out as an empty list

for key,value in data.items():
    result[value].append(key)  #add this id to the list for its height

print(result)
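If you would rather avoid the import, a plain dictionary with setdefault gives the same grouping. This is a small sketch using the sample table from the question; the names ids, heights and ids_by_height are assumptions made for illustration. Looping over zip directly, rather than first building a dictionary keyed by id, also keeps every row even if an id appears more than once.

```python
# Sample data from the question (variable names are assumed for illustration)
ids = [1, 2, 3, 4, 5, 6]
heights = [5, 7, 9, 7, 7, 5]

ids_by_height = {}
for id_, height in zip(ids, heights):
    # setdefault creates the empty list the first time a height is seen
    ids_by_height.setdefault(height, []).append(id_)

print(ids_by_height)  # {5: [1, 6], 7: [2, 4, 5], 9: [3]}
```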

Answer 3

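First of all, welcome to StackOverflow!

I believe what you are asking for is a way to extract the contents of an Excel spreadsheet and use them to find and group the ID numbers that share the same height.

To do that, you need a way to read an Excel spreadsheet in Python. There are two ways to do this:

Convert the spreadsheet to a CSV (comma-separated values) file, which Python can read easily with the csv module.
Use an external library that reads and writes Excel spreadsheets directly, such as xlrd and xlwt. As the names suggest, xlrd is a module for reading data from Excel spreadsheets, while xlwt lets you write to them.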
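As a rough illustration of the CSV route, the sketch below reads rows with the standard csv module and groups the IDs by height. It assumes the spreadsheet was exported with a header row of ID,Name,Height; the inline csv_text string is a stand-in for a real file so that the example is self-contained.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported file; with a real file, pass
# open('your_file.csv', newline='') to csv.DictReader instead.
csv_text = """ID,Name,Height
1,u,5
2,s,7
3,d,9
4,u,7
5,k,7
6,z,5
"""

ids_by_height = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    # csv yields strings, so convert the numeric columns explicitly
    ids_by_height[int(row['Height'])].append(int(row['ID']))

print(dict(ids_by_height))  # {5: [1, 6], 7: [2, 4, 5], 9: [3]}
```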

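Assuming you only need to read data from the spreadsheet, I will show the approach that uses the xlrd module.

First, install the xlrd module with the following command: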

pip install xlrd

Then, in your Python program, import the xlrd module and open the spreadsheet file like this:

workbook = xlrd.open_workbook('spreadsheet_file.xls')

If the file is large, pass the on_demand option to the same call, like this:

workbook = xlrd.open_workbook('spreadsheet_file.xls', on_demand = True)

Assuming your data is on the first sheet of the Excel workbook, open it as follows:

worksheet = workbook.sheet_by_index(0)

This opens the first sheet in the Excel workbook.

Now, to access the data in the sheet, do the following:

value = worksheet.cell(row_index, column_index).value

So, in theory, your solution would look roughly like this:

import xlrd

book = xlrd.open_workbook('your_spreadsheet_file.xls')
sheet = book.sheet_by_index(0)
list_values = []

# Loop only over rows that exist; indexing past sheet.nrows raises IndexError
for row_ind in range(sheet.nrows):
    if sheet.cell(row_ind, 0).value == xlrd.empty_cell.value:
        break  # stop at the first empty cell in column 0
    list_values.append((
      sheet.cell(row_ind, 0).value,
      sheet.cell(row_ind, 1).value,
      sheet.cell(row_ind, 2).value))

You will now have a list of tuples that you can use however you like. After that, if you do want to group them by height, refer to user1209675's answer.
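As a sketch of that last step, the snippet below groups such tuples by height. The sample list_values is made up for illustration and assumes no header row was read. xlrd returns numeric cells as floats, so the values are converted to int here, which in turn assumes the IDs and heights are whole numbers.

```python
# Hypothetical rows as xlrd would return them: numeric cells come back as floats
list_values = [(1.0, 'u', 5.0), (2.0, 's', 7.0), (3.0, 'd', 9.0),
               (4.0, 'u', 7.0), (5.0, 'k', 7.0), (6.0, 'z', 5.0)]

ids_by_height = {}
for id_, _name, height in list_values:
    # Convert the floats back to whole numbers before grouping
    ids_by_height.setdefault(int(height), []).append(int(id_))

print(ids_by_height)  # {5: [1, 6], 7: [2, 4, 5], 9: [3]}
```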

Merging lists with corresponding values

3 answers: