Question

I have an Excel sheet that looks like this:

Column1 Column2 Column3
0       23      1
1       5       2
1       2       3
1       19      5
2       56      1
2       22      2
3       2       4
3       14      5
4       59      1
5       44      1
5       1       2
5       87      3

I want to extract the data, group it by Column1, and put it in a dictionary so that it looks like this:

{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}

This is my code so far:

import pandas

excel = pandas.read_excel(r"e:\test_data.xlsx", sheetname='mySheet', parse_cols='A,C')
myTable = excel.groupby("Column1").groups
print myTable

However, my output looks like this:

{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}

Thanks!

Answer 1

You could group by Column1, take Column3, apply(list) to each group, and then call to_dict:

In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

Or, use a dict comprehension:

In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
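
For anyone who wants to try both variants without the Excel file, here is a minimal self-contained sketch. It assumes a DataFrame named df rebuilt by hand from the sample table in the question (only Column1 and Column3); the variable names by_apply and by_comprehension are just illustrative.

```python
import pandas as pd

# Rebuild the sample data from the question by hand (assumption: only the
# two columns the asker reads from the sheet are needed).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Variant 1: collect each group's Column3 values into a list, then
# convert the resulting Series (indexed by Column1) to a dict.
by_apply = df.groupby("Column1")["Column3"].apply(list).to_dict()

# Variant 2: iterating a grouped column yields (key, Series) pairs.
by_comprehension = {k: list(v) for k, v in df.groupby("Column1")["Column3"]}

print(by_apply)
print(by_comprehension)
```

Both variants should give the same mapping; the first stays inside pandas until the last step, while the second is plain Python once the groups are formed.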

Answer 2

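According to the docs, GroupBy.groups:

is a dict whose keys are the computed unique groups and whose corresponding values are the axis labels belonging to each group.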
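
To make that concrete, here is a small sketch that reproduces the numbers in the question's output. It assumes a hand-built DataFrame df matching the sample table with the default 0..11 row index; labels and values_for_1 are illustrative names, not part of pandas.

```python
import pandas as pd

# Hand-built copy of the question's data (assumption: default integer index).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

groups = df.groupby("Column1").groups

# Each value holds the row labels of the group, not the Column3 values.
labels = {k: list(v) for k, v in groups.items()}
print(labels)

# The labels can still be used to look the values up, e.g. for group 1.
values_for_1 = df.loc[groups[1], "Column3"].tolist()
print(values_for_1)
```

So the 0 through 11 in the question's output are row positions in the sheet, which is why they do not match Column3.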

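If you want the values themselves, you can groupby 'Column1', select 'Column3', then call apply and pass list so that it is applied to each group.

You can then convert the result to a dict as desired: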

In [5]:

dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}

(Note: see this SO question for why the numbers are followed by L; in Python 2 that suffix marks a long integer.)
