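When using group by on a DataFrame, can I collect the values of a specific column into a list?
I'm not sure this detail is relevant here, but PostgreSQL has a function, array_agg(columnname), that serves the same purpose.
I also looked through the API documentation for details, but had no luck.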
train
Out[6]:
    TripType  VisitNumber  Weekday  ScanCount  DepartmentDescription
1         30            7   Friday          1                  SHOES
2         30            7   Friday          1          PERSONAL CARE
3         26            8   Friday          2  PAINT AND ACCESSORIES
4         26            8   Friday          2  PAINT AND ACCESSORIES
5         26            8   Friday          2  PAINT AND ACCESSORIES
6         26            8   Friday          1  PAINT AND ACCESSORIES
7         26            8   Friday          1  PAINT AND ACCESSORIES
8         26            8   Friday          1  PAINT AND ACCESSORIES
9         26            8   Friday         -1  PAINT AND ACCESSORIES
10        26            8   Friday          1            DSD GROCERY
11        26            8   Friday          2  PAINT AND ACCESSORIES
12        26            8   Friday          1  MEAT - FRESH & FROZEN
13        26            8   Friday          1  PAINT AND ACCESSORIES
14        26            8   Friday         -1  PAINT AND ACCESSORIES
15        26            8   Friday          2  PAINT AND ACCESSORIES
16        26            8   Friday          1  PAINT AND ACCESSORIES
17        26            8   Friday          1  PAINT AND ACCESSORIES
18        26            8   Friday          2                  DAIRY
19        26            8   Friday          1      PETS AND SUPPLIES
train.groupby(['VisitNumber','Weekday','TripType']).count()
Out[7]:
                              ScanCount  DepartmentDescription
VisitNumber Weekday TripType
7           Friday  30                2                      2
8           Friday  26               17                     17
What I mean is that the result for the first grouped row should look like this:
                              ScanCount  DepartmentDescription
VisitNumber Weekday TripType
7           Friday  30                2  [SHOES,PERSONAL CARE]
Dataset:
{'DepartmentDescription': {1: 'SHOES',
2: 'PERSONAL CARE',
3: 'PAINT AND ACCESSORIES',
4: 'PAINT AND ACCESSORIES',
5: 'PAINT AND ACCESSORIES'},
'ScanCount': {1: 1, 2: 1, 3: 2, 4: 2, 5: 2},
'TripType': {1: 30, 2: 30, 3: 26, 4: 26, 5: 26},
'VisitNumber': {1: 7, 2: 7, 3: 8, 4: 8, 5: 8},
'Weekday': {1: 'Friday', 2: 'Friday', 3: 'Friday', 4: 'Friday', 5: 'Friday'}}
Answer 0 (score: 1)
IIUC, you want the following (df here is your train DataFrame):
In [248]:
df.groupby(['VisitNumber','Weekday','TripType'])['DepartmentDescription'].apply(list)
Out[248]:
VisitNumber  Weekday  TripType
7            Friday   30                                     [SHOES, PERSONAL CARE]
8            Friday   26          [PAINT AND ACCESSORIES, PAINT AND ACCESSORIES,...
Name: DepartmentDescription, dtype: object
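If you want the count and the list side by side, as in the desired output in the question, one possible approach is named aggregation. This is only a sketch: it assumes pandas 0.25 or later, rebuilds the five-row sample from the question, and the names `keys` and `result` are made up for illustration.

```python
import pandas as pd

# Five-row sample copied from the question.
data = {
    'DepartmentDescription': {1: 'SHOES',
                              2: 'PERSONAL CARE',
                              3: 'PAINT AND ACCESSORIES',
                              4: 'PAINT AND ACCESSORIES',
                              5: 'PAINT AND ACCESSORIES'},
    'ScanCount': {1: 1, 2: 1, 3: 2, 4: 2, 5: 2},
    'TripType': {1: 30, 2: 30, 3: 26, 4: 26, 5: 26},
    'VisitNumber': {1: 7, 2: 7, 3: 8, 4: 8, 5: 8},
    'Weekday': {1: 'Friday', 2: 'Friday', 3: 'Friday', 4: 'Friday', 5: 'Friday'},
}
df = pd.DataFrame(data)

# Hypothetical helper name for the grouping columns.
keys = ['VisitNumber', 'Weekday', 'TripType']

# Named aggregation: count the ScanCount rows per group and
# collect the DepartmentDescription values of each group into a list.
result = df.groupby(keys).agg(
    ScanCount=('ScanCount', 'count'),
    DepartmentDescription=('DepartmentDescription', list),
)

print(result)
```

Passing `list` as the aggregation function plays roughly the role of array_agg here: every group ends up with one Python list in an object-dtype column. Wrap it in `set` or use `unique` instead if you do not want duplicates.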