1) For example, I have three columns, as shown below:
date     categories  contents
2018-01  fish_tank1  Goldfish Gombessa Goosefish Gopher rockfish
2018-01  fish_tank2  Grass carp Goosefish Grayling mullet shark
2018-02  fish_tank2  Goosefish Gopher rockfish Grayling mullet shark
2018-01  fish_tank1  carp Goosefish Grayling Goldfish Gombessa
2018-02  fish_tank2  carp Goosefish Grayling Grass carp Goosefish
2018-03  fish_tank3  Grass carp Goosefish Grayling mullet shark
2018-03  fish_tank2  Goosefish Gopher rockfish Goosefish Grayling
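2) I was thinking of something like df.groupby(['date','categories']).agg(df.contents.str.split(expand=True).stack().value_counts())
to get a result like the one below, but I have not been able to figure it out over the last few days.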
date categories contents
2018-01 fish_tank1 2 Goldfish 2
Gombessa 2
Goosefish 2
Gopher 1
rockfish 1
......
fish_tank2 Grass 1
carp 1
.....
2018-02 fish_tank2 Goosefish 3
Grayling 2
Gopher 1
........
........................
3) Can anyone give me some insight into how to get the result I want?
Answer 0 (score: 0)
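Use:
from collections import Counter
# Split each contents string into a list of words
df['contents2'] = df['contents'].str.split()
# x.sum() concatenates the word lists in each group; Counter tallies the words
df.groupby(['date', 'categories'])['contents2'].apply(lambda x: Counter(x.sum()))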
Output:
date     categories
2018-01  fish_tank1  Goldfish     2.0
                     Gombessa     2.0
                     Goosefish    2.0
                     Gopher       1.0
                     Grayling     1.0
                     carp         1.0
                     rockfish     1.0
         fish_tank2  Goosefish    1.0
                     Grass        1.0
                     Grayling     1.0
                     carp         1.0
                     mullet       1.0
                     shark        1.0
2018-02  fish_tank2  Goosefish    3.0
                     Gopher       1.0
                     Grass        1.0
                     Grayling     2.0
                     carp         2.0
                     mullet       1.0
                     rockfish     1.0
                     shark        1.0
2018-03  fish_tank2  Goosefish    2.0
                     Gopher       1.0
                     Grayling     1.0
                     rockfish     1.0
         fish_tank3  Goosefish    1.0
                     Grass        1.0
                     Grayling     1.0
                     carp         1.0
                     mullet       1.0
                     shark        1.0
Name: contents2, dtype: float64
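A possible alternative, offered only as a sketch and not as part of the original answer: the same per-group word counts can be computed without Counter by giving every word its own row and then counting rows. It assumes a pandas version that provides DataFrame.explode (0.25 or later). The column name word and the variable names words and counts are made up for this illustration. The sample frame is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2018-01", "2018-01", "2018-02", "2018-01",
             "2018-02", "2018-03", "2018-03"],
    "categories": ["fish_tank1", "fish_tank2", "fish_tank2", "fish_tank1",
                   "fish_tank2", "fish_tank3", "fish_tank2"],
    "contents": [
        "Goldfish Gombessa Goosefish Gopher rockfish",
        "Grass carp Goosefish Grayling mullet shark",
        "Goosefish Gopher rockfish Grayling mullet shark",
        "carp Goosefish Grayling Goldfish Gombessa",
        "carp Goosefish Grayling Grass carp Goosefish",
        "Grass carp Goosefish Grayling mullet shark",
        "Goosefish Gopher rockfish Goosefish Grayling",
    ],
})

# Split each contents string into a list of words, then expand the lists
# so that every word sits on its own row ("word" is a hypothetical name).
words = df.assign(word=df["contents"].str.split()).explode("word")

# Count the rows for each (date, categories, word) combination.
counts = words.groupby(["date", "categories", "word"]).size()
print(counts)
```

Because the counts come from size() rather than from a frame padded with missing values, the result should be an integer Series, so the counts print as 2 rather than 2.0. As in the answer above, multi-word names such as "Grass carp" are still split on whitespace and counted as separate words.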