I have a table like this:
Car Type | Color | ID
VW | Blue | 123
VW | Red | 567
VW | Black | 779
-----------------------
AUDI | Silver | 112
AUDI | Black | 356
AUDI | White | 224
How can I get something like this, where each row contains the count of colors for its car type?
Car Type | Color | ID | Total
VW | Blue | 123 | 3
VW | Red | 567 | 3
VW | Black | 779 | 3
-----------------------
AUDI | Silver | 112 | 3
AUDI | Black | 356 | 3
AUDI | White | 224 | 3
Cheers...
Answer 0 (score: 2)
For the number of unique values per group, use GroupBy.transform with DataFrameGroupBy.nunique:
df['Total'] = df.groupby('Car Type')['Color'].transform('nunique')
To count the rows in each group, use DataFrameGroupBy.size:
df['Total'] = df.groupby('Car Type')['Color'].transform('size')
The difference shows up once one value is changed:
df['Total_uniq'] = df.groupby('Car Type')['Color'].transform('nunique')
df['Total_size'] = df.groupby('Car Type')['Color'].transform('size')
print (df)
  Car Type   Color   ID  Total_uniq  Total_size
0       VW    Blue  123           2           3
1       VW    Blue  567           2           3  <- set value to Blue
2       VW   Black  779           2           3
3     AUDI  Silver  112           3           3
4     AUDI   Black  356           3           3
5     AUDI   White  224           3           3
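The comparison above can be reproduced with a small self-contained sketch. The frame below is rebuilt by hand from the printed output, with the second VW row set to Blue, so treat it as an assumed reconstruction of the data rather than the asker's real table.

```python
import pandas as pd

# Sample data rebuilt from the printed output above (second VW row set to Blue).
df = pd.DataFrame({
    "Car Type": ["VW", "VW", "VW", "AUDI", "AUDI", "AUDI"],
    "Color": ["Blue", "Blue", "Black", "Silver", "Black", "White"],
    "ID": [123, 567, 779, 112, 356, 224],
})

# Group once and reuse the grouped column for both transforms.
grouped = df.groupby("Car Type")["Color"]

# transform broadcasts the per-group result back onto every row of that group.
df["Total_uniq"] = grouped.transform("nunique")  # distinct colors per car type
df["Total_size"] = grouped.transform("size")     # number of rows per car type

print(df)
```

Note that nunique skips missing values by default, while size counts every row in the group, so the two can also differ when Color contains NaN.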
Answer 1 (score: 0)
Here is another option, similar to jezrael's, who beat me to it!
import pandas as pd
a = {'Car type':['VW','VW','VW','AUDI','AUDI','AUDI'],'Color':['Blue','Red','Black','Silver','Black','White'],'ID':[123,567,779,112,356,224]}
df = pd.DataFrame(a)
df_a = df.merge(df.groupby(['Car type'],as_index=False).agg({'Color':'nunique'}),how='left',on='Car type').rename(columns={'Color_x':'Color','Color_y':'Unique_colors'})
print(df_a)
Output:
  Car type   Color   ID  Unique_colors
0       VW    Blue  123              3
1       VW     Red  567              3
2       VW   Black  779              3
3     AUDI  Silver  112              3
4     AUDI   Black  356              3
5     AUDI   White  224              3