这个SQL查询的Pandas等价物是什么:
select column1,
sum(column2) as A,
count(distinct column3) as B,
sum(column2) / count(distinct column3) as C
from table1
group by column1
感谢您提供任何帮助!!
答案 0 :(得分:0)
我不确定sum(column2) / count(distinct column3) as C
部分可以在同一个步骤中完成,但您可以通过两个步骤轻松完成:
演示:
In [47]: df = pd.DataFrame(np.random.randint(0,5,size=(15, 3)), columns=['c1','c2','c3'])
In [48]: df
Out[48]:
c1 c2 c3
0 4 0 3
1 2 3 2
2 1 2 3
3 3 3 0
4 1 0 4
5 1 1 1
6 2 3 3
7 2 2 2
8 4 0 0
9 1 1 0
10 1 3 0
11 4 3 1
12 0 0 3
13 3 1 0
14 4 3 1
In [49]: x = df.groupby('c1').agg({'c2':'sum', 'c3': 'nunique'}).reset_index().rename(columns={'c2':'A', 'c3':'B'})
In [50]: x
Out[50]:
c1 A B
0 0 0 1
1 1 7 4
2 2 8 2
3 3 4 1
4 4 6 3
In [51]: x['C'] = x.A / x.B
In [52]: x
Out[52]:
c1 A B C
0 0 0 1 0.00
1 1 7 4 1.75
2 2 8 2 4.00
3 3 4 1 4.00
4 4 6 3 2.00