import numpy as np
import pandas as pd
dates = pd.date_range('20161104', periods = 10)
df = pd.DataFrame(np.random.randn(10, 4), index = dates, columns = list('ABCD'))
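I am trying to find the mean of A where C is greater than 0. In other words, if the corresponding C is less than 0, that row's A should be left out of the calculation.
Does anyone know how to do this without creating a new dataset or using groupby? Thanks!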
Answer 0 (score: 0)
You can try this:
df[df.C > 0]['A'].mean()
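As a rough sketch of why this works: `df.C > 0` builds a boolean Series, and indexing with it keeps only the rows where it is True. The small frame below is made up for illustration (it is not the question's random data), and the `.loc` form is an equivalent spelling that selects rows and column in a single indexing step.

```python
import pandas as pd

# Hypothetical toy data, chosen so the result is easy to check by hand.
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "C": [-1.0, 0.5, 2.0, -3.0],
})

mask = df["C"] > 0                      # boolean Series: False, True, True, False

chained = df[mask]["A"].mean()          # filter rows, then pick column A
single_step = df.loc[mask, "A"].mean()  # rows and column in one indexing call

print(chained, single_step)             # both are the mean of 2.0 and 3.0
```

Neither form modifies `df` or needs groupby; the filtered rows only exist as a temporary object while the mean is computed.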
Answer 1 (score: 0)
I am not sure exactly what you need:
np.random.seed(1)
dates = pd.date_range('20161104', periods = 10)
df = pd.DataFrame(np.random.randn(10, 4), index = dates, columns = list('ABCD'))
print (df)
                   A         B         C         D
2016-11-04  1.624345 -0.611756 -0.528172 -1.072969
2016-11-05  0.865408 -2.301539  1.744812 -0.761207
2016-11-06  0.319039 -0.249370  1.462108 -2.060141
2016-11-07 -0.322417 -0.384054  1.133769 -1.099891
2016-11-08 -0.172428 -0.877858  0.042214  0.582815
2016-11-09 -1.100619  1.144724  0.901591  0.502494
2016-11-10  0.900856 -0.683728 -0.122890 -0.935769
2016-11-11 -0.267888  0.530355 -0.691661 -0.396754
2016-11-12 -0.687173 -0.845206 -0.671246 -0.012665
2016-11-13 -1.117310  0.234416  1.659802  0.742044
# Note: .ix was removed in pandas 1.0; .loc does the same label/boolean selection here.
# Mean of A where C is greater than 0
print(df.loc[df.C > 0, 'A'].mean())
-0.2547213686717275
# Mean of A where C is less than 0
print(df.loc[df.C < 0, 'A'].mean())
0.3925351332955095
# Mean of A where C is greater than 0 and C is less than 0; the condition is never True, so the selection is empty
print(df.loc[(df.C > 0) & (df.C < 0), 'A'].mean())
nan
# Mean of A where A is greater than 0 and C is less than 0
print(df.loc[(df.A > 0) & (df.C < 0), 'A'].mean())
1.2626006564638268
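The `nan` result above comes from taking the mean of an empty selection. Here is a minimal sketch on made-up data that combines conditions with `&` (each comparison needs its own parentheses) and shows one possible way to detect the empty case before averaging. The helper name `conditional_mean` is hypothetical, not part of pandas.

```python
import math
import pandas as pd

# Hypothetical toy data, small enough to verify by hand.
df = pd.DataFrame({
    "A": [1.0, -2.0, 3.0, -4.0],
    "C": [-1.0, 0.5, -2.0, 3.0],
})

def conditional_mean(frame, mask, column):
    """Mean of `column` over rows where `mask` is True, or None if no row matches."""
    selected = frame.loc[mask, column]
    return None if selected.empty else selected.mean()

both = (df["A"] > 0) & (df["C"] < 0)    # True for rows 0 and 2
never = (df["C"] > 0) & (df["C"] < 0)   # contradictory, so always False

print(conditional_mean(df, both, "A"))        # mean of 1.0 and 3.0
print(conditional_mean(df, never, "A"))       # None, because nothing matched
print(math.isnan(df.loc[never, "A"].mean()))  # the plain .mean() gives NaN instead
```

Whether to return `None`, keep the `nan`, or raise an error for an empty selection is a design choice that depends on how the result is used afterwards.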