I created a DataFrame to test resampling, as follows:
f = pd.DataFrame(data=np.linspace(50, 100, 200), index=pd.date_range(end='2014-06-18', periods=200), columns=['last'])
f
Out[63]:
                 last
2013-12-01  50.000000
2013-12-02  50.251256
2013-12-03  50.502513
Then I resample it like this:
f_d1_resamp = f.resample('1w')
Next, I want to use np.where to create a new column when a certain condition is met, but the following line raises an error:
f_d1_resamp['Gap'] = np.where(f['last'] > f['last'].shift(),(f["last"].shift() - f["last"]),'');
How can I modify my np.where code to stop this error? I need to use np.where because it will be used in my other projects. Thanks.
Answer 0 (score: 2)
You need to add an aggregate function such as mean or sum, because version 0.18.0 changed the API and resample now returns a Resampler object (see {{3}}):
f_d1_resamp = f.resample('1w').sum()
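To make the difference concrete, here is a minimal sketch that rebuilds the DataFrame from the question and compares the bare result of resample with the aggregated one. The uppercase "W" alias is used as an assumption that it is equivalent to the '1w' above; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same test data as in the question: 200 daily values ending 2014-06-18.
f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)

# resample() alone only describes the grouping; nothing is aggregated yet.
resampler = f.resample("W")

# Calling an aggregate on it produces an ordinary DataFrame,
# which supports column assignment.
weekly_sum = resampler.sum()
weekly_mean = f.resample("W").mean()

print(type(resampler).__name__)
print(type(weekly_sum).__name__)
print(weekly_sum.head())
```

Only the aggregated objects (weekly_sum, weekly_mean) can take a new column such as 'Gap'.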
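Then you can use numpy.where. I think it is better to work with the new f_d1_resamp instead of f, because resampling can downsample or upsample and so change the number of rows: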
f_d1_resamp['Gap'] = np.where(f_d1_resamp['last'] > f_d1_resamp['last'].shift(),
f_d1_resamp["last"].diff(),'');
print (f_d1_resamp)
last Gap
2013-12-01 50.000000
2013-12-08 357.035176 307.035175879397
2013-12-15 369.346734 12.311557788944754
2013-12-22 381.658291 12.311557788944697
2013-12-29 393.969849 12.311557788944697
2014-01-05 406.281407 12.311557788944754
2014-01-12 418.592965 12.311557788944697
2014-01-19 430.904523 12.311557788944697
2014-01-26 443.216080 12.31155778894481
2014-02-02 455.527638 12.311557788944697
2014-02-09 467.839196 12.31155778894464
...
...
...
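As a possible variation, the sketch below uses np.nan rather than the empty string as the fallback value, on the assumption that a numeric Gap column is wanted. With '' as the fallback, np.where has to return strings, which is why the Gap values above are printed with full precision. The names weekly and prev are illustrative.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)

weekly = f.resample("W").sum()
prev = weekly["last"].shift()

# Both branches are numeric, so the resulting column stays float64.
# The first row compares against NaN, which is False, so it gets NaN.
weekly["Gap"] = np.where(weekly["last"] > prev, weekly["last"] - prev, np.nan)

print(weekly.head())
print(weekly["Gap"].dtype)
```

Keeping the column numeric means later arithmetic on Gap works without converting strings back to floats.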
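Answer 1 (score: 1)
The first problem is that your resampled object is just a Resampler; you need to call an aggregation function such as mean on it. Second, even after fixing that, your line:
f_d1_resamp['Gap'] = np.where(f['last'] > f['last'].shift(),(f["last"].shift() - f["last"]),'')
raises a ValueError: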
ValueError: Length of values does not match length of index
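The mismatch can be reproduced with a short sketch: np.where returns a plain array with one entry per daily row of f, while the resampled frame has one row per week. The "W" alias and the names weekly and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)
weekly = f.resample("W").mean()

# np.where returns a bare NumPy array with no index, one value per daily row.
values = np.where(f["last"] > f["last"].shift(),
                  f["last"].shift() - f["last"],
                  np.nan)
print(len(values), len(weekly))

raised = False
try:
    # An array cannot be aligned by label, so the lengths must match exactly.
    weekly["Gap"] = values
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```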
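To avoid this, you can call where directly on the desired result instead of using np.where. The result is a Series, so the assignment is aligned on the index: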
In [305]:
f_d1_resamp = f.resample('1w').mean()
f_d1_resamp['Gap'] = f['last'].diff().where(f['last'] > f['last'].shift(),'')
f_d1_resamp
Out[305]:
last Gap
2013-12-01 50.000000
2013-12-08 51.005025 0.251256
2013-12-15 52.763819 0.251256
2013-12-22 54.522613 0.251256
2013-12-29 56.281407 0.251256
2014-01-05 58.040201 0.251256
2014-01-12 59.798995 0.251256
2014-01-19 61.557789 0.251256
2014-01-26 63.316583 0.251256
2014-02-02 65.075377 0.251256
2014-02-09 66.834171 0.251256
2014-02-16 68.592965 0.251256
2014-02-23 70.351759 0.251256
2014-03-02 72.110553 0.251256
2014-03-09 73.869347 0.251256
2014-03-16 75.628141 0.251256
2014-03-23 77.386935 0.251256
2014-03-30 79.145729 0.251256
2014-04-06 80.904523 0.251256
2014-04-13 82.663317 0.251256
2014-04-20 84.422111 0.251256
2014-04-27 86.180905 0.251256
2014-05-04 87.939698 0.251256
2014-05-11 89.698492 0.251256
2014-05-18 91.457286 0.251256
2014-05-25 93.216080 0.251256
2014-06-01 94.974874 0.251256
2014-06-08 96.733668 0.251256
2014-06-15 98.492462 0.251256
2014-06-22 99.748744 NaN
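The output above follows from index alignment, which the sketch below tries to make explicit. The Gap values are the daily differences of f picked out at the weekly label dates, and the last label (2014-06-22) does not exist in f, hence the NaN. The WeeklyGap column is a hypothetical variant, not part of the answer, that assumes a week-over-week difference is what is wanted.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)
weekly = f.resample("W").mean()

# Daily differences, kept only where the value rose (NaN elsewhere).
daily_gap = f["last"].diff().where(f["last"] > f["last"].shift())

# Assigning a Series aligns on index labels: only dates present in
# weekly.index are taken, and labels missing from f become NaN.
weekly["Gap"] = daily_gap

# Hypothetical variant: difference between consecutive weekly means.
weekly["WeeklyGap"] = weekly["last"].diff().where(
    weekly["last"] > weekly["last"].shift()
)

print(weekly.head())
print(weekly.tail(2))
```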