Question

I created a DataFrame to test resampling, as shown below:

import numpy as np
import pandas as pd

f = pd.DataFrame(data=np.linspace(50, 100, 200), index=pd.date_range(end='2014-06-18', periods=200), columns=['last'])
f
Out[63]:
                 last
2013-12-01  50.000000
2013-12-02  50.251256
2013-12-03  50.502513

Then I resample it like this:

f_d1_resamp = f.resample('1w')

Then I want to use np.where to create a new column when a certain condition is met:

f_d1_resamp['Gap'] = np.where(f['last'] > f['last'].shift(), (f['last'].shift() - f['last']), '')

But I get the following error:

ValueError: cannot set items on DatetimeIndexResampler

How can I change my np.where code to stop this error? I need to use np.where because it will be used in my other projects. Thanks.

Answer 1

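You need to add an aggregate function such as mean or sum, because version 0.18.0 changed the API and resample now returns a Resampler object rather than a DataFrame (see the pandas documentation):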

f_d1_resamp = f.resample('1w').sum()
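
To make the difference concrete, here is a minimal sketch that rebuilds the frame from the question and compares the object returned by resample with the result of aggregating it. The names `resampler`, `weekly` and `flag` are only illustrative, and the alias 'W' is used in place of '1w' on the assumption that both mean weekly bins ending on Sunday.

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)

resampler = f.resample("W")  # no aggregation yet: a lazy Resampler object
weekly = resampler.sum()     # aggregating yields an ordinary DataFrame

print(type(resampler).__name__)
print(type(weekly).__name__)

# Column assignment works on the aggregated DataFrame,
# which is what the line in the question needs.
weekly["flag"] = weekly["last"] > 100  # illustrative column only
print(weekly.head())
```

Any aggregation (sum, mean, last, ...) would do here; which one is right depends on what the weekly value is supposed to represent.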

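Then you can use numpy.where. I think it is better to use the new f_d1_resamp instead of f, because resampling can downsample or upsample, so the resampled frame generally does not have the same number of rows as f: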

f_d1_resamp['Gap'] = np.where(f_d1_resamp['last'] > f_d1_resamp['last'].shift(),
                              f_d1_resamp['last'].diff(), '')
print(f_d1_resamp)
                  last                 Gap
2013-12-01   50.000000                    
2013-12-08  357.035176    307.035175879397
2013-12-15  369.346734  12.311557788944754
2013-12-22  381.658291  12.311557788944697
2013-12-29  393.969849  12.311557788944697
2014-01-05  406.281407  12.311557788944754
2014-01-12  418.592965  12.311557788944697
2014-01-19  430.904523  12.311557788944697
2014-01-26  443.216080   12.31155778894481
2014-02-02  455.527638  12.311557788944697
2014-02-09  467.839196   12.31155778894464
...
...
...
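
A side note on the output above: the Gap values are printed with full precision because passing '' as the fallback makes np.where return an array of strings. The sketch below compares that with an np.nan fallback, which keeps the column numeric. It assumes the same test frame as the question; `weekly`, `rising`, `gap_str` and `gap_num` are illustrative names, and whether NaN is an acceptable stand-in for the empty string depends on how the column is used later.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)
weekly = f.resample("W").sum()

# True where the weekly value is higher than the previous week's value.
rising = weekly["last"] > weekly["last"].shift()

# '' as the fallback: NumPy has to find a common dtype for floats and a string.
gap_str = np.where(rising, weekly["last"].diff(), "")

# np.nan as the fallback: the result stays floating point.
gap_num = np.where(rising, weekly["last"].diff(), np.nan)

print(gap_str.dtype, gap_num.dtype)

weekly["Gap"] = gap_num
print(weekly.head())
```

With the numeric version, later arithmetic on Gap (sums, comparisons, plotting) works without converting strings back to floats.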

Answer 2

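The first problem is that your resampled object is just a Resampler; you need to call an aggregation function such as mean on it. Second, even after fixing that, your line: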

f_d1_resamp['Gap'] = np.where(f['last'] > f['last'].shift(),(f["last"].shift() - f["last"]),'')

raises a ValueError:

ValueError: Length of values does not match length of index
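
The length mismatch can be checked directly. This is a small sketch under the same setup as the question: np.where returns a plain array with one entry per daily row of f, while the weekly frame has far fewer rows. The names `weekly`, `gap` and `raised` are illustrative, and np.nan is used as the fallback instead of '' only to keep the array numeric.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)
weekly = f.resample("W").mean()

# Built from the daily frame, so it has one value per day and no index.
gap = np.where(f["last"] > f["last"].shift(),
               f["last"].shift() - f["last"],
               np.nan)

print(len(gap), len(weekly))  # daily rows vs. weekly rows

raised = False
try:
    # A bare array cannot be aligned by label, so the lengths must match.
    weekly["Gap"] = gap
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```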

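To fix this, instead of np.where you can call where directly on the desired result, which returns a Series that pandas aligns with the resampled index: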

In [305]:
f_d1_resamp = f.resample('1w').mean()
f_d1_resamp['Gap'] = f['last'].diff().where(f['last'] > f['last'].shift(),'')
f_d1_resamp

Out[305]:
                 last       Gap
2013-12-01  50.000000          
2013-12-08  51.005025  0.251256
2013-12-15  52.763819  0.251256
2013-12-22  54.522613  0.251256
2013-12-29  56.281407  0.251256
2014-01-05  58.040201  0.251256
2014-01-12  59.798995  0.251256
2014-01-19  61.557789  0.251256
2014-01-26  63.316583  0.251256
2014-02-02  65.075377  0.251256
2014-02-09  66.834171  0.251256
2014-02-16  68.592965  0.251256
2014-02-23  70.351759  0.251256
2014-03-02  72.110553  0.251256
2014-03-09  73.869347  0.251256
2014-03-16  75.628141  0.251256
2014-03-23  77.386935  0.251256
2014-03-30  79.145729  0.251256
2014-04-06  80.904523  0.251256
2014-04-13  82.663317  0.251256
2014-04-20  84.422111  0.251256
2014-04-27  86.180905  0.251256
2014-05-04  87.939698  0.251256
2014-05-11  89.698492  0.251256
2014-05-18  91.457286  0.251256
2014-05-25  93.216080  0.251256
2014-06-01  94.974874  0.251256
2014-06-08  96.733668  0.251256
2014-06-15  98.492462  0.251256
2014-06-22  99.748744       NaN
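
One thing worth noticing in this output: the assigned Series is still indexed by day, so pandas aligns it with the weekly index by date label. Each weekly row therefore receives the daily difference on that date (a constant 0.251256), and 2014-06-22 is NaN because that date is not in f. The sketch below shows that alignment next to a variant computed on the weekly values themselves. It is only an illustration under the question's setup: `weekly`, `daily_gap` and `WeeklyGap` are made-up names, and the default NaN fallback of where is used instead of ''.

```python
import numpy as np
import pandas as pd

f = pd.DataFrame(
    data=np.linspace(50, 100, 200),
    index=pd.date_range(end="2014-06-18", periods=200),
    columns=["last"],
)
weekly = f.resample("W").mean()

# Daily differences, kept only where the value rose; indexed by day.
daily_gap = f["last"].diff().where(f["last"] > f["last"].shift())

# Assigning a Series aligns on the index labels: only dates that are
# present in both indexes get a value, the rest become NaN.
weekly["Gap"] = daily_gap

# Same idea computed on the weekly means, so it measures week-to-week change.
weekly["WeeklyGap"] = weekly["last"].diff().where(
    weekly["last"] > weekly["last"].shift()
)

print(weekly.head())
print(weekly.tail(2))
```

Which of the two columns is wanted depends on whether the gap should describe daily or weekly movement.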

np.where code - ValueError: "cannot set items on DatetimeIndexResampler"
