Question

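I have a large data frame containing date, store number, units sold, and total precipitation (columns date, store_nbr, units, and preciptotal).

I want to select a three-day window around any date where the precipitation total is greater than 1. For this small example, I want to get back the first 7 rows: the 3 days before 2014-10-14, the 3 days after 2014-10-14, and 2014-10-14 itself, because its precipitation is greater than 1.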

Answer 1

Here are two ways to build the selection mask without looping over the index values.

You can find the rows where preciptotal is greater than 1:

mask = (df['preciptotal'] > 1)

Then use scipy.ndimage.binary_dilation to expand the mask to a 7-day window:

import scipy.ndimage as ndimage
import pandas as pd
df = pd.read_table('data', sep=r'\s+')
mask = (df['preciptotal'] > 1)
# each iteration grows every True run by one row on both sides
mask = ndimage.binary_dilation(mask, iterations=3)
df.loc[mask]

yields

         date  store_nbr  units  preciptotal
0  2014-10-11          1      0         0.00
1  2014-10-12          1      0         0.01
2  2014-10-13          1      2         0.00
3  2014-10-14          1      1         2.13
4  2014-10-15          1      0         0.00
5  2014-10-16          1      0         0.87
6  2014-10-17          1      3         0.01

Alternatively, using NumPy (but without the scipy dependency), you can combine mask.shift with np.logical_and.reduce:

import numpy as np
mask = (df['preciptotal'] > 1)
mask = ~np.logical_and.reduce([(~mask).shift(i) for i in range(-3, 4)]).astype(bool)
# array([ True, True, True, True, True, True, True, False], dtype=bool)
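For readers who want to try the mask-expansion idea end to end, here is a minimal sketch that swaps in a pandas rolling window instead of scipy or shift. It is not the answer's method, only a third way to get the same 7-row window. The first seven rows are taken from the sample output above; the eighth row is made up so that one row falls outside the window, and the sketch assumes one row per consecutive day.

```python
import pandas as pd

# First seven rows come from the sample output above; the eighth row is
# hypothetical, added only so that one row falls outside the window.
df = pd.DataFrame({
    "date": pd.date_range("2014-10-11", periods=8, freq="D"),
    "store_nbr": [1] * 8,
    "units": [0, 0, 2, 1, 0, 0, 3, 0],
    "preciptotal": [0.00, 0.01, 0.00, 2.13, 0.00, 0.87, 0.01, 0.00],
})

mask = df["preciptotal"] > 1

# A centered 7-row window covers rows i-3 .. i+3; min_periods=1 keeps the
# edges of the frame from turning into NaN.
expanded = (
    mask.astype(int)
    .rolling(window=7, center=True, min_periods=1)
    .max()
    .astype(bool)
)

selected = df.loc[expanded]
print(selected)
```

Because the window is counted in rows, not calendar days, this only matches the "three days either side" requirement when the dates are consecutive and the frame holds a single store.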

Answer 2

For a specific value, you can do this:

In [84]:

idx = df[df['preciptotal'] > 1].index[0]
df.iloc[idx-3: idx+4]
Out[84]:
        date  store_nbr  units  preciptotal
0 2014-10-11          1      0         0.00
1 2014-10-12          1      0         0.01
2 2014-10-13          1      2         0.00
3 2014-10-14          1      1         2.13
4 2014-10-15          1      0         0.00
5 2014-10-16          1      0         0.87
6 2014-10-17          1      3         0.01

For the more general case, you can get an array of the index values where the condition is met:

idx_vals = df[df['preciptotal'] > 1].index

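Then you can generate the slices or iterate over the array values:

for idx in idx_vals:
    # clamp the start so a match in the first 3 rows does not wrap around
    print(df.iloc[max(idx-3, 0): idx+4])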

This assumes your index is a 0-based int64 index, which your sample is.
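When several rows meet the condition, the per-match slices can overlap or run past the ends of the frame. The sketch below collects every window position at once, clips to the valid range, and removes duplicates. The helper name window_positions and the precipitation values are made up for illustration. It works on row positions, so it does not depend on the index being 0-based.

```python
import numpy as np
import pandas as pd

def window_positions(mask, before=3, after=3):
    """Return sorted, unique row positions within the window of any True entry."""
    hits = np.flatnonzero(np.asarray(mask))          # positions where the condition holds
    offsets = np.arange(-before, after + 1)          # -3 .. +3 by default
    pos = (hits[:, None] + offsets).ravel()          # every position in every window
    pos = pos[(pos >= 0) & (pos < len(mask))]        # drop positions outside the frame
    return np.unique(pos)                            # merge overlapping windows

# Hypothetical data: matches at positions 1 and 9 of a 10-row frame.
df = pd.DataFrame({"preciptotal": [0, 2, 0, 0, 0, 0, 0, 0, 0, 3]})
positions = window_positions(df["preciptotal"] > 1)
selected = df.iloc[positions]
print(positions)
```

Using iloc with the resulting positions returns each row at most once, even when two windows overlap.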

Slicing a Pandas DataFrame by date
