Question

I have the following data:

datetime    price
2017-10-02 08:03:00 12877
2017-10-02 08:04:00 12877.5
2017-10-02 08:05:00 12879
2017-10-02 08:06:00 12875.5
2017-10-02 08:07:00 12875.5
2017-10-02 08:08:00 12878
2017-10-02 08:09:00 12878
2017-10-02 08:10:00 12878
2017-10-02 08:11:00 12881
2017-10-02 08:12:00 12882.5
2017-10-02 08:13:00 12884.5
2017-10-02 08:14:00 12882
2017-10-02 08:15:00 12880.5
2017-10-02 08:16:00 12881.5
2017-10-02 08:17:00 12879
2017-10-02 08:18:00 12879
2017-10-02 08:19:00 12880
2017-10-02 08:20:00 12878.5

I want to find the min. price over a range of 'datetime' rows (the range is defined by window_size, which can be 1/2/3 etc.). Using:

df['MinPrice'] = df.ix[window_size:,'price']

gives the price of the last row of the window, or using

df['MinPrice'] = df.ix[window_size:,'price'].min()

gives the minimum of the whole column.

Please suggest how to get the min. value for the specific rows covered by the window.

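EDIT: The expected result is as follows: if the window size is 3, I want to get the min. value of 3 rows. So at 08:05:00 I would get 12877, and at 08:06:00 I would get 12875.5.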

Answer 1

Have a look at pandas.DataFrame.rolling

The following gives the expected result:

df.rolling(window=3).apply(min).dropna()
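
As a minimal sketch of the rolling approach, assuming `datetime` is a regular column rather than the index (how `df` is built here is an assumption, and only the first six rows of the sample data are used), selecting the `price` column first keeps the window from being applied to the datetime values. The `MinPrice` column name is taken from the question.

```python
import pandas as pd

# First six rows of the sample data from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2017-10-02 08:03:00", "2017-10-02 08:04:00", "2017-10-02 08:05:00",
        "2017-10-02 08:06:00", "2017-10-02 08:07:00", "2017-10-02 08:08:00",
    ]),
    "price": [12877.0, 12877.5, 12879.0, 12875.5, 12875.5, 12878.0],
})

window_size = 3

# Minimum over each row and the (window_size - 1) rows before it.
# The first (window_size - 1) rows have no full window, so they are NaN.
df["MinPrice"] = df["price"].rolling(window=window_size).min()

print(df)
```

With this data the 08:05:00 row gets 12877.0 and the 08:06:00 row gets 12875.5, which matches the expected result stated in the question. Appending `.dropna()` would drop the two leading rows that have no full window.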

Answer 2

Since your data appears to be at 1-minute intervals, you may want to use resample, which lets you define the window in terms of datetimes

df.resample('3T',on='datetime').min()

                             datetime    price
datetime                                        
2017-10-02 08:03:00 2017-10-02 08:03:00  12877.0
2017-10-02 08:06:00 2017-10-02 08:06:00  12875.5
2017-10-02 08:09:00 2017-10-02 08:09:00  12878.0
2017-10-02 08:12:00 2017-10-02 08:12:00  12882.0
2017-10-02 08:15:00 2017-10-02 08:15:00  12879.0
2017-10-02 08:18:00 2017-10-02 08:18:00  12878.5

To set the values back on the original dataframe, use transform

df['minPrice'] = df.resample('3T',on='datetime').transform('min')

             datetime    price  minPrice
0  2017-10-02 08:03:00  12877.0   12877.0
1  2017-10-02 08:04:00  12877.5   12877.0
2  2017-10-02 08:05:00  12879.0   12877.0
3  2017-10-02 08:06:00  12875.5   12875.5
4  2017-10-02 08:07:00  12875.5   12875.5
5  2017-10-02 08:08:00  12878.0   12875.5
6  2017-10-02 08:09:00  12878.0   12878.0
7  2017-10-02 08:10:00  12878.0   12878.0
8  2017-10-02 08:11:00  12881.0   12878.0
9  2017-10-02 08:12:00  12882.5   12882.0
10 2017-10-02 08:13:00  12884.5   12882.0
11 2017-10-02 08:14:00  12882.0   12882.0
12 2017-10-02 08:15:00  12880.5   12879.0
13 2017-10-02 08:16:00  12881.5   12879.0
14 2017-10-02 08:17:00  12879.0   12879.0
15 2017-10-02 08:18:00  12879.0   12878.5
16 2017-10-02 08:19:00  12880.0   12878.5
17 2017-10-02 08:20:00  12878.5   12878.5
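
Note that these 3-minute bins do not overlap, so every row in a bin shares the same minimum, unlike a rolling window that slides one row at a time. A hedged sketch of the same idea using `groupby` with `pd.Grouper` follows. It is an equivalent formulation rather than the call used above, the construction of `df` is an assumption, and `"3min"` is used because recent pandas versions deprecate the `T` alias.

```python
import pandas as pd

# First six rows of the sample data, one row per minute.
df = pd.DataFrame({
    "datetime": pd.date_range("2017-10-02 08:03:00", periods=6, freq="min"),
    "price": [12877.0, 12877.5, 12879.0, 12875.5, 12875.5, 12878.0],
})

# Group rows into fixed, non-overlapping 3-minute bins on the datetime column,
# then broadcast each bin's minimum price back to the rows in that bin.
bins = pd.Grouper(key="datetime", freq="3min")
df["minPrice"] = df.groupby(bins)["price"].transform("min")

print(df)
```

Here the rows from 08:03 to 08:05 all get 12877.0 and the rows from 08:06 to 08:08 all get 12875.5.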

Answer 3

You may want to keep the dataframe the same length:

df['Price_Low3'] = np.where(pd.isna(df.price.shift(periods=2)),df.price,df.price.rolling(3).min())

The result is:

            datetime    price  Price_Low3
0   02/10/2017 08:03  12877.0     12877.0
1   02/10/2017 08:04  12877.5     12877.5
2   02/10/2017 08:05  12879.0     12877.0
3   02/10/2017 08:06  12875.5     12875.5
4   02/10/2017 08:07  12875.5     12875.5
5   02/10/2017 08:08  12878.0     12875.5
6   02/10/2017 08:09  12878.0     12875.5
7   02/10/2017 08:10  12878.0     12878.0
8   02/10/2017 08:11  12881.0     12878.0
9   02/10/2017 08:12  12882.5     12878.0
10  02/10/2017 08:13  12884.5     12881.0
11  02/10/2017 08:14  12882.0     12882.0
12  02/10/2017 08:15  12880.5     12880.5
13  02/10/2017 08:16  12881.5     12880.5
14  02/10/2017 08:17  12879.0     12879.0
15  02/10/2017 08:18  12879.0     12879.0
16  02/10/2017 08:19  12880.0     12879.0
17  02/10/2017 08:20  12878.5     12878.5
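
Another way to keep the length, sketched here as an alternative and not as what the line above does, is the `min_periods` argument of `rolling`. The behaviour differs for the first rows: with `min_periods=1` an incomplete window yields the minimum of the rows seen so far, whereas the `np.where` version falls back to the row's own price. The column name `Price_Low3_partial` is hypothetical.

```python
import pandas as pd

# First four prices from the sample data.
df = pd.DataFrame({"price": [12877.0, 12877.5, 12879.0, 12875.5]})

# min_periods=1 lets the window produce a value as soon as it holds one row,
# so no NaN appears at the start and the frame keeps its length.
df["Price_Low3_partial"] = df["price"].rolling(window=3, min_periods=1).min()

print(df)
```

For the second row this gives 12877.0 (the minimum of the first two prices) where the table above shows 12877.5.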

Find min. value in a specific column over a future range of rows with pandas / python