Question

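I have an hourly time series (e.g., a df with a date/time column and a value column), and I want to:

Step 1: Remove the top 5 percent of values for each day (everything at or above the daily 95th percentile)

Step 2: Take the daily maximum (of the Step 1 result)

Step 3: Take the monthly mean (of the Step 2 result)

Here is what I tried in order to implement the logic above: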

step_1 = df.resample('D').apply(lambda x: x<x.quantile(0.95))
step_2 = step_1.resample('D').max()
step_3 = step_2.resample('M').mean()

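Although the code raises no errors, the output differs from what I expect from the three steps above (I always get a constant value).

Any help would be greatly appreciated.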

Answer 1

You're almost there. Your step_1 is a Series of booleans with the same index as the original data, so you can use it to filter the DataFrame:

step_1 = df.resample('D').apply(lambda x: x<x.quantile(0.95))
step_2 = df[step_1].resample('D').max()
step_3 = step_2.resample('M').mean()
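
As a rough end-to-end sketch of the three steps, assuming the values live in a Series with a DatetimeIndex: the sample data and the helper name `monthly_mean_of_trimmed_daily_max` are made up for illustration. It builds the mask with `groupby(...).transform(...)` instead of `resample('D').apply(...)`, because `transform` returns a result aligned to the original hourly index, and it groups by monthly period in the last step because newer pandas versions prefer the alias 'ME' over 'M' for month-end resampling.

```python
import numpy as np
import pandas as pd


def monthly_mean_of_trimmed_daily_max(s: pd.Series, q: float = 0.95) -> pd.Series:
    """Hypothetical helper: drop values at or above each day's q-quantile,
    take the daily maximum of what remains, then average those maxima per month."""
    day = s.index.normalize()  # midnight timestamp of each observation's day
    # Step 1: per-day threshold broadcast back onto the hourly index, then filter.
    threshold = s.groupby(day).transform(lambda x: x.quantile(q))
    kept = s[s < threshold]
    # Step 2: daily maximum of the remaining values.
    daily_max = kept.resample("D").max()
    # Step 3: mean of the daily maxima within each calendar month.
    return daily_max.groupby(daily_max.index.to_period("M")).mean()


# Made-up hourly data covering January and February 2021 (59 days).
rng = np.random.default_rng(0)
idx = pd.date_range("2021-01-01", periods=59 * 24, freq="h")
s = pd.Series(rng.normal(size=len(idx)), index=idx, name="value")

monthly = monthly_mean_of_trimmed_daily_max(s)
print(monthly)
```

If the data is a DataFrame, the same helper can be applied to a single column, e.g. `monthly_mean_of_trimmed_daily_max(df['value'])`, where the column name is an assumption.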

Answer 2

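Your first step produces a boolean mask, so you need an additional step that applies the mask to the data before taking the daily maximum.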

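
To see where the constant value most likely comes from, here is a small sketch with made-up data (two days of hourly values 0 to 47; all variable names are illustrative). Taking the daily maximum of the mask itself can only ever yield True (1 as an integer), whereas applying the mask to the values first gives the intended maxima.

```python
import pandas as pd

# Made-up data: 48 hourly values, 0..23 on day one and 24..47 on day two.
idx = pd.date_range("2021-01-01", periods=48, freq="h")
s = pd.Series(range(48), index=idx, dtype="float64")

# Boolean mask: True where a value is below its own day's 95th percentile.
day = s.index.normalize()
mask = s < s.groupby(day).transform(lambda x: x.quantile(0.95))

# Resampling the mask: the daily max of 0/1 flags is 1 for every day.
mask_daily_max = mask.astype(int).resample("D").max()
print(mask_daily_max.tolist())   # [1, 1]

# Applying the mask to the values first, then resampling.
value_daily_max = s[mask].resample("D").max()
print(value_daily_max.tolist())  # [21.0, 45.0]
```

Averaging `mask_daily_max` per month therefore always gives 1, which matches the constant output described in the question.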

Resampling a time series after removing the top x percentile of data
