Pandas: find the mean of a DataFrame by hour and month

Time: 2020-10-16 04:35:05

Tags: python pandas time mean linux-df

Suppose I have a df:

timestamp             value1     value2
01-01-2010 00:00:00       10          5
30-01-2019 00:00:00        5          1
01-02-2015 12:00:00        1          0
25-02-2007 05:00:00       10         10
01-02-2015 05:00:00       10          1

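I want to plot a time series chart of the mean of the value1 and value2 columns, grouped only by the hour and the month of each timestamp (ignoring the day and the year). The desired df and chart might look like this: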

hour-month     value1   value2
00-01             7.5        3
05-02              10      5.5
12-02               1        0

Time series chart

I am new to Python. Please advise.

1 answer:

Answer 0 (score: 0):

First convert the column to datetimes with to_datetime, then group by Series.dt.strftime, which turns each datetime into an hour-month string of the form HH-MM (format '%H-%m'), and aggregate with mean. Finally, plot with DataFrame.plot:

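import pandas as pd

# Parse the day-first strings (e.g. 30-01-2019) into datetimes
df['timestamp'] = pd.to_datetime(df['timestamp'], dayfirst=True)

# Group by an hour-month key such as '00-01' and average the value columns.
# Selecting the columns explicitly keeps newer pandas versions from also averaging 'timestamp'.
df1 = df.groupby(df['timestamp'].dt.strftime('%H-%m'))[['value1', 'value2']].mean()

print(df1)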
           value1  value2
timestamp                
00-01         7.5     3.0
05-02        10.0     5.5
12-02         1.0     0.0

df1.plot()
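
For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. It rebuilds the frame from the sample rows in the question; the helper name hour_month_mean and the index label 'hour-month' are hypothetical names chosen for illustration, not part of pandas or of the original answer.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'timestamp': ['01-01-2010 00:00:00', '30-01-2019 00:00:00',
                  '01-02-2015 12:00:00', '25-02-2007 05:00:00',
                  '01-02-2015 05:00:00'],
    'value1': [10, 5, 1, 10, 10],
    'value2': [5, 1, 0, 10, 1],
})


def hour_month_mean(frame, time_col='timestamp'):
    """Average the non-time columns per hour-month key such as '00-01'.

    Hypothetical helper for illustration; not part of pandas.
    """
    # An explicit day-first format; the answer's dayfirst=True gives the
    # same result for these strings.
    stamps = pd.to_datetime(frame[time_col], format='%d-%m-%Y %H:%M:%S')
    key = stamps.dt.strftime('%H-%m').rename('hour-month')
    value_cols = [c for c in frame.columns if c != time_col]
    return frame.groupby(key)[value_cols].mean()


result = hour_month_mean(df)
print(result)
```

Calling result.plot() afterwards should draw one line per value column, as in the answer; that step needs matplotlib installed.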

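Edit: to keep real datetimes as the group keys instead of strings, map every timestamp to the same year and day (2020 and 1 are arbitrary choices here) and group by the result:

df['timestamp'] = pd.to_datetime(df['timestamp'], dayfirst=True)

# After the replace, timestamps differ only in month and time of day
df1 = df.groupby(df['timestamp'].map(lambda x: x.replace(year=2020, day=1)))[['value1', 'value2']].mean()

print(df1)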
                     value1  value2
timestamp                          
2020-01-01 00:00:00     7.5     3.0
2020-02-01 05:00:00    10.0     5.5
2020-02-01 12:00:00     1.0     0.0

# Reshape to long format: one row per (timestamp, column) pair
df2 = df1.rename_axis('col', axis=1).stack().reset_index(name='vals')
print(df2)
            timestamp     col  vals
0 2020-01-01 00:00:00  value1   7.5
1 2020-01-01 00:00:00  value2   3.0
2 2020-02-01 05:00:00  value1  10.0
3 2020-02-01 05:00:00  value2   5.5
4 2020-02-01 12:00:00  value1   1.0
5 2020-02-01 12:00:00  value2   0.0
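
df2 is in long format, with one row per timestamp and original column. If the wide layout is needed again, for example because DataFrame.plot draws one line per column, pivot is one way back. The sketch below is only an illustration: df2 is rebuilt by hand from the printed values above, and the name wide is an assumption, not something from the original answer.

```python
import pandas as pd

# Long-format frame with the same values as the printed df2
df2 = pd.DataFrame({
    'timestamp': pd.to_datetime(['2020-01-01 00:00:00', '2020-01-01 00:00:00',
                                 '2020-02-01 05:00:00', '2020-02-01 05:00:00',
                                 '2020-02-01 12:00:00', '2020-02-01 12:00:00']),
    'col': ['value1', 'value2'] * 3,
    'vals': [7.5, 3.0, 10.0, 5.5, 1.0, 0.0],
})

# One row per timestamp, one column per original value column
wide = df2.pivot(index='timestamp', columns='col', values='vals')
print(wide)
```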