Pandas - How to group and plot by every hour of each day of the week

Time: 2019-02-24 21:26:31

Tags: python pandas matplotlib

I need help figuring out how to plot subplots for easy comparison from the dataframe shown here:

Date                    A       B           C
2017-03-22 15:00:00     obj1    value_a     other_1
2017-03-22 14:00:00     obj2    value_ns    other_5
2017-03-21 15:00:00     obj3    value_kdsa  other_23
2014-05-08 17:00:00     obj2    value_as    other_4
2010-07-01 20:00:00     obj1    value_as    other_0

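I am trying to plot the number of occurrences for every hour of each day of the week. That is, count the occurrences per day of the week and per hour, and plot them in subplots like the ones shown in the image below.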

enter image description here

If this question sounds confusing, please let me know. Thank you.

2 Answers:

Answer 0: (score: 3)

You can accomplish this with multiple groupby calls. Since we know there are 7 days in a week, we can specify that number of panels. If you groupby(df.Date.dt.dayofweek), you can use the group index as the index into your subplot axes:

Sample data:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

n = 10000
np.random.seed(123)
df = pd.DataFrame({'Date': pd.date_range('2010-01-01', freq='1.09min', periods=n),
                   'A': np.random.randint(1,10,n),
                   'B': np.random.normal(0,1,n)})

Plotting code:

fig, ax = plt.subplots(ncols=7, figsize=(30,5))
plt.subplots_adjust(wspace=0.05)  # Remove some whitespace between subplots

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    ax[idx].set_title(gp.Date.dt.day_name().iloc[0])  # Set title to the weekday

    # Count rows per hour, fill missing hours with 0, and draw the bars
    (gp.groupby(gp.Date.dt.hour).size().rename_axis('Tweet Hour').to_frame('')
        .reindex(np.arange(0,24,1)).fillna(0)
        .plot(kind='bar', ax=ax[idx], rot=0, ec='k', legend=False))

    # Ticks and labels on leftmost only
    if idx == 0:
        _ = ax[idx].set_ylabel('Counts', fontsize=11)

    _ = ax[idx].tick_params(axis='both', which='major', labelsize=7,
                            labelleft=(idx == 0), left=(idx == 0))

# Consistent bounds between subplots.
lb, ub = list(zip(*[axis.get_ylim() for axis in ax]))
for axis in ax:
    axis.set_ylim(min(lb), max(ub))
plt.show()

enter image description here

If you want the aspect ratio to be less extreme, consider plotting a 4x2 grid. Once we flatten the axes array, the code is very similar to the plot above. An integer division determines which axes need labels.

enter image description here
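
The 4x2 layout described in this answer can be sketched roughly as follows. This is only a minimal illustration, not the answer's original code: the sample data, the figure size, and the names nrows, ncols, and flat are assumptions, and sharey=True is swapped in for the manual set_ylim loop to keep the y-bounds consistent.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data: 10000 timestamps, 65 seconds apart (about 7.5 days)
n = 10000
df = pd.DataFrame({'Date': pd.date_range('2010-01-01', freq='65s', periods=n)})

nrows, ncols = 2, 4
# sharey=True keeps the y-axis bounds identical across panels
fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(20, 8), sharey=True)
flat = axes.flatten()  # 2D array of axes -> 1D, so flat[idx] works like ax[idx]

for idx, gp in df.groupby(df.Date.dt.dayofweek):
    # Rows per hour for this weekday; hours with no rows become 0
    counts = (gp.groupby(gp.Date.dt.hour).size()
                .reindex(np.arange(24), fill_value=0))
    flat[idx].bar(counts.index, counts.values, ec='k')
    flat[idx].set_title(gp.Date.dt.day_name().iloc[0])
    flat[idx].tick_params(axis='both', labelsize=7)

    if idx % ncols == 0:            # leftmost panel of each row
        flat[idx].set_ylabel('Counts')
    if idx // ncols == nrows - 1:   # bottom row (integer division gives the row)
        flat[idx].set_xlabel('Hour')

flat[-1].set_visible(False)  # the 8th panel is unused: only 7 weekdays
fig.tight_layout()
# Call plt.show() to display the figure interactively.
```

Here idx // ncols gives the row and idx % ncols gives the column of each panel, which is how the label placement is decided.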

Answer 1: (score: 2)

How about using seaborn? sns.FacetGrid was made for this:

import pandas as pd
import seaborn as sns

# make some data
date = pd.date_range('today', periods=100, freq='2.5h')  # lowercase 'h': the 'H' alias is deprecated in recent pandas

# put in dataframe
df = pd.DataFrame({
    'date' : date
})

# create day_of_week and hour columns
df['dow'] = df.date.dt.day_name()
df['hour'] = df.date.dt.hour

# create facet grid
g = sns.FacetGrid(data=df.groupby([
    'dow',
    'hour'
]).hour.count().to_frame(name='day_hour_count').reset_index(), col='dow', col_order=[
    'Sunday',
    'Monday',
    'Tuesday',
    'Wednesday',
    'Thursday',
    'Friday',
    'Saturday'
], col_wrap=3)

# map barplot to each subplot
g.map(sns.barplot, 'hour', 'day_hour_count');

barplots
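
One thing to watch with the groupby count above is that day/hour combinations with no rows are simply absent from the result. A minimal sketch of filling them in with zeros, assuming the same dow and hour columns (the fixed start date, day_order, and full_index are hypothetical names and values chosen for illustration):

```python
import pandas as pd

# Hypothetical sample data: 100 timestamps, 2.5 hours apart
df = pd.DataFrame({'date': pd.date_range('2019-02-24', periods=100, freq='150min')})
df['dow'] = df.date.dt.day_name()
df['hour'] = df.date.dt.hour

day_order = ['Sunday', 'Monday', 'Tuesday', 'Wednesday',
             'Thursday', 'Friday', 'Saturday']

# Every (weekday, hour) pair: 7 * 24 = 168 combinations
full_index = pd.MultiIndex.from_product([day_order, range(24)],
                                        names=['dow', 'hour'])

counts = (df.groupby(['dow', 'hour']).size()
            .reindex(full_index, fill_value=0)   # absent pairs get a count of 0
            .rename('day_hour_count')
            .reset_index())

print(counts.head())
```

The resulting counts frame could then be passed as the data argument of sns.FacetGrid in place of the inline groupby expression, so that every panel shows all 24 hours.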