How do I plot a histogram of time series data with pandas?

Asked: 2019-03-04 20:55:47

Tags: python pandas matplotlib

I have some time series data that I have been plotting with pygal. This is what the data looks like:

[(datetime.datetime(2019, 3, 3, 0, 20, 22, 195908, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 25, 807185, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 29, 566157, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 33, 57685, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 32, 3897, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 35, 739188, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 39, 592752, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 43, 242095, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 52, 37, 311601, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 11.0), (datetime.datetime(2019, 3, 3, 0, 52, 40, 976424, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 11.0)]

I can already draw a regular bar plot with pygal, but now I need to plot a histogram. I found that pandas can compute a histogram and draw it using matplotlib.

This is what I did:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import datetime
import psycopg2

data = [(datetime.datetime(2019, 3, 3, 0, 20, 22, 195908, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 25, 807185, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 29, 566157, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 20, 33, 57685, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 32, 3897, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 35, 739188, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 39, 592752, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 54, 43, 242095, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 12.0), (datetime.datetime(2019, 3, 3, 0, 52, 37, 311601, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 11.0), (datetime.datetime(2019, 3, 3, 0, 52, 40, 976424, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=330, name=None)), 11.0)]

df_hist = pd.DataFrame(np.array(data)).hist(bins=5)  # I divide the data into 5 buckets
plt.savefig('hist.svg')

But I get the following error:

Traceback (most recent call last):
  File "/home/souvik/Music/Test331.py", line 120, in <module>
    df_hist = pd.DataFrame(np.array(data)).hist(bins=5)
  File "/home/souvik/django_test/webdev/lib/python3.5/site-packages/pandas/plotting/_core.py", line 2408, in hist_frame
    layout=layout)
  File "/home/souvik/django_test/webdev/lib/python3.5/site-packages/pandas/plotting/_tools.py", line 238, in _subplots
    ax0 = fig.add_subplot(nrows, ncols, 1, **subplot_kw)
  File "/home/souvik/django_test/webdev/lib/python3.5/site-packages/matplotlib/figure.py", line 1367, in add_subplot
    a = subplot_class_factory(projection_class)(self, *args, **kwargs)
  File "/home/souvik/django_test/webdev/lib/python3.5/site-packages/matplotlib/axes/_subplots.py", line 60, in __init__
    ).format(maxn=rows*cols, num=num))
ValueError: num must be 1 <= num <= 0, not 1

However, if I keep only the numeric values and drop the datetime values from the data list, I do get a plot:

data = [x[1] for x in data]
df_hist = pd.DataFrame(np.array(data)).hist(bins=5)
plt.savefig('hist.svg')

[Image: histogram of the numeric values]

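Now I want the time range on the x-axis, so that I can see how frequently data occurs within each time interval. But when I include the original data, I get the error shown above.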

How can I get a histogram of time series data? Also, can I use pygal for this instead of matplotlib?
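
One way to get a frequency per time interval, sketched under assumptions, is to index the values by timestamp and count rows per fixed-width window with `resample`. The 10-minute window, the helper name `save_bar_chart`, and the `IST` stand-in for the psycopg2 timezone are all choices made for this illustration and are not taken from the question.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# Assumption: stand-in for psycopg2.tz.FixedOffsetTimezone(offset=330).
IST = timezone(timedelta(minutes=330))

# (minute, second, microsecond, value) for the ten rows in the question.
rows = [
    (20, 22, 195908, 12.0), (20, 25, 807185, 12.0),
    (20, 29, 566157, 12.0), (20, 33, 57685, 12.0),
    (54, 32, 3897, 12.0), (54, 35, 739188, 12.0),
    (54, 39, 592752, 12.0), (54, 43, 242095, 12.0),
    (52, 37, 311601, 11.0), (52, 40, 976424, 11.0),
]
data = [(datetime(2019, 3, 3, 0, m, s, us, tzinfo=IST), v)
        for m, s, us, v in rows]

# Index the values by timestamp; resample needs a sorted DatetimeIndex.
index = pd.to_datetime([ts for ts, _ in data])
series = pd.Series([v for _, v in data], index=index).sort_index()

# Number of readings in each 10-minute window (empty windows count as 0).
counts = series.resample("10min").count()
print(counts)


def save_bar_chart(counts, path):
    """Hypothetical helper: draw the per-window counts as bars."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    ax = counts.plot(kind="bar")
    ax.set_xlabel("window start")
    ax.set_ylabel("number of readings")
    plt.tight_layout()
    plt.savefig(path)
```

Calling `save_bar_chart(counts, 'hist.svg')` would write the chart to disk. Changing `"10min"` to another frequency string adjusts the bucket width, which plays the role that `bins=5` played in the original attempt.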

Note: The plot above comes from a larger dataset. I reduced the amount of data so that I could post the question here.
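
On the pygal part of the question: pygal has a `Histogram` chart whose series are `(height, start, end)` tuples, so one possible route is to compute the bins with `numpy.histogram` on the timestamps converted to seconds and hand the result to pygal. This is only a sketch: the helper names `to_pygal_bars` and `render_with_pygal`, the series label, and the `IST` stand-in for the psycopg2 timezone are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

import numpy as np

# Assumption: stand-in for psycopg2.tz.FixedOffsetTimezone(offset=330).
IST = timezone(timedelta(minutes=330))

# (minute, second, microsecond) of the ten timestamps in the question.
stamps = [
    (20, 22, 195908), (20, 25, 807185), (20, 29, 566157), (20, 33, 57685),
    (54, 32, 3897), (54, 35, 739188), (54, 39, 592752), (54, 43, 242095),
    (52, 37, 311601), (52, 40, 976424),
]
timestamps = [datetime(2019, 3, 3, 0, m, s, us, tzinfo=IST)
              for m, s, us in stamps]


def to_pygal_bars(timestamps, bins=5):
    """Return (count, start, end) tuples; start/end are POSIX seconds."""
    seconds = np.array([ts.timestamp() for ts in timestamps])
    counts, edges = np.histogram(seconds, bins=bins)
    return [(int(c), float(lo), float(hi))
            for c, lo, hi in zip(counts, edges[:-1], edges[1:])]


def render_with_pygal(bars, path):
    """Hypothetical helper: draw the bars with pygal's Histogram chart."""
    import pygal  # imported lazily so the binning works without pygal

    chart = pygal.Histogram()
    chart.add("readings", bars)
    chart.render_to_file(path)


bars = to_pygal_bars(timestamps, bins=5)
print(bars)
```

Calling `render_with_pygal(bars, 'hist.svg')` would write the chart. The x-axis values are seconds since the epoch in this sketch, so they would still need to be formatted as times to be readable.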

0 answers:

No answers