How to plot a time-series stacked bar chart with plotly in Python

Time: 2019-10-11 05:22:09

Tags: python python-3.x pandas plotly-python

I have the following pandas dataframe, df_res; a few rows of it are shown below.


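I want to plot a chart whose x-axis covers the dates from June 2019 to October 2019 and whose y-axis is the number of emails a recipient received on a given date. In other words, I want a stacked bar chart in which each stack segment is a recipient. For example, on 2019-08-30 there are 2 emails to building308@list.com and 1 email to notification@listcom, so that day would have two stacks. I would also like to be able to supply a list of the recipients whose email counts I am interested in. I wrote the code below to try this, but I am stuck on how to obtain the per-recipient counts. If the counts were already available as a list or a column I could plot them, but that does not help me here because the dataframe is large, with thousands of rows.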

   date       time      weekday      team    recipient
 2019-08-30 14:49:22    Friday      team1   building308@list.com
 2019-08-30 05:57:51    Friday      team1   notification@listcom
 2019-08-29 22:54:58    Thursday    team1   robert.r@gmail.com
 2019-08-29 22:54:58    Thursday    team1   emcor@list.com
 2019-08-30 06:26:12    Friday      team1   building308@list.com
 2019-09-05 14:16:22    Thursday    team1   pqr@xyz.com
 2019-09-05 14:16:22    Thursday    team1   flash_hvac@list.com
 2019-09-04 22:54:59    Wednesday   team1   robert.r@gmail.com
 2019-09-04 22:54:59    Wednesday   team1   emcor@list.com

Updated 2019/10/11 to add the code that plots the values:

import datetime

import plotly.graph_objects as go

# Placeholder dates and counts; month must be a plain integer in 1-12
# (a leading zero such as 06 is a syntax error in Python 3)
x = [datetime.datetime(year=2019, month=6, day=4),
     datetime.datetime(year=2019, month=11, day=5),
     datetime.datetime(year=2019, month=12, day=6)]
y = [2, 2, 5]

fig = go.Figure(data=[go.Bar(x=x, y=y)])
# Use datetime objects to set xaxis range
fig.update_layout(xaxis_range=[datetime.datetime(2019, 6, 17),
                               datetime.datetime(2019, 10, 7)])
fig.show()

Is this the right way to pass values to the y-axis?

1 Answer:

Answer 0: (score: 2)

Data preparation

import pandas as pd

# Timestamps and recipients for four sample emails
date = ['2019-08-30 14:49:22',
        '2019-08-30 05:57:51',
        '2019-08-29 22:54:58',
        '2019-08-29 22:54:58']
rec = ['building308@list.com', 'building308@list.com',
       'emcor@list.com', 'pqr@xyz.com']
data = pd.DataFrame({
    'date': date,
    'recipient': rec
})

# Extract the calendar date from the timestamp
data['date'] = pd.to_datetime(data['date'])
data = data.assign(date_extract=data['date'].dt.strftime('%Y-%m-%d'))

Sample dataframe:

data
               date             recipient date_extract
0 2019-08-30 14:49:22  building308@list.com   2019-08-30
1 2019-08-30 05:57:51  building308@list.com   2019-08-30
2 2019-08-29 22:54:58        emcor@list.com   2019-08-29
3 2019-08-29 22:54:58           pqr@xyz.com   2019-08-29

Email counts: number of emails per recipient per day

new_data = data.groupby(by=['date_extract','recipient']).size()
print(new_data)

date_extract  recipient           
2019-08-29    emcor@list.com          1
              pqr@xyz.com             1
2019-08-30    building308@list.com    2

Because new_data is a pandas Series of counts, you can pass it directly to the x or y argument of a plotly trace.

Total recipients

recipient_frequency = data['recipient'].value_counts()

# Recipient distribution
recipient_frequency
Out[]: 
building308@list.com    2
emcor@list.com          1
pqr@xyz.com             1
Name: recipient, dtype: int64

# Total recipient count
len(recipient_frequency)
Out[]: 3
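The question also asks for a way to restrict the counts to a chosen list of recipients. A possible approach, sketched here as an assumption rather than as part of the original answer, is to filter the rows with isin before grouping. The list wanted and the names subset and daily_counts are hypothetical.

```python
import pandas as pd

# Same sample rows as in the data preparation step
data = pd.DataFrame({
    "date": ["2019-08-30 14:49:22", "2019-08-30 05:57:51",
             "2019-08-29 22:54:58", "2019-08-29 22:54:58"],
    "recipient": ["building308@list.com", "building308@list.com",
                  "emcor@list.com", "pqr@xyz.com"],
})
data["date_extract"] = pd.to_datetime(data["date"]).dt.strftime("%Y-%m-%d")

# Hypothetical list of recipients of interest
wanted = ["building308@list.com", "pqr@xyz.com"]

# Keep only rows whose recipient is in the list, then count per day
subset = data[data["recipient"].isin(wanted)]
daily_counts = subset.groupby(["date_extract", "recipient"]).size()
print(daily_counts)
```

The filtered daily_counts has the same shape as new_data, so it can be plotted in the same way.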

If you have questions about plotly plots, some tutorials are available here: https://github.com/SayaliSonawane/Plotly_Offline_Python