I have the following pandas DataFrame, df_res; a few of its rows are shown below.
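I want to plot a chart whose x-axis is the dates from June 2019 to October 2019. The y-axis is the number of emails a recipient received on a given date. So I want a stacked bar chart in which each stack is a recipient. For example, on "2019-08-30" there are 2 mails to "building308@list.com" and 1 mail to "notification@listcom", so that day would have two stacks. I would also like to be able to supply a list of the recipients whose email counts I am interested in. I wrote the code below to do this, but I am stuck on how to get the count for each recipient. If the counts were already a list or a column I could plot them, but building that by hand does not help me because the DataFrame is large, with thousands of rows.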
date        time      weekday    team   recipient
2019-08-30  14:49:22  Friday     team1  building308@list.com
2019-08-30  05:57:51  Friday     team1  notification@listcom
2019-08-29  22:54:58  Thursday   team1  robert.r@gmail.com
2019-08-29  22:54:58  Thursday   team1  emcor@list.com
2019-08-30  06:26:12  Friday     team1  building308@list.com
2019-09-05  14:16:22  Thursday   team1  pqr@xyz.com
2019-09-05  14:16:22  Thursday   team1  flash_hvac@list.com
2019-09-04  22:54:59  Wednesday  team1  robert.r@gmail.com
2019-09-04  22:54:59  Wednesday  team1  emcor@list.com
Updated to add code (2019/10/11):
# Plot the values
import datetime
import plotly.graph_objects as go

# Months must be plain integers from 1 to 12: a literal such as 06 is a
# syntax error in Python 3, and month=13 raises ValueError.
x = [datetime.datetime(year=2019, month=6, day=4),
     datetime.datetime(year=2019, month=11, day=5),
     datetime.datetime(year=2019, month=12, day=6)]  # originally month=13 (invalid); 12 is a placeholder
y = [2, 2, 5]
fig = go.Figure(data=[go.Bar(x=x, y=y)])
# Use datetime objects to set xaxis range
fig.update_layout(xaxis_range=[datetime.datetime(2019, 6, 17),
                               datetime.datetime(2019, 10, 7)])
fig.show()
Is this the right way to pass values to the y-axis?
Answer 0 (score: 2)
import pandas as pd

# Sample timestamps and recipients
date = ['2019-08-30 14:49:22',
        '2019-08-30 05:57:51',
        '2019-08-29 22:54:58',
        '2019-08-29 22:54:58',
        ]
rec = ['building308@list.com', 'building308@list.com', 'emcor@list.com', 'pqr@xyz.com']
data = pd.DataFrame({
    'date': date,
    'recipient': rec
})
# Extract the calendar date from the timestamp
data['date'] = pd.to_datetime(data['date'])
data = data.assign(date_extract=data['date'].dt.strftime('%Y-%m-%d'))
data
                 date             recipient date_extract
0 2019-08-30 14:49:22  building308@list.com   2019-08-30
1 2019-08-30 05:57:51  building308@list.com   2019-08-30
2 2019-08-29 22:54:58        emcor@list.com   2019-08-29
3 2019-08-29 22:54:58           pqr@xyz.com   2019-08-29
new_data = data.groupby(by=['date_extract','recipient']).size()
print(new_data)
date_extract  recipient
2019-08-29    emcor@list.com          1
              pqr@xyz.com             1
2019-08-30    building308@list.com    2
Since new_data is a Series, you can pass it directly to the x or y argument.
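For a stacked chart, each recipient needs its own bar trace, so one way to use the grouped counts is to unstack them into a date-by-recipient table first. The following is a minimal sketch under that assumption, not necessarily what the answer's author had in mind. The helper names `counts_table` and `stacked_bar` are made up for illustration, and the sample rows mirror the ones above.

```python
import pandas as pd


def counts_table(df):
    """Return a day x recipient table of email counts (0 where none)."""
    # Reduce each timestamp to its calendar day, as a string.
    day = pd.to_datetime(df["date"]).dt.strftime("%Y-%m-%d").rename("day")
    counts = df.groupby([day, "recipient"]).size()
    # Move the recipient level into columns: one column per recipient.
    return counts.unstack(fill_value=0)


def stacked_bar(table):
    """Build a stacked bar figure with one trace per recipient column."""
    # Imported here so the counting step works without plotly installed.
    import plotly.graph_objects as go

    fig = go.Figure(
        data=[go.Bar(name=rec, x=table.index, y=table[rec])
              for rec in table.columns]
    )
    fig.update_layout(barmode="stack")
    return fig


df = pd.DataFrame({
    "date": ["2019-08-30 14:49:22", "2019-08-30 05:57:51",
             "2019-08-29 22:54:58", "2019-08-29 22:54:58"],
    "recipient": ["building308@list.com", "building308@list.com",
                  "emcor@list.com", "pqr@xyz.com"],
})
table = counts_table(df)
# stacked_bar(table).show()  # uncomment to render the chart
```

Each column of `table` supplies the y values for one recipient, and `barmode="stack"` tells Plotly to pile the traces on top of each other for each date.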
# Recipient distribution
recipient_frequency = data['recipient'].value_counts()
recipient_frequency
Out[]:
building308@list.com    2
emcor@list.com          1
pqr@xyz.com             1
Name: recipient, dtype: int64
# Number of distinct recipients
len(recipient_frequency)
Out[]: 3
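The question also asks for counts restricted to a chosen list of recipients. One possible approach, sketched here with a hypothetical list named `wanted` and the same sample rows as above, is to filter with `isin` before grouping:

```python
import pandas as pd

data = pd.DataFrame({
    "date": ["2019-08-30 14:49:22", "2019-08-30 05:57:51",
             "2019-08-29 22:54:58", "2019-08-29 22:54:58"],
    "recipient": ["building308@list.com", "building308@list.com",
                  "emcor@list.com", "pqr@xyz.com"],
})

# Hypothetical list of recipients of interest (not from the original post).
wanted = ["building308@list.com", "pqr@xyz.com"]

# Keep only rows whose recipient is in the list, then count per day.
subset = data[data["recipient"].isin(wanted)]
day = pd.to_datetime(subset["date"]).dt.strftime("%Y-%m-%d").rename("day")
filtered_counts = subset.groupby([day, "recipient"]).size()
print(filtered_counts)
```

Filtering first keeps the grouped result small, which may matter when the DataFrame has thousands of rows and only a few recipients are of interest.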
If you have questions about Plotly plots, some tutorials are available here: https://github.com/SayaliSonawane/Plotly_Offline_Python