Trying to plot and regroup 2 dataframes

Date: 2019-03-23 16:28:30

Tags: python pandas

Here is my code:

var link = document.createElement('link');
link.setAttribute('rel', 'stylesheet');
link.type = 'text/css';
link.title = csstype;
link.href = cssSheetPath1;
document.head.appendChild(link);

var link1 = document.createElement('link');
link1.setAttribute('rel', 'stylesheet');
link1.type = 'text/css';
link1.title = csstype;
link1.href = cssSheetPath2;
document.head.appendChild(link1);

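In the end, I want to draw two line plots in the same figure.

I want to regroup by day the number of occurrences in two dataframes: df2, which holds all Tag values, and df3, which holds only the DISPLAY tag.

I want to plot 2 lines in the same figure: one for all occurrences per day, and the other only for occurrences where the tag is DISPLAY.

So far I have only managed to get this: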

[image: the plot obtained so far]

1 Answer:

Answer 0: (score: 1)

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data="""
;Barcode;Created;Hash;Modified;Tag;Tag2
0;9780735711020;2019-02-22T22:35:06.628Z;None;2019-02-22T22:35:06.628Z;NEED_PICS;
1;3178041328890;2019-02-22T22:37:44.546Z;None;2019-02-22T22:37:44.546Z;DISPLAY;
2;8718951129597;2019-02-23T04:53:17.925Z;None;2019-02-23T04:53:17.925Z;DISPLAY;
3;3770006053078;2019-02-23T05:25:56.454Z;None;2019-02-23T05:25:56.454Z;DISPLAY;
4;3468080404892;2019-02-23T05:26:39.923Z;None;2019-02-23T05:26:39.923Z;NEED_PICS;
5;3517360013757;2019-02-23T05:27:24.910Z;None;2019-02-23T05:27:24.910Z;DISPLAY;
6;3464660000768;2019-02-23T05:27:51.379Z;None;2019-02-23T05:27:51.379Z;DISPLAY;
7;30073357;2019-02-23T06:20:53.075Z;None;2019-02-23T06:20:53.075Z;NEED_PICS;
8;02992;2019-02-23T06:22:57.326Z;None;2019-02-23T06:22:57.326Z;NEED_PICS;
9;3605532558776;2019-02-23T06:23:45.010Z;None;2019-02-23T06:23:45.010Z;NEED_PICS;
10;3605532558776;2019-02-23T06:23:48.291Z;None;2019-02-23T06:23:48.291Z;NEED_PICS;
11;3605532558776;2019-02-23T06:23:52.579Z;None;2019-02-23T06:23:52.579Z;NEED_PICS;
"""
from io import StringIO
TESTDATA = StringIO(data)

# Parse the semicolon-separated sample data
df = pd.read_csv(TESTDATA, sep=";")
# Keep only the calendar date of each timestamp so rows group by day
df["Created"] = pd.to_datetime(df["Created"], errors='coerce').dt.date
df["Barcode"] = df["Barcode"].astype(str)

# custom date formatting (the dates are plotted on the y-axis below)
fig, ax = plt.subplots()
myFmt = mdates.DateFormatter('%Y-%m-%d')
ax.yaxis.set_major_formatter(myFmt)

# Rows per day: all tags first, then DISPLAY only
df1 = df.groupby(["Created"])["Tag"].count().reset_index()
df2 = df[df["Tag"] == "DISPLAY"].groupby(["Created"])["Tag"].count().reset_index()
# x = count, y = date, matching the y-axis date formatter set above
plt.plot(df1['Tag'], df1['Created'], label='ALL')
plt.plot(df2['Tag'], df2['Created'], label="DISPLAY")
plt.legend(loc='upper left')
plt.show()


Note that, since there is not much data, I cut off the time part of the timestamps:

df["Created"] = pd.to_datetime(df["Created"],errors='coerce').dt.date
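To make the effect of that line concrete, here is a minimal sketch on two timestamps copied from the sample data. The variable names are only for illustration, and `.dt.normalize()` is shown as one possible alternative rather than something the answer uses: it also drops the time of day, but keeps a datetime dtype instead of producing plain `datetime.date` objects.

```python
import pandas as pd

# Two timestamps copied from the sample data (illustrative subset)
created = pd.Series(["2019-02-22T22:35:06.628Z", "2019-02-23T04:53:17.925Z"])
parsed = pd.to_datetime(created, errors="coerce")

# .dt.date yields plain datetime.date objects (the column becomes object dtype)
as_date = parsed.dt.date

# .dt.normalize() resets the time to midnight but keeps a datetime dtype
as_midnight = parsed.dt.normalize()

print(as_date.tolist())
print(as_midnight.tolist())
```

Either form groups rows by day; which one is more convenient likely depends on whether later steps need datetime operations on the column.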

You can bucket the tags by date, by full datetime, or by date plus hour and minute, depending on what you need.
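One way to switch between those granularities is to floor the timestamps to a chosen frequency before grouping. The sketch below is an assumption-laden illustration, not the answer's own code: `counts_per_bucket` is a hypothetical helper, and the rows are a small subset of the sample data with only the `Created` and `Tag` columns.

```python
import pandas as pd

# A few rows from the sample data (only the columns needed here)
df = pd.DataFrame({
    "Created": [
        "2019-02-22T22:35:06.628Z",
        "2019-02-22T22:37:44.546Z",
        "2019-02-23T04:53:17.925Z",
        "2019-02-23T05:25:56.454Z",
        "2019-02-23T05:26:39.923Z",
    ],
    "Tag": ["NEED_PICS", "DISPLAY", "DISPLAY", "DISPLAY", "NEED_PICS"],
})
df["Created"] = pd.to_datetime(df["Created"], errors="coerce")


def counts_per_bucket(frame, freq):
    """Count all rows and DISPLAY rows per time bucket (hypothetical helper)."""
    # Round each timestamp down to the start of its bucket, e.g. "D" or "h"
    bucketed = frame.assign(bucket=frame["Created"].dt.floor(freq))
    all_counts = bucketed.groupby("bucket")["Tag"].count()
    display_only = bucketed[bucketed["Tag"] == "DISPLAY"]
    display_counts = display_only.groupby("bucket")["Tag"].count()
    # Align both series on the union of buckets; buckets without DISPLAY rows get 0
    out = pd.DataFrame({"ALL": all_counts, "DISPLAY": display_counts})
    return out.fillna(0).astype(int)


daily = counts_per_bucket(df, "D")   # one row per day
hourly = counts_per_bucket(df, "h")  # one row per hour
print(daily)
print(hourly)
```

Because the result has the bucket as its index and one column per series, calling `daily.plot()` (which needs matplotlib installed) should draw both lines on the same axes with time on the x-axis.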