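I am making a stacked bar chart over a one-year time span, where the x-axis is the company name, the y-axis is the number of calls, and the stacks are months.
I would like to be able to run this chart over a one-month span, where the stacks are days, and over a one-week span, where the stacks are also days. I'm having trouble doing this because my code is already built around a one-year span.
My input is a dataframe that looks like this: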
pivot_table.head(3)
Out[12]:
Month         1   2   3   4   5   6   7   8   9  10  11  12
CompanyName
Customer1    17  30  29  39  15  26  24  12  36  21  18  15
Customer2     4  11  13  22  35  29  15  18  29  31  17  14
Customer3    11   8  25  24   7  15  20   0  21  12  12  17
Here is my code so far.
First I grab a year's worth of data (this is the part I would change to a month or a week):
import datetime
import pandas as pd

# parse the received timestamps as datetimes
df['recvd_dttm'] = pd.to_datetime(df['recvd_dttm'])
# only retrieve data before now (ignore typos that are future dates)
mask = df['recvd_dttm'] <= datetime.datetime.now()
df = df.loc[mask]
# get first and last datetime for the final year of data
range_max = df['recvd_dttm'].max()
range_min = range_max - pd.DateOffset(years=1)
# take slice with the final year of data
df = df[(df['recvd_dttm'] >= range_min) &
        (df['recvd_dttm'] <= range_max)]
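A minimal sketch of how the window length might be made a parameter instead of being fixed at one year. The names `SPAN_OFFSETS` and `slice_recent` are hypothetical and only for illustration; the sketch assumes `recvd_dttm` has already been parsed to datetimes as above.

```python
import pandas as pd

# Hypothetical lookup from a span name to the offset subtracted
# from the most recent timestamp in the data.
SPAN_OFFSETS = {
    "year": pd.DateOffset(years=1),
    "month": pd.DateOffset(months=1),
    "week": pd.DateOffset(weeks=1),
}

def slice_recent(df, span, column="recvd_dttm"):
    """Return the rows whose timestamp falls within the last `span` of data."""
    range_max = df[column].max()
    range_min = range_max - SPAN_OFFSETS[span]
    return df[(df[column] >= range_min) & (df[column] <= range_max)]
```

With something like this, `slice_recent(df, "week")` would stand in for the hard-coded `pd.DateOffset(years=1)` slice, leaving the rest of the filtering unchanged.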
Then I create the pivot_table shown above.
###########################################################
#Create Dataframe
###########################################################
df = df.set_index('recvd_dttm')
df.index = pd.to_datetime(df.index, format='%m/%d/%Y %H:%M')
result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg(len).reset_index()
result.columns = ['Month', 'CompanyName', 'NumberCalls']
pivot_table = result.pivot(index='Month', columns='CompanyName', values='NumberCalls').fillna(0)
# keep only the 30 companies with the most calls
s = pivot_table.sum().sort_values(ascending=False)
pivot_table = pivot_table.loc[:, s.index[:30]]
pivot_table = pivot_table.transpose()
pivot_table = pivot_table.reset_index()
pivot_table['CompanyName'] = [str(x) for x in pivot_table['CompanyName']]
Companies = list(pivot_table['CompanyName'])
pivot_table = pivot_table.set_index('CompanyName')
pivot_table.to_csv('pivot_table.csv')
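The grouping step above is tied to months through `idx.month`. One possible way to generalize it, sketched here under assumptions, is to group on a pandas period whose frequency is passed in (`'D'` for days, `'M'` for months). The function name `calls_pivot` is hypothetical, and the sketch expects `recvd_dttm` to still be a regular datetime column rather than the index.

```python
import pandas as pd

def calls_pivot(df, freq, column="recvd_dttm"):
    """Count calls per company per period.

    freq is a pandas period frequency: 'D' for days, 'M' for months.
    Returns a frame indexed by CompanyName with one column per period.
    """
    # Convert each timestamp to the period (day or month) it belongs to.
    periods = df[column].dt.to_period(freq).rename("Period")
    # Count rows for every (company, period) pair.
    counts = df.groupby(["CompanyName", periods]).size()
    # Spread periods across columns; missing pairs become 0 calls.
    return counts.unstack("Period", fill_value=0)
```

Because a period keeps its year and month, days from different months (or months from different years) stay in separate columns, which grouping on the bare month number does not guarantee.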
Then I use the pivot table to create an OrderedDict for plotting:
###########################################################
#Create OrderedDict for plotting
###########################################################
months = [pivot_table[(m)].astype(float).values for m in range(1, 13)]
names = ["Jan", "Feb", "Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov", "Dec"]
months_dict = OrderedDict(list(zip(names, months)))
###########################################################
#Plot!
###########################################################
palette = brewer["RdYlGn"][8]
hover = HoverTool(
    tooltips=[
        ("Month", "@months"),
        ("Number of Calls", "@NumberCalls"),
    ]
)
output_file("stacked_bar.html")
bar = Bar(months_dict, Companies, title="Number of Calls Each Month", palette = palette, legend = "top_right", width = 1200, height=900, stacked=True)
bar.add_tools(hover)
show(bar)
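The OrderedDict step hard-codes `range(1, 13)` and twelve month names, which is what ties the plot to a one-year span. A rough sketch of a span-independent version follows; it assumes the pivot table's columns are pandas `Period` objects (an assumption, since the code above uses plain month numbers), and the name `stacks_from_pivot` is hypothetical.

```python
from collections import OrderedDict
import pandas as pd

def stacks_from_pivot(pivot_table, label_format):
    """Map one label per period column to that column's values.

    Assumes pivot_table.columns holds pandas Period objects.
    label_format is a strftime pattern, e.g. "%b" for month
    names or "%b %d" for individual days.
    """
    return OrderedDict(
        (period.strftime(label_format), pivot_table[period].astype(float).values)
        for period in pivot_table.columns
    )
```

The resulting dict has as many stacks as there are columns in the pivot table, so it could be passed to `Bar` in place of `months_dict` whether the columns are months or days.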
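Does anyone have ideas on how to modify this code so that it works for shorter time spans? This is what the chart looks like for one year.
Edit: added the full code. The input looks like this example: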
CompanyName recvd_dttm
Company1 6/5/2015 18:28:50 PM
Company2 6/5/2015 14:25:43 PM
Company3 9/10/2015 21:45:12 PM
Company4 6/5/2015 14:30:43 PM
Company5 6/5/2015 14:32:33 PM