Heatmap based on Bokeh

Time: 2015-07-01 21:41:18

Tags: python pandas plot heatmap bokeh

I am trying to make a heatmap like this one from Bokeh:

Heat Table from Bokeh

All of the code is here: http://bokeh.pydata.org/en/latest/docs/gallery/unemployment.html

I am very close, but for some reason it only draws the values along a diagonal.

my heat table

I tried to format my data the same way and simply substitute it in, but it is a bit more complicated than that. Here is my data:

from collections import OrderedDict

import numpy as np
import pandas as pd
from bokeh.plotting import ColumnDataSource, figure, show, output_file
from bokeh.models import HoverTool

import pandas.util.testing as tm; tm.N = 3


df = pd.read_csv('MYDATA.csv', usecols=[1, 16]) 
df = df.set_index('recvd_dttm')
df.index = pd.to_datetime(df.index, format='%m/%d/%Y %H:%M')

result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg(len).reset_index()
result.columns = ['Month', 'CompanyName', 'NumberCalls']
pivot_table = result.pivot(index='Month', columns='CompanyName', values='NumberCalls').fillna(0)
s = pivot_table.sum().sort(ascending=False,inplace=False)
pivot_table = pivot_table.ix[:,s.index[:46]]
pivot_table = pivot_table.transpose()
pivot_table.to_csv('pivot_table.csv')


pivot_table = pivot_table.reset_index()
pivot_table['CompanyName'] = [str(x) for x in pivot_table['CompanyName']]
Companies = list(pivot_table['CompanyName'])
months = ["1","2","3","4","5","6","7","8","9","10","11","12"]
pivot_table = pivot_table.set_index('CompanyName')




# this is the colormap from the original plot
colors = [
    "#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d"
]

# Set up the data for plotting. We will need to have values for every
# pair of company/month names. Map the rate to a color.
month = []
company = []
color = []
rate = []
for y in pivot_table.index:
    for m in pivot_table.columns:
        month.append(m)
        company.append(y)
        num_calls = pivot_table.loc[y,m]
        rate.append(num_calls)
        color.append(colors[min(int(num_calls)-2, 8)])

source = ColumnDataSource(
    data=dict(months=months, Companies=Companies, color=color, rate=rate)
)

output_file('heatmap.html')

TOOLS = "resize,hover,save,pan,box_zoom,wheel_zoom"

p = figure(title="Customer Calls This Year",
    x_range=Companies, y_range=list(reversed(months)),
    x_axis_location="above", plot_width=1400, plot_height=900,
    toolbar_location="left", tools=TOOLS)

p.rect("Companies", "months", 1, 1, source=source,
    color="color", line_color=None)

p.grid.grid_line_color = None
p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_text_font_size = "10pt"
p.axis.major_label_standoff = 0
p.xaxis.major_label_orientation = np.pi/3

hover = p.select(dict(type=HoverTool))
hover.tooltips = OrderedDict([
    ('Company Name', '@Companies'),
    ('Number of Calls', '@rate'),
])

show(p)      # show the plot

2 answers:

Answer 0 (score: 1)

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# just following your previous post to simulate your data
np.random.seed(0)
dates = np.random.choice(pd.date_range('2015-01-01 00:00:00', '2015-06-30 00:00:00', freq='1h'), 10000)
company = np.random.choice(['company' + x for x in '1 2 3 4 5'.split()], 10000)
df = pd.DataFrame(dict(recvd_dttm=dates, CompanyName=company)).set_index('recvd_dttm').sort_index()
df['C'] = 1
df.columns = ['CompanyName', '']
result = df.groupby([lambda idx: idx.month, 'CompanyName']).agg({df.columns[1]: sum}).reset_index()
result.columns = ['Month', 'CompanyName', 'counts']
pivot_table = result.pivot(index='CompanyName', columns='Month', values='counts')


x_labels = ['Month'+str(x) for x in pivot_table.columns.values]
y_labels = pivot_table.index.values

fig, ax = plt.subplots()
x = ax.imshow(pivot_table, cmap=plt.cm.winter)
plt.colorbar(mappable=x, ax=ax)
ax.set_xticks(np.arange(len(x_labels)))
ax.set_yticks(np.arange(len(y_labels)))
ax.set_xticklabels(x_labels)
ax.set_yticklabels(y_labels)
ax.set_xlabel('Month')
ax.set_ylabel('Company')
ax.set_title('Customer Calls This Year')
plt.show()  # display the figure when run as a script


Answer 1 (score: 0)

The answer is in this line:

source = ColumnDataSource(
    data=dict(months=months, Companies=Companies, color=color, rate=rate)
)

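It should pass the per-cell lists built in the loop (month and company), which hold one entry per rectangle. The p.rect call and the hover tooltip must then refer to the new column names, "company" and "month", instead of "Companies" and "months":

source = ColumnDataSource(
    data=dict(month=month, company=company, color=color, rate=rate)
)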
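
The diagonal pattern is most likely what you get when the data source holds the axis label lists (12 months, one entry per company) rather than one entry per cell: the glyph then pairs the first company with the first month, the second with the second, and so on. Below is a minimal sketch of the per-cell layout, using a plain nested dict as a hypothetical stand-in for the pivot table so that it runs without pandas or Bokeh. The names `pivot`, `color_for` and `flatten` are made up for illustration. The `- 2` offset is copied from the question; the lower clamp at 0 is an addition, since a count of 0 or 1 would otherwise give a negative index and wrap around to the dark end of the palette. Converting the month to a string assumes the pivot table stores months as integers while the axis range uses strings.

```python
# Hypothetical stand-in for the pivot table: company -> {month: number of calls}.
pivot = {
    "company1": {1: 0, 2: 5, 3: 12},
    "company2": {1: 3, 2: 0, 3: 7},
}

# Colormap from the question.
colors = [
    "#75968f", "#a5bab7", "#c9d9d3", "#e2e2e2", "#dfccce",
    "#ddb7b1", "#cc7878", "#933b41", "#550b1d",
]


def color_for(num_calls):
    """Map a call count to a palette color, clamping the index at both ends."""
    index = max(0, min(int(num_calls) - 2, len(colors) - 1))
    return colors[index]


def flatten(pivot):
    """Return equal-length columns with one entry per (company, month) cell."""
    data = {"month": [], "company": [], "rate": [], "color": []}
    for company_name, by_month in pivot.items():
        for month_number, num_calls in by_month.items():
            data["month"].append(str(month_number))  # match string axis labels
            data["company"].append(company_name)
            data["rate"].append(num_calls)
            data["color"].append(color_for(num_calls))
    return data


data = flatten(pivot)
```

Every list in `data` has the same length (companies times months), so it can be handed to `ColumnDataSource(data=data)`, with `p.rect("company", "month", 1, 1, ...)` and `'@company'` in the tooltip referring to those column names.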