I am trying to produce a plot like the stock data example in the Bokeh gallery: http://bokeh.pydata.org/en/latest/docs/gallery/stocks.html
My data looks like this:
2004-01-05,00:00:00,01:00:00,Mon,20504,792
2004-01-05,01:00:00,02:00:00,Mon,16553,783
2004-01-05,02:00:00,03:00:00,Mon,18944,790
2004-01-05,03:00:00,04:00:00,Mon,17534,750
2004-01-06,00:00:00,01:00:00,Tue,17262,747
2004-01-06,01:00:00,02:00:00,Tue,19072,777
2004-01-06,02:00:00,03:00:00,Tue,18275,785
I want to use column 2 (startTime) and column 5 (count). I want to group by the day column and sum the counts within each corresponding hour.
Code (it produces no output):
import numpy as np
import pandas as pd
#from bokeh.layouts import gridplot
from bokeh.plotting import figure, show, output_file
data = pd.read_csv('one_hour.csv')
data.column = ['date', 'startTime', 'endTime', 'day', 'count', 'unique']
p1 = figure(x_axis_type='startTime', y_axis_type='count', title="counts per hour")
p1.grid.grid_line_alpha=0.3
p1.xaxis.axis_label = 'startTime'
p1.yaxis.axis_label = 'count'
output_file("count.html", title="time_graph.py")
show(gridplot([[p1]], plot_width=400, plot_height=400)) # open a browser
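Reading the columns and plotting is not the problem; what I cannot work out is how to apply the group by and sum operations to the column data.
Any help is appreciated, thanks!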
Answer 0 (score: 1)
It sounds like this is what you need:
data.groupby('startTime')['count'].sum()
Output:
00:00:00 37766
01:00:00 35625
02:00:00 37219
03:00:00 17534
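For anyone who wants to reproduce this end to end, here is a minimal sketch using the seven sample rows from the question. It assumes the CSV has no header row, so the column names are passed to read_csv instead of being assigned afterwards. The names load and plot_hourly are hypothetical helpers made up for this example. It also shows a grouping by both day and startTime, in case the sums are wanted per day rather than across all days. The plotting part is only one possible way to draw the result with Bokeh (a categorical x axis rather than the datetime axis used in the stocks example).

```python
import io

import pandas as pd

# Sample rows copied from the question. Assumption: the real file
# ('one_hour.csv') has the same layout and no header row.
CSV_TEXT = """\
2004-01-05,00:00:00,01:00:00,Mon,20504,792
2004-01-05,01:00:00,02:00:00,Mon,16553,783
2004-01-05,02:00:00,03:00:00,Mon,18944,790
2004-01-05,03:00:00,04:00:00,Mon,17534,750
2004-01-06,00:00:00,01:00:00,Tue,17262,747
2004-01-06,01:00:00,02:00:00,Tue,19072,777
2004-01-06,02:00:00,03:00:00,Tue,18275,785
"""

COLUMNS = ["date", "startTime", "endTime", "day", "count", "unique"]


def load(source):
    """Read the CSV, naming the columns because the file has no header."""
    return pd.read_csv(source, header=None, names=COLUMNS)


data = load(io.StringIO(CSV_TEXT))  # or load("one_hour.csv")

# Sum of counts per start hour across all days (same as the answer above).
hourly = data.groupby("startTime")["count"].sum()

# Sum of counts per (day, start hour) pair, if the days should stay separate.
by_day_hour = data.groupby(["day", "startTime"])["count"].sum()


def plot_hourly(series, filename="count.html"):
    """Draw the hourly sums as a line over a categorical x axis."""
    # Imported here so the aggregation above runs even without Bokeh.
    from bokeh.plotting import figure, output_file, show

    hours = list(series.index)
    p = figure(x_range=hours, title="counts per hour")
    p.line(x=hours, y=series.tolist(), line_width=2)
    p.xaxis.axis_label = "startTime"
    p.yaxis.axis_label = "count"
    output_file(filename, title="counts per hour")
    show(p)  # opens a browser
```

Calling plot_hourly(hourly) should write count.html and open it. Note that the figure arguments x_axis_type and y_axis_type in the question expect axis types such as 'datetime' or 'log', not column names, which is why the column names are set as axis labels here instead.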