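I have a two-column dataset: one column is dates, running from 1990 to 2015, and the other is electricity output. I want to plot the power output for certain years, the daily mean electricity output over the whole dataset, and ±1 standard error, all on a single line plot. As it stands, however, I get one plot of the mean and then one plot per year (i.e. 8 separate plots). I'm trying to do this in pandas, but help with either NumPy or pandas would be great.
My dataset and current code are below: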
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Dummy data since the actual file is very large.
dates = pd.date_range(start='01-01-1990',end='12-31-2015',freq='D')
# one random value per day; randint's upper bound is exclusive
vols = np.random.randint(0,60001,len(dates))
yrs = [1990,1994,1998,2002,2006,2010,2014]
mons = ['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
# build the frame from named columns so that each column keeps its dtype
data = pd.DataFrame({'date': dates, 'elec': vols})
data['yr'] = pd.DatetimeIndex(data['date']).year
data['mon'] = pd.DatetimeIndex(data['date']).month
data['day'] = pd.DatetimeIndex(data['date']).day
# daily mean and standard deviation across all years
mean = data.groupby(['mon','day']).mean().reset_index()
std = data.groupby(['mon','day']).std().reset_index()
# keep only the years of interest and group them by year
power = data[data['yr'].isin(yrs)]
pwryear = power.groupby('yr')
fig, ax1 = plt.subplots(figsize = (14,5))
ax1.plot(mean.elec,'k-',lw=4)
# this is the call that draws one separate plot per year instead of adding lines to ax1
pwryear.plot()
ax1.set_title('Merrimack River floods at Lowell, MA')
ax1.set_ylabel('Discharge [m$^{3}$ s$^{-1}$]')
ax1.set_xticks([0,30,59,90,120,151,181,212,243,273,304,334])
ax1.set_xticklabels(mons)
ax1.set_xlim(0,365)
ax1.grid()
ax1.legend(loc=2)
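Answer 0 (score: 0)
I'm fairly sure pandas plotting is just a wrapper around Matplotlib. If that is the case, you can pass the series as alternating lists of x-coordinates and y-coordinates to render multiple series on the same plot.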
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot
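A minimal sketch of the alternating x, y form of `plot`, assuming two made-up series of 365 daily values (`year_a` and `year_b` are illustrative names, not from the question):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two years of daily values sharing one day-of-year axis
x = np.arange(365)
rng = np.random.default_rng(0)
year_a = rng.integers(0, 60001, size=x.size)
year_b = rng.integers(0, 60001, size=x.size)

fig, ax = plt.subplots(figsize=(14, 5))
# Alternating x, y arguments: each (x, y) pair becomes one line on the same axes
lines = ax.plot(x, year_a, x, year_b)
lines[0].set_label("year A")
lines[1].set_label("year B")
ax.legend(loc=2)
```

Calling `ax.plot` once per series on the same axes object should give the same result and is often easier to read when the number of series varies.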
Answer 1 (score: 0)
Try replacing
pwryear = power.groupby('yr')
with:
# create a column that includes month and day, but not year
# this will be the x-axis
data['mon_day'] = data.date.apply(lambda x: x.strftime('%m-%d'))
# re-select the years of interest so that power also has the new column
power = data[data['yr'].isin(yrs)]
# transform your dataframe to have one column per year
pwryear = power[['mon_day', 'elec', 'yr']].set_index(['mon_day', 'yr']).unstack(1)
pwryear.columns = pwryear.columns.droplevel(0)
Now, when you run
pwryear.plot()
you will get one line per year.
To get the shading for the std, you will need to change
mean = data.groupby(['mon','day']).mean().reset_index()
std = data.groupby(['mon','day']).std().reset_index()
to
mean = data.groupby('mon_day').mean()
std = data.groupby('mon_day').std()
Then shade the band between mean - std and mean + std on the same axes as the yearly lines, for example with Matplotlib's fill_between.
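A minimal sketch of how the pieces might fit together on a single axes object. It assumes Matplotlib's `fill_between` for the shaded band and uses the question's dummy data; the `stats` and `x` names are illustrative and not from the original post:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Dummy data shaped like the question's: one value per day, 1990 to 2015
dates = pd.date_range(start="1990-01-01", end="2015-12-31", freq="D")
rng = np.random.default_rng(0)
data = pd.DataFrame({"date": dates, "elec": rng.integers(0, 60001, size=len(dates))})
data["yr"] = data["date"].dt.year
data["mon_day"] = data["date"].dt.strftime("%m-%d")  # x-axis key without the year

# One column per selected year, indexed by month-day
yrs = [1990, 1994, 1998, 2002, 2006, 2010, 2014]
power = data[data["yr"].isin(yrs)]
pwryear = power[["mon_day", "elec", "yr"]].set_index(["mon_day", "yr"]).unstack(1)
pwryear.columns = pwryear.columns.droplevel(0)

# Daily mean and standard deviation over the whole dataset
stats = data.groupby("mon_day")["elec"].agg(["mean", "std"])

fig, ax1 = plt.subplots(figsize=(14, 5))
x = np.arange(len(stats))  # positional x so every series shares the same axis
ax1.fill_between(x,
                 (stats["mean"] - stats["std"]).to_numpy(),
                 (stats["mean"] + stats["std"]).to_numpy(),
                 color="0.8", label="mean ± 1 std")
ax1.plot(x, stats["mean"].to_numpy(), "k-", lw=4, label="daily mean")
for yr in pwryear.columns:
    # align each year to the full month-day index; Feb 29 is NaN in non-leap years
    ax1.plot(x, pwryear[yr].reindex(stats.index).to_numpy(), lw=1, label=str(yr))
ax1.legend(loc=2, ncol=3)
```

Every call here draws on `ax1`, which is what keeps the result to one figure. If `pwryear.plot()` is preferred for the yearly lines, passing `ax=ax1` to it should have the same effect.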
Hope that helps!