I have a pandas DataFrame that is read in from a .csv file with the following structure:
Date, Latitude, Longitude, Brand, Pump, AKI, Trip Miles, Total Miles, Gallons, MPG, PPG, Total, Tires, MPG-D,
11/03/2013, 40° 1.729', -105° 15.516', Boulder Gas, 2, 87, 134.3, 134.3, 6.563, 20.46, 3.319, 21.78, Stock, ,
11/17/2013, 40° 1.729', -105° 15.516', Boulder Gas, 2, 87, 161.8, 296.0, 7.467, 21.67, 3.279, 24.48, Stock, ,
11/27/2013, 40° 0.872', -105° 12.775', Buffalo Gas, 6, 87, 180.8, 477.0, 8.096, 22.33, 3.359, 27.19, Stock, ,
12/07/2013, 40° 1.729', -105° 15.516', Boulder Gas, 6, 87, 265.1, 742.0, 12.073, 21.96, 3.179, 38.38, Stock, ,
12/11/2013, 40° 2.170', -105° 15.522', Circle K, 4, 87, 240.9, 983.0, 9.868, 24.41, 3.179, 31.37, Stock, ,
12/15/2013, 40° 8.995', -105° 7.876', Shell, 3, 87, 188.7, 1172, 8.596, 21.95, 3.059, 26.30, , ,
12/21/2013, 40° 1.770', -105° 15.481', Conoco, 3, 87, 113.8, 1286, 5.517, 20.62, 3.139, 17.32, Winter, ,
01/09/2014, 40° 1.729', -105° 15.516', Boulder Gas, 2, 87, 139.5, 1426, 7.181, 19.42, 3.279, 23.55, Winter, 21.3,
01/13/2013, 40° 1.770', -105° 15.481', Conoco, 7, 87, 260.8, 1688, 11.177, 23.33, 3.239, 36.20, Winter, 25.5,
01/18/2014, 40° 1.729', -105° 15.516', Boulder Gas, 2, 87, 102.0, 1790, 4.401, 23.18, 3.239, 14.26, Winter, 25.5,
02/02/2014, 39° 59.132', -105° 14.962', King Soopers, 5, 87, 175.3, 1965, 8.436, 20.78, 3.019, 25.47, Winter, 24.0,
02/03/2014, 40° 1.770', -105° 15.481', Conoco, 3, 87, 249.9, 2215, 10.452, 23.91, 3.219, 33.64, Winter, 25.2,
02/08/2014, 40° 2.170', -105° 15.522', Circle K, 7, 87, 186.4, 2402, 8.565, 21.76, 3.239, 27.74, Winter, 24.3,
02/13/2014, 40° 1.729', -105° 15.516', Boulder Gas, 8, 87, 79.6, 2481, 4.125, 19.30, 3.439, 14.19, Winter, 21.3,
03/06/2014, 40.014460, -105.225034, Conoco, 5, 87, 172.4, 2654, 8.618, 20.00, 3.779, 32.57, Winter, 21.9,
03/09/2014, 40.029498, -105.258117, Conoco, 6, 87, 230.4, 2884, 9.016, 25.55, 3.759, 33.89, Winter, 27.3,
03/17/2014, 40.036236, -105.258763, Conoco, 6, 87, 130.1, 3014, 5.368, 24.24, 3.719, 19.96, Winter, 25.8,
03/24/2014, 40.036236, -105.258763, Conoco, 1, 87, 282.3, 3297, 11.540, 24.46, 3.719, 42.92, Winter, 27.3,
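I'd like to generate a plot where the x-axis is the date, the left y-axis is miles/gallon, and the right y-axis is miles. In it, I want to show the 'MPG' column as a time series in one color, the 'MPG-D' column as a time series in another color, and the 'Trip Miles' column as a bar chart in a third color.
I've been trying to follow http://pandas.pydata.org/pandas-docs/stable/visualization.html and am using the code below, but it produces one bar chart and two time-series plots with everything on the same axis, and the y-labels are not shown.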
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('mpg.csv', skipinitialspace=True,index_col='Date')
plt.figure()
ax = data['Trip Miles'].plot(kind='bar',secondary_y=['Trip Miles'])
ax.right_ax.set_ylabel('Miles')
ax.set_ylabel('Miles/Gallon')
data['MPG'].plot()
data['MPG-D'].plot()
Answer 0 (score: 8)
You need to be more explicit about which axes you are plotting on. Try something like this:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
fig, tsax = plt.subplots()
barax = tsax.twinx()
data = pd.read_csv('mpg.csv', skipinitialspace=True,index_col='Date')
data['Trip Miles'].plot(kind='bar', ax=barax)
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')
data['MPG'].plot(ax=tsax)
data['MPG-D'].plot(ax=tsax)
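A big problem here is that pandas bar plots and line plots format the x-axis in completely different ways. Specifically, a bar plot uses a categorical scale, with one tick and one label per bar. But here you seem to be interested in formatting that looks more like a typical time series.
So I'd suggest forgetting about the twin-axis chart. Instead, just plot on two completely separate axes, like this: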
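To make the mismatch concrete, here is a minimal sketch that inspects where pandas places the bars. The three-row series is a hypothetical stand-in built from the first rows of the question's data, and the `Agg` backend is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the fuel log: three fill-ups with real dates
idx = pd.to_datetime(["2013-11-03", "2013-11-17", "2013-11-27"])
miles = pd.Series([134.3, 161.8, 180.8], index=idx)

fig, ax = plt.subplots()
miles.plot(kind="bar", ax=ax)

# Left edge plus half the width gives the center of each bar on the x-axis
bar_centers = [p.get_x() + p.get_width() / 2 for p in ax.patches]
print(bar_centers)
print(list(ax.get_xticks()))
```

The bars should sit at the positions 0, 1, 2 rather than at date values, which is why a date-based line drawn on the same x-axis does not line up with them.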
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as mgrid
import pandas as pd
fig = plt.figure(figsize=(12,5))
grid = mgrid.GridSpec(nrows=2, ncols=1, height_ratios=[2, 1])
barax = fig.add_subplot(grid[0])
tsax = fig.add_subplot(grid[1])
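# pd.DatetimeIndex(start=..., freq=...) was removed in pandas 1.0; pd.date_range builds the same kind of index
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))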
data['A'] **= 2
data['A'].plot(ax=barax, style='o--')
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')
barax.xaxis.tick_top()
data['B'].plot(ax=tsax)
data['C'].plot(ax=tsax)
fig.tight_layout()
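This gives me a figure with the 'Miles' panel stacked above the 'Miles/Gallon' panel (image not preserved).
However, if you really need a bar chart, or really want everything on the same twinned axes, then you'll have to plot with matplotlib's API directly: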
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as mgrid
import pandas as pd
fig, tsax = plt.subplots(figsize=(12,5))
barax = tsax.twinx()
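# pd.DatetimeIndex(start=..., freq=...) was removed in pandas 1.0; pd.date_range builds the same kind of index
data = pd.DataFrame(np.random.randn(10,3), columns=list('ABC'), index=pd.date_range(start='2012-01-01', periods=10, freq='MS'))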
data['A'] **= 2
# the `width` is specified in days -- adjust for your data
barax.bar(data.index, data['A'], width=5, facecolor='indianred')
barax.set_ylabel('Miles')
tsax.set_ylabel('Miles/Gallon')
barax.xaxis.tick_top()
tsax.plot(data.index, data['B'])
tsax.plot(data.index, data['C'])
# call tight_layout last so it accounts for everything that was drawn
fig.tight_layout()
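That then gives me a single figure in which the bars and both lines share one date axis (image not preserved).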
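Applied to the columns from the question, the same approach might look like the sketch below. It is not the asker's exact file: the inline CSV is a hypothetical four-row excerpt with only the relevant columns, the `Agg` backend and the `alpha` and z-order tweaks are my own assumptions, and the bar `width` of 5 days would need tuning for real data.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical excerpt of mpg.csv, trimmed to the columns being plotted
csv_text = """Date, Trip Miles, MPG, MPG-D
12/21/2013, 113.8, 20.62,
01/09/2014, 139.5, 19.42, 21.3
01/18/2014, 102.0, 23.18, 25.5
02/02/2014, 175.3, 20.78, 24.0
"""
data = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True, index_col="Date")
# Convert the string index to real dates so matplotlib uses a date axis
data.index = pd.to_datetime(data.index, format="%m/%d/%Y")

fig, tsax = plt.subplots(figsize=(12, 5))
barax = tsax.twinx()  # right-hand y-axis sharing the same x-axis

# `width` is in days; missing MPG-D values are simply left out of the line
barax.bar(data.index, data["Trip Miles"], width=5, facecolor="indianred", alpha=0.5)
tsax.plot(data.index, data["MPG"], label="MPG")
tsax.plot(data.index, data["MPG-D"], label="MPG-D")

# Draw the lines above the bars instead of behind them
tsax.set_zorder(barax.get_zorder() + 1)
tsax.patch.set_visible(False)

tsax.set_ylabel("Miles/Gallon")
barax.set_ylabel("Miles")
tsax.legend(loc="upper left")
fig.autofmt_xdate()
fig.tight_layout()
```

With the full file, the `io.StringIO(csv_text)` argument would be replaced by the path `'mpg.csv'` from the question, and `fig.savefig(...)` or `plt.show()` can be added to view the result.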