Grouped bar chart from different pandas DataFrames

Time: 2019-09-27 20:35:13

Tags: python pandas matplotlib pandas-groupby

I have two different DataFrames and want to plot them on the same figure as a grouped bar chart.

Both DataFrames are very small. They are:

my dataframes

I want to draw a chart like this example:

example of what i want

I tried this, to get a single plot:

fig, ax = plt.subplots()

df1.plot.bar(x='Zona',y='Total_MSP')
df4.plot.bar(x='Zona',y='NumEstCasasFavelas2017',ax=ax)

plt.show()



I also tried this:

fig, ax = plt.subplots()

df1.plot.bar(x='Zona',y='Total_MSP',ax=ax)
df4.plot.bar(x='Zona',y='NumEstCasasFavelas2017',ax=ax)

plt.show()

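The resulting figure shows the data of only one DataFrame, not both. Note that the labels of both DataFrames appear in the same figure, but the bars come from a single DataFrame only.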

my failures

2 answers:

Answer 0 (score: 0)

Data:

import pandas as pd
import matplotlib.pyplot as plt

df1 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'Total_MSP': [464245, 3764942, 1877505, 1023160, 3179477]})
df2 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'CasasFavelas_2017': [463, 4228, 851, 1802, 2060]}) 

Merge the DataFrames:

df = pd.merge(df1, df2, on='Zone')
df.set_index('Zone', inplace=True)
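
To see what the merge produces before plotting, here is a minimal sketch that reuses the sample frames above. The `validate='one_to_one'` argument is not part of the original answer; it is added here as an assumed safeguard so that a duplicated zone raises an error instead of silently multiplying rows.

```python
import pandas as pd

df1 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'Total_MSP': [464245, 3764942, 1877505, 1023160, 3179477]})
df2 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'CasasFavelas_2017': [463, 4228, 851, 1802, 2060]})

# Inner join on the shared key; validate raises MergeError if a zone repeats
df = pd.merge(df1, df2, on='Zone', validate='one_to_one').set_index('Zone')

# One row per zone and one column per series: plot.bar draws one bar group
# per row and one bar per column
print(df)
```

With `Zone` as the index, `df.plot.bar()` uses the zones as x tick labels without needing an `x=` argument.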


Plot:

  • A logarithmic scale is needed so that the much smaller Casas values are visible
df.plot.bar(logy=True)
plt.show()
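
If a log scale is not wanted, one possible alternative (not from the answer above) is to give each series its own linear y-axis and offset the bars by hand. The sketch below assumes the same sample data; the helper name `plot_twin_grouped`, the colors, and the bar width are illustrative choices only.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere; remove to open a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'Total_MSP': [464245, 3764942, 1877505, 1023160, 3179477]})
df2 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'CasasFavelas_2017': [463, 4228, 851, 1802, 2060]})
df = pd.merge(df1, df2, on='Zone').set_index('Zone')


def plot_twin_grouped(df, left_col, right_col, width=0.4):
    """Draw two columns as side-by-side bars, each on its own linear y-axis."""
    fig, ax_left = plt.subplots()
    ax_right = ax_left.twinx()      # second y-axis sharing the same x-axis
    x = np.arange(len(df))          # one slot per zone

    # Shift each series by half a bar width so the bars sit next to each other
    ax_left.bar(x - width / 2, df[left_col], width, color='tab:blue', label=left_col)
    ax_right.bar(x + width / 2, df[right_col], width, color='tab:orange', label=right_col)

    ax_left.set_xticks(x)
    ax_left.set_xticklabels(df.index)
    ax_left.set_ylabel(left_col)
    ax_right.set_ylabel(right_col)

    # Combine the legend entries of both axes into a single legend
    handles_l, labels_l = ax_left.get_legend_handles_labels()
    handles_r, labels_r = ax_right.get_legend_handles_labels()
    ax_left.legend(handles_l + handles_r, labels_l + labels_r, loc='upper left')
    return fig, ax_left, ax_right


fig, ax_left, ax_right = plot_twin_grouped(df, 'Total_MSP', 'CasasFavelas_2017')
```

Call `fig.savefig(...)` to write the result to a file, or drop the `matplotlib.use('Agg')` line and call `plt.show()` to display it. The trade-off compared with `logy=True` is that bar heights on the two axes are no longer comparable with each other, only within each series.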


Answer 1 (score: 0)

A chart built from four DataFrames, with custom colors and a legend

import pandas as pd
import matplotlib.pyplot as plt


df12 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'Total_MSP': [464245, 3764942, 1877505, 1023160, 3179477]})
df13 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'ValorMedioDollar': [1852.27, 1291.53, 1603.44, 2095.90, 1990.10]})
df14 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'IDH2010': [0.89, 0.70, 0.79, 0.90, 0.80]})
df15 = pd.DataFrame({'Zone': ['C', 'L', 'N', 'O', 'S'],
                    'QtdNovasCasas': [96,1387, 561, 281, 416]})


df16 = pd.merge(df12, df13, on='Zone')
df16 = pd.merge(df16, df14, on='Zone')
df16 = pd.merge(df16, df15, on='Zone')

fig, ax = plt.subplots(figsize=(50, 20))

#https://xkcd.com/color/rgb/
colors2 = ['#448ee4', '#a9f971','#ceb301','#ffb7ce']


# The log scale keeps every series visible even though their magnitudes differ widely.
df16.plot.bar(x='Zone', logy=True, color=colors2, ax=ax, width=0.5, align='center')


#legend
#https://stackoverflow.com/questions/19125722/adding-a-legend-to-pyplot-in-matplotlib-in-the-most-simple-manner-possible
plt.gca().legend(('Total Resident Population-2017', 
                  'Median Value of square meter-Dollars US', 
                  'HDI- Human Development Index-2010',
                  'Number of new housing properties-2018'),bbox_to_anchor=(0.87, 0.89) ,fontsize=28)


plt.title('Estimated Resident Population, Average value of square meter, HDI, New housing properties in São Paulo - Brazil',fontsize=40)
plt.xlabel('Names of the geographical subdivisions of São Paulo', fontsize=40)
plt.ylabel('Log Scale', fontsize=30)

# replace the zone codes on the x axis with descriptive names
ax = plt.gca()
names = ['Zone: Center', 'Zone: East', 'Zone: North', 'Zone: West', 'Zone: South']
ax.set_xticklabels(names,fontsize=40)
x = plt.gca().xaxis



# rcParams only affect axes created afterwards, so set the size on the existing axes
ax.tick_params(axis='y', labelsize=30)

# rotate the tick labels for the x axis
for item in x.get_ticklabels():
    item.set_rotation(0)    

for spine in plt.gca().spines.values():
    spine.set_visible(False)

# remove all the tick marks (both axes) but keep the tick labels;
# current matplotlib expects booleans here, not the strings 'on'/'off'
plt.tick_params(top=False, bottom=False, left=False, right=False, labelleft=True, labelbottom=True)



# label each bar directly with its y value
for p in ax.patches:
    ax.text(p.get_x() + p.get_width() / 2, p.get_height() + 0.01, str(float(p.get_height())),
            ha='center', va='baseline', rotation=0, color='black', fontsize=25)



# save first, so the file is written before the plt.show() window is opened and closed
fig.savefig('GraficoMultiplo.jpg')

plt.show()
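
The same approach can be written more compactly. The sketch below is one possible variant, not the answer's own code: it folds the pairwise merges into a single `functools.reduce` call and labels the bars with `Axes.bar_label`, which assumes matplotlib 3.4 or newer. The figure size, bar width, and font size are illustrative values.

```python
from functools import reduce

import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs anywhere; remove to open a window
import matplotlib.pyplot as plt
import pandas as pd

zones = ['C', 'L', 'N', 'O', 'S']
frames = [
    pd.DataFrame({'Zone': zones, 'Total_MSP': [464245, 3764942, 1877505, 1023160, 3179477]}),
    pd.DataFrame({'Zone': zones, 'ValorMedioDollar': [1852.27, 1291.53, 1603.44, 2095.90, 1990.10]}),
    pd.DataFrame({'Zone': zones, 'IDH2010': [0.89, 0.70, 0.79, 0.90, 0.80]}),
    pd.DataFrame({'Zone': zones, 'QtdNovasCasas': [96, 1387, 561, 281, 416]}),
]

# Fold the list into one frame by merging pairwise on the shared key
merged = reduce(lambda left, right: pd.merge(left, right, on='Zone'), frames)

fig, ax = plt.subplots(figsize=(16, 8))
merged.plot.bar(x='Zone', logy=True, ax=ax, width=0.8, rot=0)

# pandas adds one bar container per plotted column, in column order;
# bar_label annotates every bar in a container in one call
value_cols = [c for c in merged.columns if c != 'Zone']
labels = []
for container, col in zip(ax.containers, value_cols):
    texts = [str(v) for v in merged[col].tolist()]
    labels.extend(ax.bar_label(container, labels=texts, fontsize=8))
```

Taking the label text from the source columns rather than from the bar heights keeps integer counts printed as integers.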