Plotting multiple subplots with the pandas visualization tools

Time: 2018-09-12 22:14:28

Tags: python pandas matplotlib

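I am working with the IGN reviews dataset from Kaggle, and I am trying to plot, for each Nintendo platform, how often releases fall on each day of the week. Here is the code: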

import pandas as pd
df = pd.read_csv("ign.csv")
datetime_df = pd.DataFrame({'year': df["release_year"],
                   'month': df["release_month"],
                   'day': df["release_day"]})
df["date"] = pd.to_datetime(datetime_df)

# Timestamp.weekday_name was removed in pandas 1.0; dt.day_name() replaces it
df["week_day"] = df["date"].dt.day_name()

nintendo = ['Wii','Nintendo DS','Nintendo 3DS','Nintendo DS',
            'Game Boy', 'Game Boy Color','Nintendo 64DD','Game Boy Advance',
            'New Nintendo 3DS','GameCube','Nintendo DSi','Super NES']

base_nintendo = df[df["platform"].isin(nintendo)]

data = base_nintendo.groupby(["platform","week_day"]).size()

data = data.unstack().fillna(0).stack()

data

Output:

platform          week_day 
Game Boy          Friday         5.0
                  Monday         5.0
                  Saturday       0.0
                  Sunday         0.0
                  Thursday       0.0
                  Tuesday        4.0
                  Wednesday      8.0
Game Boy Advance  Friday       131.0
                  Monday       109.0
                  Saturday       0.0
                  Sunday         1.0
                  Thursday     153.0
                  Tuesday      123.0
                  Wednesday    106.0
Game Boy Color    Friday        89.0
                  Monday        43.0
                  Saturday       1.0
                  Sunday         1.0
                  Thursday      55.0
                  Tuesday       78.0
                  Wednesday     89.0
GameCube          Friday        99.0
                  Monday       100.0
                  Saturday       3.0
                  Sunday         0.0
                  Thursday      83.0
                  Tuesday      124.0
                  Wednesday    100.0
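
The counting step above can be tried without the Kaggle file. The following is a minimal sketch, assuming a tiny made-up frame in place of `ign.csv` (its rows and the `week_order` list are illustrative, not taken from the dataset). It passes `fill_value=0` to `unstack` so the counts stay integers, and reorders the weekday columns into calendar order instead of the alphabetical order seen in the output.

```python
import pandas as pd

# Hypothetical stand-in for ign.csv: only the columns the question uses.
df = pd.DataFrame({
    "platform": ["Wii", "Wii", "GameCube", "Wii", "GameCube"],
    "release_year": [2012, 2012, 2003, 2012, 2003],
    "release_month": [9, 9, 5, 9, 5],
    "release_day": [10, 11, 6, 17, 9],
})

# to_datetime accepts a frame whose columns are named year/month/day.
parts = df[["release_year", "release_month", "release_day"]].rename(
    columns={"release_year": "year", "release_month": "month", "release_day": "day"}
)
df["date"] = pd.to_datetime(parts)
df["week_day"] = df["date"].dt.day_name()

# Count releases per (platform, weekday); absent pairs become integer 0.
counts = df.groupby(["platform", "week_day"]).size().unstack(fill_value=0)

# Put the weekday columns in calendar order rather than alphabetical order.
week_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
              "Friday", "Saturday", "Sunday"]
counts = counts.reindex(columns=week_order, fill_value=0)
print(counts)
```

The result is a wide table with one row per platform; calling `counts.stack()` would bring it back to the long `(platform, week_day)` shape used in the question.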

I tried:

data.groupby("platform").plot("barh")

But this only gives me the plot for the last platform (Wii).

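2 Answers:

Answer 0 (score: 1)

Did you notice that above the plot you get one line of output for each group (e.g. Super NES ...)? Those are the matplotlib.AxesSubplot objects of your other plots.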

groupby.plot actually returns one matplotlib.AxesSubplot object per group. The IPython notebook, on the other hand, only displays your last plot.
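
To see that return value directly, here is a small sketch, assuming a hand-made two-platform Series in place of the real counts and the non-interactive `Agg` backend so it runs outside a notebook. It only inspects what `groupby(...).plot` hands back.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
from matplotlib.axes import Axes
import pandas as pd

# Hypothetical counts shaped like the question's `data` Series.
index = pd.MultiIndex.from_product(
    [["Game Boy", "Wii"], ["Friday", "Monday"]],
    names=["platform", "week_day"],
)
data = pd.Series([5, 5, 99, 100], index=index)

# The result is a Series indexed by group key, holding one Axes per group.
my_axes = data.groupby(level="platform").plot(kind="barh")
print(my_axes)
```

Printing `my_axes` gives the same kind of per-group listing that shows up above the plot in the notebook.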

So the fix is to change your data.groupby("platform").plot("barh") to my_axes = data.groupby("platform").plot("barh") and then handle the results one by one, for example:

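for platform, ax in my_axes.items():
    # An Axes has no savefig(); save through the Figure that owns it
    ax.get_figure().savefig(f"{platform}.png")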
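
If the goal is one image file per platform, a variation on the loop above is to create a fresh figure for each group explicitly. This is only a sketch under assumptions: the data is a made-up two-platform Series, and the temporary output directory and file names are placeholders.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts shaped like the question's `data` Series.
index = pd.MultiIndex.from_product(
    [["Game Boy", "Wii"], ["Friday", "Monday"]],
    names=["platform", "week_day"],
)
data = pd.Series([5, 5, 99, 100], index=index)

out_dir = tempfile.mkdtemp()  # placeholder output location
saved = []
for platform, counts in data.groupby(level="platform"):
    fig, ax = plt.subplots()  # a new figure for every platform
    # Drop the constant platform level so the bars are labelled by weekday only.
    counts.droplevel("platform").plot(kind="barh", ax=ax, title=platform)
    path = os.path.join(out_dir, f"{platform}.png")
    fig.savefig(path)
    plt.close(fig)  # release the figure once it is written
    saved.append(path)
```

Passing `ax=ax` is what ties each group to its own figure; closing the figure afterwards keeps memory use flat when there are many platforms.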

Alternatively, you can do the following:

import matplotlib.pyplot as plt

gp = data.groupby("platform")
f, axes = plt.subplots(5, 5)  # or any other large enough subplot grid
for k, ax in zip(gp.groups, axes.ravel()):
    gp.get_group(k).plot(kind='barh', ax=ax, title=k)
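
A related option, sketched here under assumptions, is to let pandas build the grid itself: unstack the platforms into columns and call `DataFrame.plot` with `subplots=True` and a `layout`. The three-platform Series is made up (the Wii numbers in particular are invented), and the two-column layout is an arbitrary choice.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

# Hypothetical counts shaped like the question's `data` Series.
index = pd.MultiIndex.from_product(
    [["Game Boy", "GameCube", "Wii"], ["Friday", "Monday", "Tuesday"]],
    names=["platform", "week_day"],
)
data = pd.Series([5, 5, 4, 99, 100, 124, 10, 20, 30], index=index)

# One column per platform, one row per weekday.
wide = data.unstack(level="platform")

ncols = 2  # arbitrary grid width
nrows = math.ceil(wide.shape[1] / ncols)

# subplots=True draws each column on its own Axes inside one figure.
axes = wide.plot(
    kind="barh",
    subplots=True,
    layout=(nrows, ncols),
    sharex=False,
    legend=False,
    title=list(wide.columns),
    figsize=(8, 3 * nrows),
)
```

Computing `nrows` from the number of platforms avoids hard-coding a grid that is "large enough"; any grid cells beyond the last platform are left blank.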

Answer 1 (score: 1)

One solution is to use seaborn and draw a horizontal bar plot:

import matplotlib.pyplot as plt
import seaborn as sns

# Fill missing (platform, week_day) pairs with 0, then flatten to long format
data = data.unstack().fillna(0).stack()
data = data.reset_index().rename(columns={0: 'value'})

fig, ax = plt.subplots(figsize=(10, 7))
sns.barplot(y='platform', x='value', hue='week_day', data=data, orient='h', ax=ax)
plt.show()
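
The reshaping step is what makes the data usable by seaborn, which expects one row per observation. A minimal sketch of that step alone, assuming a made-up two-platform Series, uses `reset_index(name=...)` to name the count column in one call instead of renaming column `0` afterwards:

```python
import pandas as pd

# Hypothetical counts shaped like the question's `data` Series.
index = pd.MultiIndex.from_product(
    [["Game Boy", "Wii"], ["Friday", "Monday"]],
    names=["platform", "week_day"],
)
data = pd.Series([5, 5, 99, 100], index=index)

# Index levels become ordinary columns; the values land in a column named "value".
long_df = data.reset_index(name="value")
print(long_df)
```

The resulting `long_df` has the columns `platform`, `week_day` and `value`, which are the names the `sns.barplot` call above refers to.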