Pandas scatter plot by category and point size

Time: 2018-01-30 02:41:40

Tags: python pandas matplotlib scatter

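So I had the idea of using a single Pandas plot to show two different sets of data, one on the Y axis and the other as the point size, but I want them grouped by category, i.e. the X axis is not numeric but a set of categories. I will start by presenting my two example dataframes: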

earnings:
       DayOfWeek  Hotel  Bar  Pool
    0     Sunday     41   32    15
    1     Monday     45   38    24
    2    Tuesday     42   32    27
    3  Wednesday     45   37    23
    4   Thursday     47   34    26
    5     Friday     43   30    19
    6   Saturday     48   30    28

tips:
   DayOfWeek  Hotel  Bar  Pool
0     Sunday      7    8     6
1     Monday      9    7     5
2    Tuesday      5    4     1
3  Wednesday      8    6     7
4   Thursday      4    5    10
5     Friday      3    1     1
6   Saturday     10    2     6
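
For anyone who wants to reproduce this, here is a minimal sketch that builds the two frames by hand. The values are copied from the tables above; the variable names `earnings` and `tips` simply mirror the table labels and are otherwise arbitrary.

```python
import pandas as pd

days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

# Earnings per location, one row per day (values from the first table)
earnings = pd.DataFrame({
    "DayOfWeek": days,
    "Hotel": [41, 45, 42, 45, 47, 43, 48],
    "Bar": [32, 38, 32, 37, 34, 30, 30],
    "Pool": [15, 24, 27, 23, 26, 19, 28],
})

# Tips per location, same layout (values from the second table)
tips = pd.DataFrame({
    "DayOfWeek": days,
    "Hotel": [7, 9, 5, 8, 4, 3, 10],
    "Bar": [8, 7, 4, 6, 5, 1, 2],
    "Pool": [6, 5, 1, 7, 10, 1, 6],
})

print(earnings.shape, tips.shape)
```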

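Earnings is the total earned at the hotel, the bar, and the pool, while tips is the average tip value at the same locations. I will post my code as an answer; feel free to improve or update it.

Cheers!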

See also: Customizing Plot Legends

2 answers:

Answer 0 (score: 2)

This is the kind of plot that fits the grammar of graphics well.

from io import StringIO  # needed for the in-memory tables below

import pandas as pd
from plotnine import *

# Create data
s1 = StringIO("""
       DayOfWeek  Hotel  Bar  Pool
    0     Sunday     41   32    15
    1     Monday     45   38    24
    2    Tuesday     42   32    27
    3  Wednesday     45   37    23
    4   Thursday     47   34    26
    5     Friday     43   30    19
    6   Saturday     48   30    28

""")

s2 = StringIO("""
   DayOfWeek  Hotel  Bar  Pool
0     Sunday      7    8     6
1     Monday      9    7     5
2    Tuesday      5    4     1
3  Wednesday      8    6     7
4   Thursday      4    5    10
5     Friday      3    1     1
6   Saturday     10    2     6
""")

# Read data
# Raw strings avoid the invalid-escape warning for "\s"
earnings = pd.read_csv(s1, sep=r"\s+")
tips = pd.read_csv(s2, sep=r"\s+")

# Make tidy data
kwargs = dict(value_vars=['Hotel', 'Bar', 'Pool'], id_vars=['DayOfWeek'], var_name='location')
days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
earnings = pd.melt(earnings, value_name='earnings', **kwargs)
tips = pd.melt(tips,  value_name='tip', **kwargs)
df = pd.merge(earnings, tips, on=['DayOfWeek', 'location'])
df['DayOfWeek'] = pd.Categorical(df['DayOfWeek'], categories=days, ordered=True)

# Create plot
p = (ggplot(df)
     + geom_point(aes('DayOfWeek', 'earnings', color='location', size='tip'))
    )
print(p)

Result Plot
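
To make the "tidy data" step above more concrete, here is a small sketch of what `pd.melt` followed by `pd.merge` produces. It uses only a two-day, two-location subset of the question's data so the result stays short; the names `long_earnings`, `long_tips`, and `tidy` are just illustrative.

```python
import pandas as pd

# Two-day, two-location subset of the question's data
earnings = pd.DataFrame({"DayOfWeek": ["Sunday", "Monday"],
                         "Hotel": [41, 45], "Bar": [32, 38]})
tips = pd.DataFrame({"DayOfWeek": ["Sunday", "Monday"],
                     "Hotel": [7, 9], "Bar": [8, 7]})

# Wide -> long: one row per (day, location) pair
kwargs = dict(id_vars=["DayOfWeek"], value_vars=["Hotel", "Bar"],
              var_name="location")
long_earnings = pd.melt(earnings, value_name="earnings", **kwargs)
long_tips = pd.melt(tips, value_name="tip", **kwargs)

# Join the two long tables so each row carries both measurements
tidy = pd.merge(long_earnings, long_tips, on=["DayOfWeek", "location"])
print(tidy)
```

Each row of `tidy` now holds one day, one location, and both values, which is the shape that lets `aes()` map `earnings` to the Y axis and `tip` to the point size.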

Answer 1 (score: 1)

Here is the code:


Resulting figure
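
As a minimal matplotlib sketch of the kind of plot described in the question (not necessarily the code this answer used): the days are placed at integer positions and labelled afterwards, and tips drive the marker area. The `scale` factor and the `Agg` backend are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
earnings = pd.DataFrame({"Hotel": [41, 45, 42, 45, 47, 43, 48],
                         "Bar": [32, 38, 32, 37, 34, 30, 30],
                         "Pool": [15, 24, 27, 23, 26, 19, 28]})
tips = pd.DataFrame({"Hotel": [7, 9, 5, 8, 4, 3, 10],
                     "Bar": [8, 7, 4, 6, 5, 1, 2],
                     "Pool": [6, 5, 1, 7, 10, 1, 6]})

fig, ax = plt.subplots()
x = list(range(len(days)))  # one integer position per day
scale = 30                  # arbitrary factor so small tips stay visible

# One scatter call per location: Y is earnings, marker area follows tips
for location in ["Hotel", "Bar", "Pool"]:
    ax.scatter(x, earnings[location], s=tips[location] * scale,
               label=location, alpha=0.7)

# Replace the numeric ticks with the category names
ax.set_xticks(x)
ax.set_xticklabels(days, rotation=45)
ax.set_xlabel("DayOfWeek")
ax.set_ylabel("Earnings")
ax.legend(title="Location")
fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result
```

Note that `s` in `scatter` is the marker area in points squared, so doubling a tip doubles the area rather than the diameter.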