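I'm trying to create a comparison plot of my dataset using Seaborn's PairGrid function. My dataset has 6 columns, and I'm trying to plot them with the scatter() function in the .map_upper section of the PairGrid that I apply to the whole DataFrame. Here is a quick rundown of my DataFrame object; the 'year' object is set as the index of the DataFrame.
Here are the dtypes of my DataFrame, comp_pct_chg_df: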
year             object
Amsterdam       float64
Barcelona       float64
Kingston        float64
Milan           float64
Philadelphia    float64
Global          float64
dtype: object
Here is the code that produces the error:
# Creating a comparison plot (.PairGrid()) of all my cities' and global data's average percent change in temperature
# Set up my figure by naming it 'pct_chg_yrly_fig', then call PairGrid on the DataFrame
pct_chg_yrly_fig = sns.PairGrid(comp_pct_chg_df.dropna())
# Using map_upper we can specify what the upper triangle will look like.
pct_chg_yrly_fig.map_upper(plt.scatter,color='purple')
# We can also define the lower triangle in the figure, including the plot type (KDE) or the color map (BluePurple)
pct_chg_yrly_fig.map_lower(sns.kdeplot,cmap='cool_d')
# Finally we'll define the diagonal as a series of histogram plots of the yearly average percent change in temperature
pct_chg_yrly_fig.map_diag(plt.hist,histtype='step',linewidth=3,bins=30)
# Adding a legend
pct_chg_yrly_fig.add_legend()
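Some of the visualizations do plot; for example, the .map_lower() call works great. However, I'd like each city to be drawn in a different color in the scatter plots produced by .map_upper(). Right now they are a single color, and it's hard to tell which data points belong to which city. Finally, my .map_diag() doesn't plot at all. I don't know what I'm doing wrong. I've studied the ValueError message I'm receiving (shown below) and tried manipulating dozens of **kwargs, labels, and colors, but to no avail. Any help would be greatly appreciated.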
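To make the coloring request concrete: in a wide table where every city is its own column, each point in an off-diagonal panel pairs two cities' values for the same year, so there is no single city to color a point by. Coloring by city generally needs long-form data with one categorical column. The sketch below shows that reshaping with pandas only; the sample values and the names `wide`, `long_df`, `city`, and `pct_chg` are hypothetical stand-ins, not taken from the real comp_pct_chg_df.

```python
import pandas as pd

# Hypothetical stand-in for comp_pct_chg_df (values are made up).
wide = pd.DataFrame({
    "year": ["1743", "1744"],
    "Amsterdam": [0.1, -0.2],
    "Milan": [0.3, 0.0],
})

# Reshape so each row is one (year, city) observation.
# The "city" column can then act as a grouping variable for color.
long_df = wide.melt(id_vars="year", var_name="city", value_name="pct_chg")

print(long_df)
```

With a frame shaped like `long_df`, plotting functions that accept a grouping column (for example a `hue` argument in seaborn) can assign one color per city.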
Here is the ValueError msg I'm receiving:
ValueError Traceback (most recent call last)
<ipython-input-38-3fcf1b69d4ef> in <module>()
11
12 # Finally we'll define the diagonal as a series of histogram plots of the yearly average percent change in temperature
---> 13 pct_chg_yrly_fig.map_diag(plt.hist,histtype='step',linewidth=3,bins=30)
14
15 # Adding a legend
~/anaconda3/lib/python3.6/site-packages/seaborn/axisgrid.py in map_diag(self, func, **kwargs)
1361
1362 if "histtype" in kwargs:
-> 1363 func(vals, color=color, **kwargs)
1364 else:
1365 func(vals, color=color, histtype="barstacked", **kwargs)
~/anaconda3/lib/python3.6/site-packages/matplotlib/pyplot.py in hist(x, bins, range, density, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, normed, hold, data, **kwargs)
3023 histtype=histtype, align=align, orientation=orientation,
3024 rwidth=rwidth, log=log, color=color, label=label,
-> 3025 stacked=stacked, normed=normed, data=data, **kwargs)
3026 finally:
3027 ax._hold = washold
~/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
1715 warnings.warn(msg % (label_namer, func.__name__),
1716 RuntimeWarning, stacklevel=2)
-> 1717 return func(ax, *args, **kwargs)
1718 pre_doc = inner.__doc__
1719 if pre_doc is None:
~/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in hist(***failed resolving arguments***)
6137 color = mcolors.to_rgba_array(color)
6138 if len(color) != nx:
-> 6139 raise ValueError("color kwarg must have one color per dataset")
6140
6141 # If bins are not specified either explicitly or via range,
ValueError: color kwarg must have one color per dataset
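For reference, the check at the bottom of the traceback (`len(color) != nx`) compares the number of colors with the number of datasets that hist receives. The minimal sketch below triggers the same kind of error directly in matplotlib, outside of seaborn; it only illustrates the check and is not a diagnosis of why the PairGrid call above hits it. The data and color names are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

message = None
try:
    # One dataset, but two colors: the counts do not match.
    ax.hist([1.0, 2.0, 3.0], color=["purple", "green"])
except ValueError as err:
    message = str(err)

print(message)
```

Passing a single color for a single dataset (for example `color="purple"`) satisfies the same check.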
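I also noticed that my index (the year object) is plotted in the top-left corner of my PairGrid. It looks like a series of vertical lines packed right next to each other. I'm not sure why it is being plotted, but could it be because the values (years 1743-2015) end in ".0"? I noticed this when I was putting the DataFrame together (I don't know how to remove it... Python newb here), so I changed the dtype of the year column from float64 to string and set it as the index. I thought that by doing this, even though the values are numbers, the dtype would be string and no calculations could be performed on them. Am I missing something here?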
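A few things can be checked here, sketched below under stated assumptions. The dtypes listing above includes `year`, and `DataFrame.dtypes` lists columns only, not the index, so one possibility (not confirmed by the post) is that `year` is still a regular column. That can happen when the result of `set_index` is not assigned back, since it returns a new frame by default. The sketch also shows one way to drop the trailing ".0" and to keep only the numeric columns. The sample values and the names `df`, `indexed`, and `numeric_only` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for comp_pct_chg_df (values are made up).
df = pd.DataFrame({
    "year": [1743.0, 1744.0, 1745.0],
    "Amsterdam": [0.1, -0.2, 0.3],
    "Global": [0.05, 0.0, 0.1],
})

# float -> int -> str removes the trailing ".0".
# Note: astype(int) raises if the column contains NaN.
df["year"] = df["year"].astype(int).astype(str)

# set_index returns a new DataFrame; df itself is unchanged
# unless the result is assigned back.
indexed = df.set_index("year")

print("year" in df.columns)       # still a column in df
print("year" in indexed.columns)  # moved to the index in indexed

# Keep only numeric columns, e.g. before handing the frame to a plotting grid.
numeric_only = df.select_dtypes(include="number")
print(list(numeric_only.columns))
```

Inspecting `df.index.name` and `df.columns` on the real frame would show which of the two situations applies.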