How do I change the color of outliers in a seaborn scatter plot?

Time: 2019-07-11 07:39:47

Tags: python seaborn scatter-plot outliers

I want to identify the outliers by giving them a different color, so that the change in the scatter plot after removing them is easier to see.

# TotalBsmtSF: Total square feet of basement area

fig = plt.figure(figsize=(16, 8))

ax1 = fig.add_subplot(211)
b = sns.scatterplot(x = 'TotalBsmtSF', y = 'SalePrice', data = df, ax=ax1,)
plt.title ('Total square feet of basement area VS SalePrice (With Outliers)', fontsize=13)
plt.tight_layout()

# Remove houses with more than 3000 square feet of basement area and a sale price of at least 160000
df = df.drop(df[(df['TotalBsmtSF']>3000) & (df['SalePrice']>=160000)].index)
# print(df['TotalBsmtSF'].head(450))
ax2 = fig.add_subplot(212)
b = sns.scatterplot(x = 'TotalBsmtSF', y = 'SalePrice', data = df, ax=ax2,)
plt.title ('Total square feet of basement area VS SalePrice (Outliers Removed)', fontsize=13)
plt.tight_layout()

plt.close(2)
plt.close(3)
plt.tight_layout()

1 answer:

Answer 0 (score: 1)

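Seaborn lets you change the color of markers based on categorical or numeric data. So you can create a new column that records whether each data point is an outlier, then pass that column to the hue parameter of the seaborn call. These are the lines to add or change in your code: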

df['outlier'] = np.where((df['TotalBsmtSF']>3000) & (df['SalePrice']>=160000), 'yes', 'no')
b = sns.scatterplot(x = 'TotalBsmtSF', y = 'SalePrice', data = df, ax=ax1, hue="outlier")

I think this should work, but I can't confirm it because I don't have the data.
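
For anyone who wants to try the idea without the original dataset, here is a minimal sketch on made-up data. The synthetic values, the three hand-added outlier rows, the is_outlier variable name, and the palette colors are all assumptions for illustration; only the column names and the threshold condition come from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in for the question's DataFrame (values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "TotalBsmtSF": rng.uniform(0, 2500, size=200),
    "SalePrice": rng.uniform(50_000, 400_000, size=200),
})
# Add a few large-basement houses by hand so the condition flags something.
extra = pd.DataFrame({
    "TotalBsmtSF": [3100.0, 3200.0, 6000.0],
    "SalePrice": [250_000.0, 160_000.0, 170_000.0],
})
df = pd.concat([df, extra], ignore_index=True)

# Same condition as the drop() call in the question.
is_outlier = (df["TotalBsmtSF"] > 3000) & (df["SalePrice"] >= 160000)
df["outlier"] = np.where(is_outlier, "yes", "no")

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(16, 8))

# Top panel: every point, with flagged outliers in a different color.
sns.scatterplot(
    x="TotalBsmtSF", y="SalePrice", data=df,
    hue="outlier", hue_order=["no", "yes"],
    palette={"no": "tab:blue", "yes": "tab:red"},
    ax=ax1,
)
ax1.set_title("Total square feet of basement area VS SalePrice (With Outliers)", fontsize=13)

# Bottom panel: the same data with the flagged rows filtered out.
sns.scatterplot(x="TotalBsmtSF", y="SalePrice", data=df[~is_outlier], ax=ax2)
ax2.set_title("Total square feet of basement area VS SalePrice (Outliers Removed)", fontsize=13)

fig.tight_layout()
```

Filtering with df[~is_outlier] for the second panel, instead of dropping rows from df, keeps the full data around so both panels can be redrawn from the same DataFrame. Setting titles through ax1.set_title and ax2.set_title also avoids depending on which axes plt.title happens to treat as current.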