seaborn regplot removes the color of the data points

Time: 2017-08-30 08:22:54

Tags: python matplotlib regression seaborn

I am analyzing the Iris dataset and made a scatter plot of petal width against petal length. To make the plot, I used this code:

# First, we'll import pandas, a data processing and CSV file I/O library
import pandas as pd
# The current version of seaborn generates a bunch of warnings that we'll ignore
import warnings
warnings.filterwarnings("ignore")
# We'll also import seaborn, a Python graphing library
import seaborn as sns
import matplotlib.pyplot as plt
import numpy
sns.set(style="dark", color_codes=True)

# Next, we'll load the Iris flower dataset from Iris.csv
iris = pd.read_csv("Iris.csv") # the iris dataset is now a Pandas DataFrame

# Let's see what's in the iris data - Jupyter notebooks print the result of the last thing you do
print(iris.head(10))

# Press shift+enter to execute this cell
sns.FacetGrid(iris, hue="Species", size=10) \
   .map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
   .add_legend()


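After that I plotted a regression line, but once the line is drawn the colors are no longer clearly visible. I tried changing the color of the regression line, but that did not help. How can I plot the regression line without losing the different colors for the species?

The code that produces the plot with the regression line is: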

sns.FacetGrid(iris, hue="Species", size=10) \
   .map(plt.scatter, "PetalLengthCm", "PetalWidthCm") \
   .add_legend()
sns.regplot(x="PetalLengthCm", y="PetalWidthCm", data=iris)

petal_length_array = iris["PetalLengthCm"]
petal_width_array = iris["PetalWidthCm"]

r_petal = numpy.corrcoef(petal_length_array, petal_width_array) # compute the correlation

print("Correlation is : " + str(r_petal[0][1]))


2 answers:

Answer 0 (score: 2)

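Your problem is that sns.regplot() draws all of the points again in a single color, on top of the points that have different colors.

To avoid this, try calling regplot(..., scatter=False) to prevent the individual data points from being drawn. Check the documentation for regplot.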
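As a minimal sketch of that suggestion (not the answerer's own code), the example below draws the colored points once and then adds only the fitted line. The small `toy` DataFrame is made-up stand-in data that borrows the column names from the question, and using `sns.scatterplot` for the colored points is an assumption; the FacetGrid call from the question could be used instead.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Toy stand-in for Iris.csv: made-up values, only the column names follow the question.
rng = np.random.default_rng(0)
frames = []
for species, center in [("setosa", 1.5), ("versicolor", 4.3), ("virginica", 5.5)]:
    length = rng.normal(center, 0.3, size=30)
    width = 0.4 * length - 0.3 + rng.normal(0, 0.1, size=30)
    frames.append(pd.DataFrame({"PetalLengthCm": length,
                                "PetalWidthCm": width,
                                "Species": species}))
toy = pd.concat(frames, ignore_index=True)

fig, ax = plt.subplots()

# Colored points, one color per species.
sns.scatterplot(data=toy, x="PetalLengthCm", y="PetalWidthCm",
                hue="Species", ax=ax)

# Regression line only: scatter=False keeps regplot from redrawing
# every point in a single color on top of the colored ones.
sns.regplot(data=toy, x="PetalLengthCm", y="PetalWidthCm",
            scatter=False, color="black", ax=ax)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. Both calls are given the same `ax`, so the line is drawn over the existing colored points rather than on a new figure.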

Answer 1 (score: 0)

If you are happy with multiple regression lines, you can split the data and overplot...

import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset("iris")

fig, ax = plt.subplots()
colors = ['darkorange', 'royalblue', '#555555']
markers = ['.', '+', 'x']

# Fit and draw one regression per species on the same axes
for i, value in enumerate(iris.species.unique()):
    ax = sns.regplot(x="petal_length", y="petal_width", ax=ax,
                     color=colors[i],
                     marker=markers[i],
                     data=iris[iris.species == value],
                     label=value)

ax.legend(loc='best')
display(fig)  # IPython/Jupyter only; use plt.show() in a plain script
plt.close('all')

Iris plot with separate regressions for species
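A possible shortcut for the same idea, offered as a sketch and not as part of the answer above: `sns.lmplot` accepts a `hue` argument and fits one regression per level, coloring each group's points and line alike. The `toy` DataFrame is again made-up stand-in data with the question's column names.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Toy stand-in for Iris.csv: made-up values, only the column names follow the question.
rng = np.random.default_rng(1)
frames = []
for species, center in [("setosa", 1.5), ("versicolor", 4.3), ("virginica", 5.5)]:
    length = rng.normal(center, 0.3, size=30)
    width = 0.4 * length - 0.3 + rng.normal(0, 0.1, size=30)
    frames.append(pd.DataFrame({"PetalLengthCm": length,
                                "PetalWidthCm": width,
                                "Species": species}))
toy = pd.concat(frames, ignore_index=True)

# One call: points and regression line per species, each in its own color.
g = sns.lmplot(data=toy, x="PetalLengthCm", y="PetalWidthCm", hue="Species")

# lmplot returns a FacetGrid; with no row/col facets it holds a single Axes.
ax = g.axes.flat[0]
```

Unlike the loop above, this does not let you pick a separate axes object up front, because `lmplot` creates its own figure; the explicit loop remains the better fit when the plot has to go onto an existing `ax`.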