Question

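Background

I am reading Introduction to Machine Learning with Python and tried the visualization from In [45] in Chapter 2. First, I fitted three LogisticRegression classifiers to the Wisconsin breast cancer dataset, each with a different value of the parameter C. Then, for each classifier, I plotted the coefficient magnitude of every feature.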

%matplotlib inline
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from matplotlib import pyplot as plt

cancer = load_breast_cancer()

for C, marker in [(0.01, 'o'), (1., '^'), (100., 'v')]:
    logreg = LogisticRegression(C=C).fit(cancer.data, cancer.target)
    plt.plot(logreg.coef_[0], marker, label=f"C={C}")
plt.xticks(range(cancer.data.shape[1]), cancer.feature_names, rotation=90)
plt.hlines(0, 0, cancer.data.shape[1])
plt.legend()

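In this case I would prefer a bar plot to markers: one group of bars per feature, with one bar per value of C.

I achieved this with the following workflow.

Step 1: Create a `DataFrame` that holds the coefficient magnitudes as rows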

%matplotlib inline
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
import pandas as pd

cancer = load_breast_cancer()

df = pd.DataFrame(columns=cancer.feature_names)
for C in [0.01, 1., 100.]:
    logreg = LogisticRegression(C=C).fit(cancer.data, cancer.target)
    df.loc[f"C={C}"] = logreg.coef_[0]

df

Step 2: Convert the `DataFrame` into a table that `seaborn.barplot` can use

import itertools

# DataFrame.append was removed in pandas 2.0, so collect the rows in a list first
rows = []
for C, feature in itertools.product(df.index, df.columns):
    magnitude = df.at[C, feature]
    rows.append({'C': C, 'Feature': feature, 'Coefficient magnitude': magnitude})
df_bar = pd.DataFrame(rows, columns=['C', 'Feature', 'Coefficient magnitude'])

df_bar.head()

Step 3: Plot with `seaborn.barplot`

from matplotlib import pyplot as plt
import seaborn as sns

plt.figure(figsize=(12,8))
sns.barplot(x='Feature', y='Coefficient magnitude', hue='C', data=df_bar)
plt.xticks(rotation=90)

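This produced the graph I wanted.

Question

I find Step 2 tedious. Can I make the bar plot directly from the df built in Step 1, or build df_bar with a one-liner? Or is there a more elegant workflow for getting the bar plot?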

Answer 1

Pandas draws grouped bar plots with one group per row and one bar per column. So it should be possible to do

df = df.transpose()
df.plot(kind="bar")

without using seaborn.
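To see the grouping without fitting any models, here is a minimal sketch on a toy table. The coefficient values are made up, and the `Agg` backend is only an assumption so that it also runs without a display; in a notebook you can drop those two lines.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed in a notebook
import pandas as pd

# Toy stand-in for the Step 1 table: one row per C, one column per feature.
# The numbers are invented for illustration only.
df = pd.DataFrame(
    [[0.1, -0.2], [0.5, -0.9], [1.5, -2.0]],
    index=["C=0.01", "C=1.0", "C=100.0"],
    columns=["mean radius", "mean texture"],
)

# After transposing, the features form the row index (one bar group each)
# and the C values form the columns (one bar inside each group).
ax = df.transpose().plot(kind="bar", figsize=(6, 4))
ax.set_ylabel("Coefficient magnitude")
```

With the real table from Step 1 the same two calls should give 30 groups of three bars, so a wider `figsize` is probably worth setting.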

If you need seaborn for any reason, Step 2 of the question can be simplified with pandas.melt.

df_bar = df.reset_index().melt(id_vars=["index"])
sns.barplot(x="variable", y="value", hue="index", data=df_bar)
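If you would rather keep the column names from the question than the default `index`, `variable` and `value`, something like the following should work. It is a sketch on a toy table with made-up numbers; `rename_axis` and the `var_name`/`value_name` arguments are my additions, not part of the snippet above.

```python
import pandas as pd

# Toy stand-in for the Step 1 table: one row per C, one column per feature.
# The numbers are invented for illustration only.
df = pd.DataFrame(
    [[0.1, -0.2], [0.5, -0.9], [1.5, -2.0]],
    index=["C=0.01", "C=1.0", "C=100.0"],
    columns=["mean radius", "mean texture"],
)

# Name the index so that reset_index() yields a column called "C" rather than
# "index", then melt into one row per (C, feature) pair.
df_bar = (
    df.rename_axis("C")
      .reset_index()
      .melt(id_vars="C", var_name="Feature", value_name="Coefficient magnitude")
)
print(df_bar)
```

The resulting df_bar can then be passed to the `sns.barplot(x='Feature', y='Coefficient magnitude', hue='C', data=df_bar)` call from Step 3 unchanged.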
