Question

I have a dataset that has been filtered down to the following (sample data):

Name Time l
1 1.129 1G-d
1 0.113 1G-a
1 3.374 1B-b
1 3.367 1B-c
1 3.374 1B-d
2 3.355 1B-e
2 3.361 1B-a
3 1.129 1G-a

I got this data after filtering a data frame and converting it to a CSV file:

# Assigns the new data frame to "df" with the data from only three columns
header = ['Names','Time','l']
df = pd.DataFrame(df_2, columns = header)

# Sorts the data frame by column "Names" as integers
df.Names = df.Names.astype(int)
df = df.sort_values(by=['Names'])

# Changes the data to match format after converting it to int
df.Time=df.Time.astype(int)
df.Time = df.Time/1000

csv_file = df.to_csv(index=False, columns=header, sep=" " )

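Now I am trying to plot one line, with markers, for each distinct value in the label column. I want column l to supply the line names (labels), with each distinct value drawn as its own line, Time as the Y-axis values, and Names as the X-axis values. So in this case the figure would have 7 different lines with the following labels: 1G-d, 1G-a, 1B-b, 1B-c, 1B-d, 1B-e, 1B-a.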

So far I have the following additional settings, but I am not sure how to draw the lines.

plt.xlim(0, 60)
plt.ylim(0, 18)
plt.legend(loc='best')
plt.show()

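I tried sns.lineplot with hue, but it puts a title on the legend box, which I do not want. Also, in that case I cannot get markers without adding a new column for style.

I also tried plt.plot, but there I am not sure how to add more lines. I can only pass one set of x and y values, which creates a single line.

If there is another resource that covers this, please let me know below.

Thanks

The final plot I want looks like the following, but with markers: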

Answer 1

You can do this with a few tweaks to seaborn's lineplot. Since your sample is not long enough to demonstrate with, I generate some random data:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Create data
np.random.seed(2019)
categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = pd.DataFrame({'Name':np.repeat(range(1,11), 10),
              'Time':np.random.randn(100).cumsum(),
              'l':np.random.choice(categories, 100)
        })

# Plot
sns.lineplot(data=df, x='Name', y='Time', hue='l', style='l', dashes=False,
             markers=True, ci=None, err_style=None)

# Temporarily removing limits based on sample data
#plt.xlim(0, 60)
#plt.ylim(0, 18)

# Remove seaborn legend title & set new title (if desired)
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=handles[1:], labels=labels[1:], title='New Title', loc='best')

plt.show()

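To apply markers, you have to specify a style variable. This can be the same column as hue.
You will likely want to turn off the dashes, the confidence interval (ci) and the error style (err_style), as done in the call above.
To remove the original legend title, get the handles and labels, then re-add the legend without the first handle and label. You can also specify the location here and set a new title if desired (or simply drop title=... to have no title). If your seaborn version already shows the variable name as a real legend title rather than as the first legend entry, keep all handles and labels instead of slicing off the first one.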
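Since the question also mentions plt.plot, here is a minimal sketch of the same idea in plain matplotlib, as an alternative to the seaborn call above rather than a replacement for it. It assumes the column names Names, Time and l from the question, and the helper name plot_lines_per_label is made up for illustration. Grouping by l and calling plot once per group yields one marked line per label, and the legend has no title unless you add one.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Names": [1, 1, 1, 1, 1, 2, 2, 3],
    "Time": [1.129, 0.113, 3.374, 3.367, 3.374, 3.355, 3.361, 1.129],
    "l": ["1G-d", "1G-a", "1B-b", "1B-c", "1B-d", "1B-e", "1B-a", "1G-a"],
})


def plot_lines_per_label(data, ax=None):
    """Draw one marked line per distinct value of column 'l' (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots()
    # sort=False keeps the labels in order of first appearance
    for label, group in data.groupby("l", sort=False):
        group = group.sort_values("Names")
        ax.plot(group["Names"], group["Time"], marker="o", label=label)
    ax.set_xlabel("Names")
    ax.set_ylabel("Time")
    ax.legend(loc="best")  # no legend title is set here
    return ax


ax = plot_lines_per_label(df)
# plt.show()  # uncomment to display the figure
```

With the sample rows this draws 7 lines. Most of them contain a single point, which is why the marker matters: a one-point line is invisible without it.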

Edit, per the comments:

It is fairly easy to filter the data down to just a subset of the level categories:

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
df = df.loc[df['l'].isin(categories)]
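As a small illustration of that filter, the sketch below uses a made-up frame containing one label ('9Z-z', hypothetical) that is not in categories, and shows that isin keeps only the rows whose label is listed.

```python
import pandas as pd

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']

# Hypothetical frame: the last row carries a label we do not want to plot
df = pd.DataFrame({
    'Name': [1, 1, 2, 3],
    'Time': [1.129, 0.113, 3.355, 2.0],
    'l': ['1G-d', '1G-a', '1B-e', '9Z-z'],
})

# Keep only rows whose label is one of the categories of interest
subset = df.loc[df['l'].isin(categories)]
print(subset['l'].tolist())  # ['1G-d', '1G-a', '1B-e']
```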

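If there are too many levels, markers=True will fail. If you only want the points marked for aesthetic purposes, you can simply repeat a single marker once per category of interest (the categories list was already created above to filter the data): markers=['o'] * len(categories).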

Alternatively, you can build a custom dictionary to pass to the markers parameter:

points = ['o', '*', 'v', '^']
mult = len(categories) // len(points) + (len(categories) % len(points) > 0)
markers = {key:value for (key, value) 
           in zip(categories, points * mult)}

This returns a dictionary of category-marker pairs, cycling through the specified marker points until every item in categories has a marker style.
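The same mapping can be sketched with itertools.cycle from the standard library, which avoids computing the repeat count by hand. This is only an equivalent way to write the dictionary above, not a different technique.

```python
from itertools import cycle

categories = ['1G-d', '1G-a', '1B-b', '1B-c', '1B-d', '1B-e', '1B-a']
points = ['o', '*', 'v', '^']

# cycle() repeats points endlessly; zip() stops once categories is exhausted
markers = dict(zip(categories, cycle(points)))
print(markers)
# {'1G-d': 'o', '1G-a': '*', '1B-b': 'v', '1B-c': '^',
#  '1B-d': 'o', '1B-e': '*', '1B-a': 'v'}
```

The resulting dictionary can then be passed as markers=markers in the sns.lineplot call, together with style='l'.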
