Question

I am using plotnine to draw a plot with multiple lines. The pandas DataFrame looks like this:

df

     TIMESTAMP              TEMP    RANK   TIME
0    2011-06-01 00:00:00    24.3    1.0    0.000000
1    2011-06-01 00:05:00    24.5    1.0    0.083333
2    2011-06-01 00:10:00    24.2    1.0    0.166667
3    2011-06-01 00:15:00    24.1    1.0    0.250000
4    2011-06-01 00:20:00    24.2    1.0    0.333333
5    2011-06-01 00:25:00    24.3    1.0    0.416667
6    2011-06-01 00:30:00    24.4    1.0    0.500000
7    2011-06-01 00:35:00    24.5    1.0    0.583333
8    2011-06-01 00:40:00    24.4    1.0    0.666667
9    2011-06-01 00:45:00    24.4    1.0    0.750000
10   2011-07-01 00:00:00    24.3    2.0    0.000000
11   2011-07-01 00:05:00    24.5    2.0    0.083333
12   2011-07-01 00:10:00    24.2    2.0    0.166667
13   2011-07-01 00:15:00    24.1    2.0    0.250000
14   2011-07-01 00:20:00    24.2    2.0    0.333333
15   2011-07-01 00:00:00    24.3    2.0    0.000000
16   2011-08-01 00:05:00    24.5    3.0    0.083333
17   2011-08-01 00:10:00    24.2    3.0    0.166667
18   2011-08-01 00:15:00    24.1    3.0    0.250000
19   2011-08-01 00:20:00    24.2    3.0    0.333333
20   2011-08-01 00:25:00    24.4    3.0    0.416667

I want to plot TIME on the x-axis and TEMP on the y-axis. I also want to draw a separate line for each RANK.

Here is how I am doing it:

(ggplot()
 + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
 + scale_x_continuous(breaks=[4*x for x in range(7)]))

How can I change the RANK legend on the right? I want it to be discrete, so that each color represents one rank/date.

I don't know how to change it. I tried scale_fill_continuous and scale_fill_discrete, without success:

(ggplot()
 + geom_line(aes(x='TIME', y='TEMP', color='RANK', group='RANK'), data=df[df['RANK']<11])
 + scale_x_continuous(breaks=[4*x for x in range(7)])
 + scale_fill_discrete(breaks=[x for x in range(1, 11)]))

I get this warning:

UserWarning: Cannot generate legend for the 'fill' aesthetic. Make sure you have mapped a variable to it
  "variable to it".format(output))

If I use scale_fill_continuous(breaks=[x for x in range(1, 11)]) instead, I get the same warning.

I also tried scale_fill_manual(values=['blue', 'red', 'green', 'orange', 'purple', 'pink', 'black', 'yellow', 'cyan', 'magenta']), but I am not sure how to make it work.

Edit #1

I now understand that this happens because my RANK variable is of type float64 and needs to be some other data type. The question is: which one? If I convert it to a categorical, I get this error:

TypeError: Unordered Categoricals can only compare equality or not

Answer 1

OK, so I figured out a solution. As noted in the question, the column I use to group geom_line() is float64. That is why the legend for the grouping is continuous.
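A quick way to confirm the diagnosis is to inspect the column's dtype before plotting. This is only a minimal sketch: the small frame below is a stand-in built from a few rows of the question's data, not the real dataset.

```python
import pandas as pd
from pandas.api.types import is_float_dtype

# Stand-in for the question's DataFrame (a few rows copied by hand).
df = pd.DataFrame({
    "TIME": [0.0, 0.083333, 0.0, 0.083333],
    "TEMP": [24.3, 24.5, 24.3, 24.5],
    "RANK": [1.0, 1.0, 2.0, 2.0],
})

# A float column is plain numeric data, so nothing marks its values
# as separate groups; that matches the continuous legend described above.
rank_is_float = is_float_dtype(df["RANK"])
print(df["RANK"].dtype)
print(rank_is_float)
```

If the check reports a float dtype, converting the column to a categorical or string type is the next step.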

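So, to fix this, I converted the column to an ordered categorical (d is the data frame, called df in the question, and pandas is imported as pd):

d.RANK = d.RANK.astype(pd.CategoricalDtype(ordered=True))

With older pandas releases this could be written as d.RANK.astype('category', ordered=True); that keyword form has since been removed.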

This also resolves the error noted in Edit #1.
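The difference between the two kinds of categorical can be reproduced with pandas alone. The sketch below uses a made-up Series of ranks; the variable names are illustrative and not part of the original code.

```python
import pandas as pd

ranks = pd.Series([1.0, 1.0, 2.0, 3.0])

# An unordered categorical only supports == and !=, so "<" raises.
unordered = ranks.astype("category")
try:
    unordered < 2.0
    unordered_error = None
except TypeError as exc:
    unordered_error = str(exc)
print(unordered_error)

# An ordered categorical defines an order on its categories,
# so the same comparison returns a boolean mask.
ordered = ranks.astype(pd.CategoricalDtype(ordered=True))
mask = ordered < 2.0
print(mask.tolist())
```

One caveat, as far as I can tell: the value on the right-hand side has to be one of the existing categories, otherwise pandas raises a different TypeError. Filtering on the numeric column before converting avoids the issue altogether.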

d.RANK = d.RANK.astype('str') also works.
