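Sorting based on the alt.Color field in Altair

Time: 2020-03-10 22:51:36

Tags: python altair

I am trying to sort a horizontal bar chart by the group each bar belongs to. I have included the dataframe, the code that I thought would sort by group, and an image of the result. Currently the chart is sorted alphabetically by the Species column, but I would like it sorted by group, so that all the "bad" bars are together and all the "good" bars are together. Ideally I would like to go a step further and then sort the goods and the bads by their "LDA Score" value, but that is the next step.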

Dataframe:
Unnamed: 0,Species,Unknown,group,LDA Score,p value
11,a,3.474929757,bad,3.07502591,5.67e-05
16,b,3.109308852,bad,2.739744898,0.000651725
31,c,3.16979865,bad,2.697247855,0.03310557
38,d,0.06730106400000001,bad,2.347746497,0.013009626000000002
56,e,2.788383183,good,2.223874347,0.0027407140000000004
65,f,2.644346144,bad,2.311106698,0.00541244
67,g,3.626001112,good,2.980960068,0.038597163
74,h,3.132399759,good,2.849798377,0.007021518000000001
117,i,3.192113412,good,2.861299028,8.19e-06
124,j,0.6140430960000001,bad,2.221483531,0.0022149739999999998
147,k,2.873671544,bad,2.390164757,0.002270102
184,l,3.003479213,bad,2.667274876,0.008129727
188,m,2.46344998,good,2.182085465,0.001657861
256,n,0.048663767,bad,2.952260299,0.013009626000000002
285,o,2.783848855,good,2.387345098,0.00092491
286,p,3.636219,good,3.094047,0.001584756

Code:

bars = alt.Chart(df).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N"),
    alt.Color('group:N', sort=alt.EncodingSortField(field="Clinical group", op='distinct', order='ascending'))
)

bars

Resulting chart: [image]

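1 answer:

Answer 0 (score: 2)

Two things:

  • If you want to sort the y-axis, you should put the sort expression in the y encoding. Above, you are sorting the color labels in the legend.
  • Sorting by field in Vega-Lite only works for numeric data (edit: this is not correct; see below), so you can use a calculate transform to map the entries to numbers to sort by.

The result might look something like this: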

alt.Chart(df).transform_calculate(
    order='datum.group == "bad" ? 0 : 1'  
).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N", sort=alt.SortField('order')),
    alt.Color('group:N')
)


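Edit: it turns out that the reason sorting by group fails is that the default operation for sort fields is sum, which only works well on quantitative data. If you choose a different operation, you can sort on nominal data directly. For example, this shows the correct output: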

alt.Chart(df).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N", sort=alt.EncodingSortField('group', op='min')),
    alt.Color('group:N')
)
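
As a rough plain-Python picture of what a sort field with op='min' does (this reflects my understanding of the aggregation, not Vega-Lite's actual implementation): the sort field is aggregated for each y-axis category, and the categories are then ordered by that aggregated value. The rows below are a subset of the question's data; the variable names are only illustrative.

```python
# A few (Species, group) rows copied from the question's data.
rows = [("a", "bad"), ("e", "good"), ("f", "bad"), ("g", "good")]

# Step 1: aggregate the sort field per y-axis category, mimicking op='min'.
# For strings, min() picks the alphabetically smallest value.
group_min = {}
for species, group in rows:
    group_min[species] = min(group_min.get(species, group), group)

# Step 2: order the categories by the aggregated value, ascending.
# Python's sort is stable, so ties keep their input order here.
species_order = sorted(group_min, key=group_min.get)
print(species_order)
```

Because "bad" sorts before "good" alphabetically, the bad species come first. How Vega-Lite orders species that tie on the aggregated value is not something this sketch tries to model.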

See vega/vega-lite#6064 for more information.
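
For the follow-up step mentioned in the question (sorting by "LDA Score" within each group), one possible approach, not part of the original answer, is to compute the order outside the chart and pass it as an explicit list to the sort argument of alt.Y. The sketch below assumes descending score within each group and uses only the three relevant columns from the question's data; CSV_TEXT, species_order and build_chart are illustrative names.

```python
import csv
import io

# Data copied from the question (only the columns needed here).
CSV_TEXT = """Species,group,LDA Score
a,bad,3.07502591
b,bad,2.739744898
c,bad,2.697247855
d,bad,2.347746497
e,good,2.223874347
f,bad,2.311106698
g,good,2.980960068
h,good,2.849798377
i,good,2.861299028
j,bad,2.221483531
k,bad,2.390164757
l,bad,2.667274876
m,good,2.182085465
n,bad,2.952260299
o,good,2.387345098
p,good,3.094047
"""

rows = [
    {"Species": r["Species"], "group": r["group"], "LDA Score": float(r["LDA Score"])}
    for r in csv.DictReader(io.StringIO(CSV_TEXT))
]

# Sort by group ("bad" before "good"), then by descending LDA Score
# within each group (the descending direction is an assumption).
ordered = sorted(rows, key=lambda r: (r["group"], -r["LDA Score"]))
species_order = [r["Species"] for r in ordered]


def build_chart(rows, species_order):
    """Build the bar chart with an explicit y-axis order (requires altair)."""
    import altair as alt  # imported lazily so the ordering logic runs without it

    return alt.Chart(alt.Data(values=rows)).mark_bar().encode(
        alt.X("LDA Score:Q"),
        alt.Y("Species:N", sort=species_order),
        alt.Color("group:N"),
    )
```

With a pandas DataFrame, the same list could be obtained with df.sort_values(["group", "LDA Score"], ascending=[True, False])["Species"].tolist() and passed to alt.Y in the same way.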