pandas Categorical error: "Cannot setitem on a Categorical with a new category, set the categories first"

Asked: 2018-03-09 15:47:59

Tags: python pandas categorical-data

I have the following DataFrame df in pandas:

     weekday  venta_total_cy
0    Viernes    5.430211e+09
1      Lunes    3.425554e+09
2     Sabado    6.833202e+09
3    Domingo    6.566466e+09
4     Jueves    2.748710e+09
5     Martes    3.328418e+09
6  Miercoles    3.136277e+09

What I want to do is sort the DataFrame so that the rows follow this weekday order:

weekday
Lunes
Martes
Miercoles
Jueves
Viernes
Sabado
Domingo

To do that, I used the following code:

df['weekday'] = pd.Categorical(df[['weekday']], categories=["Lunes", "Martes", "Miercoles", "Jueves", "Viernes", "Sabado", "Domingo"])

When I run the code, I get this error:

ValueError: Cannot setitem on a Categorical with a new category, set the categories first

I couldn't find enough documentation to solve this. Can you help me? Thanks!

1 Answer:

Answer 0 (score: 1)

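df[['weekday']] returns a DataFrame, which is not what you want here: pd.Categorical expects one-dimensional values. Convert the column as a Series (df['weekday'] or df.weekday) to categorical instead. Also, pass ordered=True to establish an order on the categorical column, so that sorting follows the order of the categories.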
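To make the difference between the two selections concrete, here is a minimal sketch. The three-row frame is a shortened stand-in for the asker's data, and the names two_d and one_d are only illustrative:

```python
import pandas as pd

# Shortened stand-in for the DataFrame in the question
df = pd.DataFrame({
    "weekday": ["Viernes", "Lunes", "Sabado"],
    "venta_total_cy": [5.430211e+09, 3.425554e+09, 6.833202e+09],
})

# Double brackets select a list of columns and return a DataFrame (2-D)
two_d = df[["weekday"]]

# Single brackets select one column and return a Series (1-D)
one_d = df["weekday"]

print(type(two_d).__name__, two_d.shape)  # DataFrame (3, 1)
print(type(one_d).__name__, one_d.shape)  # Series (3,)
```

Only the one-dimensional Series is a suitable input for pd.Categorical.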

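import numpy as np
import pandas as pd

# Desired weekday order, Monday first
categories = np.array(
     ['Lunes', 'Martes', 'Miercoles', 'Jueves', 'Viernes', 'Sabado', 'Domingo']
)

# Pass the Series df.weekday, not the one-column DataFrame df[['weekday']]
df.weekday = pd.Categorical(df.weekday, categories=categories, ordered=True)
# sort_values returns a new, sorted DataFrame; df itself is not modified
df.sort_values(by='weekday')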

     weekday  venta_total_cy
1      Lunes    3.425554e+09
5     Martes    3.328418e+09
6  Miercoles    3.136277e+09
4     Jueves    2.748710e+09
0    Viernes    5.430211e+09
2     Sabado    6.833202e+09
3    Domingo    6.566466e+09
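For anyone who wants to reproduce this end to end, the following is a self-contained sketch that rebuilds the DataFrame from the question. It uses CategoricalDtype with astype as an alternative way to get the same ordered categorical column; that is a substitution for illustration, not the code from the answer, and the names order, weekday_dtype, and df_sorted are assumptions:

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Rebuild the DataFrame shown in the question
df = pd.DataFrame({
    "weekday": ["Viernes", "Lunes", "Sabado", "Domingo",
                "Jueves", "Martes", "Miercoles"],
    "venta_total_cy": [5.430211e+09, 3.425554e+09, 6.833202e+09, 6.566466e+09,
                       2.748710e+09, 3.328418e+09, 3.136277e+09],
})

# Desired weekday order, Monday first
order = ["Lunes", "Martes", "Miercoles", "Jueves", "Viernes", "Sabado", "Domingo"]
weekday_dtype = CategoricalDtype(categories=order, ordered=True)

# Convert the Series (single brackets) to an ordered categorical
df["weekday"] = df["weekday"].astype(weekday_dtype)

# sort_values returns a new DataFrame, so keep the result in a variable
df_sorted = df.sort_values(by="weekday")

print(df_sorted)
```

The rows come out in the category order rather than alphabetical order, with the original index labels (1, 5, 6, 4, 0, 2, 3) preserved. Call reset_index(drop=True) on the result if a fresh 0..6 index is preferred.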