Mapping categorical data in pandas?

Time: 2020-03-30 06:31:56

Tags: python pandas

I want to map a categorical variable in pandas to integer codes:

import pandas as pd

df = pd.DataFrame({'Users': ['123', '456', '789', '159', '789', '123', '159']})
df.Users.astype("category").cat.codes

Out[25]: 
0    0
1    2
2    3
3    1
4    3
5    0
6    1
dtype: int8

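The codes above follow the sorted order of the values. I need to pass the list of users explicitly, so that the codes match the categories in the order I specify. So I tried: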

Users_types = ['123', '456', '789', '159']
df.Users.astype("category", categories=Users_types).cat.codes

But I got this error:

"Got an unexpected argument: {}".format(deprecated_arg)
ValueError: Got an unexpected argument: categories

How can I fix this?

1 answer:

Answer 0: (score: 1)

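The error shows that astype does not accept a categories keyword in this pandas version. The first solution is to pass a CategoricalDtype that carries the categories instead: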

from pandas.api.types import CategoricalDtype

Users_types = ['123', '456', '789', '159']

# The codes follow the order of Users_types, not the sorted order
s = df.Users.astype(CategoricalDtype(categories=Users_types)).cat.codes

print(s)
0    0
1    1
2    2
3    3
4    2
5    0
6    3
dtype: int8
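
To check that each code really points at the intended label, the codes can be mapped back through the category list. The following is a minimal sketch rather than part of the original answer; the names users_cat, code_to_label and labels are introduced here for illustration only.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({'Users': ['123', '456', '789', '159', '789', '123', '159']})
Users_types = ['123', '456', '789', '159']

# Convert with an explicit category order (same call as in the answer)
users_cat = df.Users.astype(CategoricalDtype(categories=Users_types))
codes = users_cat.cat.codes

# Build a code -> label lookup from the categories (illustrative helper)
code_to_label = dict(enumerate(users_cat.cat.categories))

# Translate every code back to its label
labels = [code_to_label[c] for c in codes.tolist()]
```

If the mapping is correct, labels equals the original Users column, and code_to_label follows the order of Users_types.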

Alternatively, use pd.Categorical, whose codes attribute is a NumPy array rather than a Series:

Users_types = ['123', '456', '789', '159']
s = pd.Categorical(df.Users, categories=Users_types).codes
print(s)
[0 1 2 3 2 0 3]
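
One thing to watch with a fixed category list is that any value missing from it is encoded as -1. The sketch below illustrates this with a hypothetical user '999' that is not in Users_types; new_users and codes_series are names made up for this example. It also wraps the NumPy array in a Series so that the codes keep the original index.

```python
import pandas as pd

Users_types = ['123', '456', '789', '159']

# '999' is a made-up value that is absent from Users_types
new_users = pd.Series(['456', '999', '123'])

# Values outside the given categories receive the code -1
codes = pd.Categorical(new_users, categories=Users_types).codes

# Wrap the NumPy array in a Series to keep the original index
codes_series = pd.Series(codes, index=new_users.index)

# Flag the rows whose value was not a known category
unknown = codes_series == -1
```

Checking for -1 before using the codes as labels helps avoid treating unknown users as a valid class.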