Categorical data: converting to binary encoding by adding a new dimension

Time: 2019-03-04 23:27:36

Tags: python pandas numpy categorical-data

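Each feature in my dataset has three categories, currently encoded as the integers 0, 1, and 2. I want to encode them with a binary encoding rather than one-hot encoding, where 0 is replaced by [0, 0], 1 by [0, 1], and 2 by [1, 1]. How can I do this without using a for loop?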

Suppose my data looks like this:

  [[1, 2, 0],
   [2, 0, 1]]

The result should gain one extra dimension, with each integer replaced by its two-element code (shape (2, 3, 2) for this input).

1 Answer:

Answer 0: (score: 0)

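Check:

import numpy as np

a = np.array([[1, 2, 0],
              [2, 0, 1]])

# Compare each flattened value against the thresholds [0, 1], reverse the bit order,
# then restore the 2 x 3 layout with the two-element code on a new last axis.
(a.ravel()[:, None] > np.arange(a.max())).astype(int)[:, ::-1].reshape((2, -1, 2))
Out[353]: 
array([[[0, 1],
        [1, 1],
        [0, 0]],
       [[1, 1],
        [0, 0],
        [0, 1]]])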
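
The one-liner above hardcodes the output shape in `reshape((2, -1, 2))` and derives the number of thresholds from `a.max()`, so it only fits a 2-row input that actually contains the value 2. As a rough sketch of two shape-independent alternatives: the helper names `encode_with_lookup` and `encode_with_comparison` and the `n_categories` parameter are made up for illustration and are not part of the original answer. The first indexes a small code table with the integer array; the second keeps the same comparison idea but adds the new axis with `a[..., None]` instead of flattening.

```python
import numpy as np

def encode_with_lookup(a):
    # Hypothetical helper: row i of the table is the code for category i,
    # taken directly from the mapping in the question.
    table = np.array([[0, 0],
                      [0, 1],
                      [1, 1]])
    # Integer-array indexing appends the code axis to a's shape.
    return table[a]

def encode_with_comparison(a, n_categories=3):
    # Hypothetical helper: same idea as the one-liner, but the thresholds
    # come from an explicit category count rather than from a.max().
    thresholds = np.arange(n_categories - 1)      # [0, 1] for three categories
    bits = (a[..., None] > thresholds).astype(int)
    return bits[..., ::-1]                        # reverse so that 1 -> [0, 1]

a = np.array([[1, 2, 0],
              [2, 0, 1]])

encoded = encode_with_lookup(a)
print(encoded.shape)
print(encoded)
```

Both helpers should give the same array for any integer input with values in 0..2, whatever its number of dimensions. The lookup version is probably the easier one to adapt if the codes ever stop following the "count of ones" pattern.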