pd.concat raises ValueError when I try to concatenate two DataFrames by column

Time: 2019-07-11 13:58:52

Tags: python pandas valueerror

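I have two pandas DataFrames: one contains continuous variables that I scaled with StandardScaler from sklearn.preprocessing, and the other contains categorical features that I converted to dummy variables with pd.get_dummies. Both have the same number of rows (5443) but a different number of columns. When I try to concatenate the two DataFrames along axis=1 (columns), Python raises: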

ValueError: Shape of passed values is (101, 5936), indices imply (101, 5443)

I checked the shapes of both DataFrames. I also tried np.concatenate, which runs without error, but it messes up the dummy encoding of my categorical features.

scaled.shape
Out[378]: (5443, 18)

cat_feats.shape
Out[379]: (5443, 83)

test = pd.concat([scaled,cat_feats],axis=1)
Traceback (most recent call last):

  File "<ipython-input-380-f192aab9181d>", line 1, in <module>
    test = pd.concat([scaled,cat_feats],axis=1)

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py", line 213, in concat
    return op.get_result()

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py", line 408, in get_result
    copy=self.copy)

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 5207, in concatenate_block_managers
    return BlockManager(blocks, axes)

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3033, in __init__
    self._verify_integrity()

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3244, in _verify_integrity
    construction_error(tot_items, block.shape[1:], self.axes)

  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4608, in construction_error
    passed, implied))

ValueError: Shape of passed values is (101, 5936), indices imply (101, 5443)

The result should be a DataFrame of shape (5443, 101).
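
One possible cause, which the output above does not confirm, is that the two DataFrames have the same number of rows but different index labels. pd.concat with axis=1 aligns rows on the index rather than by position, so mismatched labels produce more rows than either input has. The sketch below uses small hypothetical stand-ins for scaled and cat_feats (the column names and index values are invented for illustration). It assumes scaled was rebuilt from the StandardScaler output array and so has a fresh RangeIndex, while cat_feats kept the labels of a filtered original frame. Depending on the pandas version and on whether the labels contain duplicates, such a mismatch may show up as extra NaN-filled rows or as an error.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two frames in the question (5 rows instead of 5443).
# Assumption: `scaled` has a default RangeIndex, `cat_feats` has leftover labels
# from an earlier row filter.
scaled = pd.DataFrame(np.random.rand(5, 2), columns=["x1", "x2"])
cat_feats = pd.DataFrame(
    np.eye(5, 3, dtype=int),
    columns=["cat_a", "cat_b", "cat_c"],
    index=[0, 2, 4, 6, 8],
)

# Same row count, but the index labels differ.
indices_match = scaled.index.equals(cat_feats.index)

# axis=1 aligns on the index, so the result covers the union of both label sets.
misaligned = pd.concat([scaled, cat_feats], axis=1)

# Dropping the old labels on both sides makes concat pair rows by position.
aligned = pd.concat(
    [scaled.reset_index(drop=True), cat_feats.reset_index(drop=True)],
    axis=1,
)

print(indices_match, misaligned.shape, aligned.shape)
```

Resetting the index is only appropriate if row i of scaled really corresponds to row i of cat_feats. If the original labels matter, assigning one frame's index to the other (for example scaled.index = cat_feats.index) keeps them, under the same positional assumption.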

0 answers:

No answers