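I have two pandas DataFrames: one holds the continuous variables, which I scaled with StandardScaler from sklearn.preprocessing, and the other holds the categorical features, which I dummy-encoded with pd.get_dummies. Both have the same number of rows (5443) but different numbers of columns. When I try to concatenate them along axis=1 (columns), Python raises: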
ValueError: Shape of passed values is (101, 5936), indices imply (101, 5443)
I checked the shapes of both DataFrames. I also tried np.concatenate, which works, but it messes up the dummy encoding of my categorical features.
scaled.shape
Out[378]: (5443, 18)
cat_feats.shape
Out[379]: (5443, 83)
test = pd.concat([scaled,cat_feats],axis=1)
Traceback (most recent call last):
  File "<ipython-input-380-f192aab9181d>", line 1, in <module>
    test = pd.concat([scaled,cat_feats],axis=1)
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py", line 213, in concat
    return op.get_result()
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py", line 408, in get_result
    copy=self.copy)
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 5207, in concatenate_block_managers
    return BlockManager(blocks, axes)
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3033, in __init__
    self._verify_integrity()
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3244, in _verify_integrity
    construction_error(tot_items, block.shape[1:], self.axes)
  File "C:\Users\Marlies\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4608, in construction_error
    passed, implied))
ValueError: Shape of passed values is (101, 5936), indices imply (101, 5443)
The result should be a DataFrame of shape (5443, 101).
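
One thing that might be worth checking, as a guess rather than a diagnosis: pd.concat with axis=1 aligns rows by index label, not by position, so two frames with equal row counts but different index labels do not line up one-to-one. The 5936 in the error being larger than 5443 would be consistent with that. The sketch below uses tiny made-up stand-ins for scaled and cat_feats (the data, column names, and index values are all hypothetical) to show the label alignment and one way to force positional alignment with reset_index(drop=True). It assumes the rows of the two frames are already in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `scaled`: a frame rebuilt from a NumPy array
# gets a fresh default index 0..n-1.
scaled = pd.DataFrame(np.zeros((4, 2)), columns=["x1", "x2"])

# Hypothetical stand-in for `cat_feats`: same row count, but it keeps the
# index labels of the frame it was derived from (e.g. after dropping rows).
cat_feats = pd.DataFrame(
    np.ones((4, 3)), columns=["a", "b", "c"], index=[0, 2, 5, 7]
)

# Same number of rows, different index labels.
same_index = scaled.index.equals(cat_feats.index)

# axis=1 concat aligns on labels: the result has one row per label in the
# union of both indexes, with NaN where a frame has no such label.
misaligned = pd.concat([scaled, cat_feats], axis=1)

# Dropping both indexes makes the concat positional instead.
# Assumption: row i of `scaled` corresponds to row i of `cat_feats`.
aligned = pd.concat(
    [scaled.reset_index(drop=True), cat_feats.reset_index(drop=True)],
    axis=1,
)

print(same_index)
print(misaligned.shape)
print(aligned.shape)
```

Comparing scaled.index with cat_feats.index on the real data (for example with scaled.index.equals(cat_feats.index)) would show whether this applies here. If it does, passing index=cat_feats.index when building the scaled DataFrame would be an alternative that keeps the original labels.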