Pandas: how do I create a dataset whose column holds multidimensional arrays?

Time: 2019-05-08 22:26:15

Tags: python pandas

This is my data:

print(data)
>
array([[ 0.369  , -0.3396 ,  0.1017 , ...,  0.2164 , -0.11163, -0.6025 ],
       [ 0.548  , -0.2668 , -0.1425 , ..., -0.3198 , -0.599  ,  0.04703],
       [ 0.761  , -0.2515 ,  0.02998, ...,  0.04663, -0.3276 , -0.1771 ],
       ...,
       [ 0.2148 , -0.492  , -0.03586, ...,  0.1157 , -0.299  , -0.12   ],
       [ 0.775  , -0.2622 , -0.1372 , ...,  0.356  , -0.2673 , -0.1897 ],
       [ 0.775  , -0.2622 , -0.1372 , ...,  0.356  , -0.2673 , -0.1897 ]],
      dtype=float16)

I am trying to turn it into a single column in pandas with this:

dataset = pd.DataFrame(data,  index=[0])
print(dataset)

But I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
   1652 
-> 1653         mgr = BlockManager(blocks, axes)
   1654         mgr._consolidate_inplace()

7 frames
ValueError: Shape of passed values is (267900, 768), indices imply (1, 768)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in construction_error(tot_items, block_shape, axes, e)
   1689         raise ValueError("Empty data passed with indices specified.")
   1690     raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 1691         passed, implied))
   1692 
   1693 

ValueError: Shape of passed values is (267900, 768), indices imply (1, 768)

The tricky part seems to be storing an entire array as a single row entry.

Someone suggested:

"Remove the index: dataset = pd.DataFrame(data)"

However, this does not give the desired result. The result looks like this:

dataset = pd.DataFrame(embeds16[:,0])
dataset.head()


0   1   2   3   4   5   6   7   8   9   ... 758 759 760 761 762 763 764 765 766 767
0   0.368896    -0.339600   0.101685    0.679199    -0.201904   -0.247192   -0.032776   -0.057098   0.287354    -0.356689   ... 0.064453    0.548340    -0.047729   -0.615723   -0.225464   -0.071106   -0.254395   0.216431    -0.111633   -0.602539
1   0.547852    -0.266846   -0.142456   1.327148    -0.135254   -0.376953   -0.221069   -0.273926   -0.099609   -0.146118   ... 0.138184    0.446777    -0.577637   0.051300    0.187378    0.171021    0.079163    -0.319824   -0.599121   0.047028
2   0.761230    -0.251465   0.029984    1.008789    -0.311279   -0.419922   -0.015869   -0.019196   0.016174    -0.284424   ... 0.152100    0.452881    -0.265381   -0.272949   0.029831    0.002472    0.186646    0.046631    -0.327637   -0.177124
3   0.690918    -0.374756   -0.008820   0.869141    -0.496582   -0.546875   0.060028    0.139893    -0.032471   -0.120361   ... 0.040314    0.391113    -0.420898   -0.342285   0.191650    0.350830    0.083130    0.028137    -0.488525   -0.157349
4   0.583008    -0.342529   -0.073608   0.683105    -0.071777   -0.390137   -0.174316   0.154541    0.170410    -0.184692   ... 0.326416    0.450928    0.083923    -0.331299   -0.207520   

I want each whole array to sit in a single column, rather than being spread across multiple columns.

1 answer:

Answer 0 (score: 1)

Do you mean:

pd.Series(a.tolist())

Update:

pd.Series([x for x in a])
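
For reference, here is a minimal sketch of how the two variants differ. It assumes a small stand-in array in place of the (267900, 768) float16 data from the question, and the column name "embedding" is only an illustrative choice, not something from the original post.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's (267900, 768) float16 array (assumption).
a = np.arange(12, dtype=np.float16).reshape(3, 4)

# Variant 1: each cell holds a plain Python list of floats.
s_lists = pd.Series(a.tolist())

# Variant 2: each cell holds a 1-D NumPy array, so each row keeps its float16 dtype.
s_arrays = pd.Series([x for x in a])

# Turn the Series into a one-column DataFrame; "embedding" is a hypothetical name.
dataset = s_arrays.to_frame(name="embedding")

print(dataset.shape)
print(type(s_lists[0]), type(s_arrays[0]))
```

In both cases the column has dtype object, so pandas treats every cell as an opaque Python object. Variant 2 is probably the better fit when the rows will later be fed back into NumPy, since no conversion to Python floats takes place.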