Convert pandas columns of arrays into a reshaped np.array

Time: 2018-09-26 07:12:51

Tags: python pandas numpy matrix

I have the following pandas DataFrame:

        col1    col2    col3
    2   [0.006576649077136777, 0.0030259599523339924, ...   [0.00567579212503948, -0.005498750236370691, 0...   [-0.015786838188947716, 0.0042899171402874135,...
    3   [-0.44547847984244543, -0.4482984342731749, 0....   [-0.022185524120646238, -0.38181444829591676, ...   [-0.015786838188947716, 0.0042899171402874135,...
    4   [-0.0014395623253532755, 0.0030226929032053595...   [0.0035941013456355217, 0.0047566422661906695,...   [-0.015786838188947716, 0.0042899171402874135,...
    5   [0.00967978470638314, 0.011863989585765296, -0...   [-0.011894506407398607, -0.003642684750775637,...   [-0.015786838188947716, 0.0042899171402874135,...

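These cells hold the vectors I need to fit with DBSCAN. However, I am having trouble converting them into a matrix, because every attempt gives me a matrix of arrays instead. How can I convert them into a 90 x 7190 matrix?

Here is what I tried: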

np.asarray(df_vec[[col for col in df_vec.columns.values]]).reshape((90,7190))

And this is what I get:

array([[array([ 0.00657665,  0.00302596, -0.01135427, -0.00063256, -0.00735737,
        0.00150661,  0.00318936,  0.00109255,  0.00557719,  0.00958158,
        0.00103098,  0.00706684,  0.00597235, -0.00502784,  0.00395275,
        0.01183221, -0.00067338,  0.0042127 , -0.00281012, -0.00501378,
       -0.00103368, -0.00374887,  0.01158366,  0.00259053, -0.00764409,
       -0.00156182, -0.0018044 ,  0.01153042,  0.00258852,  0.00294213]),
        array([-0.44547848, -0.44829843,  0.42276216, -0.22452319, -0.36380471,

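1 Answer:

Answer 0: (score: 0)

The question is not entirely clear, but if I understand correctly, your data looks something like this (30 rows, three columns, each cell an array of length 7190):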

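import numpy as np
import pandas as pd

# Sample data: 30 rows x 3 columns, each cell a vector of length 7190
df = pd.DataFrame({
        "col1": [np.random.rand(7190) for i in range(30)],
        "col2": [np.random.rand(7190) for i in range(30)],
        "col3": [np.random.rand(7190) for i in range(30)]
        })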
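As a side note on why the attempt in the question cannot work: a frame like this converts to a 2-D array of dtype object whose elements are the individual vectors, so there are only rows x columns cells to reshape. The sketch below illustrates this on a toy frame; the sizes (4 rows, vectors of length 5) and the name `toy` are assumptions chosen only to keep the example small.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the frame above: 4 rows, 3 columns, vectors of length 5
# (hypothetical sizes, not the real 30 x 3 x 7190 data).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    name: [rng.random(5) for _ in range(4)]
    for name in ("col1", "col2", "col3")
})

# Converting the frame directly yields one object cell per vector.
cells = np.asarray(toy)
print(cells.dtype, cells.shape)   # object (4, 3)

# Only 12 cells exist, so a (12, 5) layout cannot be produced by reshape.
try:
    cells.reshape((12, 5))
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```

The vectors have to be stacked into a new float array rather than reshaped in place.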

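Now you only need to unstack the frame, which gives a single Series with one array per entry, and convert it to a list to get a dataset ready for DBSCAN: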

unstacked = np.array(df.unstack().tolist())

This gives you the desired shape:

print(unstacked.shape)

(90, 7190)
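One thing worth checking is the row order of the result. As far as I can tell, `unstack()` on a frame with plain columns walks column by column, so the first 30 rows come from col1, the next 30 from col2, and so on. The sketch below compares that with a row-by-row ordering on a toy frame; the sizes and the names `toy`, `by_column`, and `by_row` are assumptions made for illustration, and `np.stack` is used here as an alternative to the `tolist()` route.

```python
import numpy as np
import pandas as pd

# Toy frame: 4 rows, 3 columns, vectors of length 5 (hypothetical sizes).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    name: [rng.random(5) for _ in range(4)]
    for name in ("col1", "col2", "col3")
})

# The approach from the answer.
unstacked = np.array(toy.unstack().tolist())

# Column-major: all of col1, then col2, then col3.
by_column = np.stack(list(toy.to_numpy().ravel(order="F")))

# Row-major: the three vectors of row 0, then those of row 1, and so on.
by_row = np.stack(list(toy.to_numpy().ravel(order="C")))

print(unstacked.shape, by_column.shape, by_row.shape)
print(np.array_equal(unstacked, by_column))
```

Both orderings contain the same 12 vectors, so a clustering step sees the same points either way; the order only matters when mapping the resulting labels back to the original rows and columns.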