I have the following pandas DataFrame:
col1 col2 col3
2 [0.006576649077136777, 0.0030259599523339924, ... [0.00567579212503948, -0.005498750236370691, 0... [-0.015786838188947716, 0.0042899171402874135,...
3 [-0.44547847984244543, -0.4482984342731749, 0.... [-0.022185524120646238, -0.38181444829591676, ... [-0.015786838188947716, 0.0042899171402874135,...
4 [-0.0014395623253532755, 0.0030226929032053595... [0.0035941013456355217, 0.0047566422661906695,... [-0.015786838188947716, 0.0042899171402874135,...
5 [0.00967978470638314, 0.011863989585765296, -0... [-0.011894506407398607, -0.003642684750775637,... [-0.015786838188947716, 0.0042899171402874135,...
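These are the vectors I need to fit with DBSCAN. However, I am having trouble converting them into a matrix: every attempt gives me an array of arrays instead. How can I convert them into a 90 x 7190 matrix?
This is what I tried: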
np.asarray(df_vec[[col for col in df_vec.columns.values]]).reshape((90,7190))
And this is what I get:
array([[array([ 0.00657665, 0.00302596, -0.01135427, -0.00063256, -0.00735737,
0.00150661, 0.00318936, 0.00109255, 0.00557719, 0.00958158,
0.00103098, 0.00706684, 0.00597235, -0.00502784, 0.00395275,
0.01183221, -0.00067338, 0.0042127 , -0.00281012, -0.00501378,
-0.00103368, -0.00374887, 0.01158366, 0.00259053, -0.00764409,
-0.00156182, -0.0018044 , 0.01153042, 0.00258852, 0.00294213]),
array([-0.44547848, -0.44829843, 0.42276216, -0.22452319, -0.36380471,
Answer 0 (score: 0)
The question is not entirely clear, but if I understand it correctly, you can start from a DataFrame like this:
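import numpy as np
import pandas as pd

# Example data: 30 rows x 3 columns, each cell holding a vector of length 7190
df = pd.DataFrame({
    "col1": [np.random.rand(7190) for _ in range(30)],
    "col2": [np.random.rand(7190) for _ in range(30)],
    "col3": [np.random.rand(7190) for _ in range(30)]
})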
Now you only need to unstack it to get the dataset ready for DBSCAN:
unstacked = np.array(df.unstack().tolist())
This gives you the shape you want:
print(unstacked.shape)
(90, 7190)
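For comparison, here is a minimal sketch of why the attempt in the question produces an array of arrays, along with an alternative that skips `unstack()`. It assumes every cell holds a 1-D NumPy array of the same length. The DataFrame below is a hypothetical stand-in for `df_vec`, and the vector length is shortened from 7190 to 5 only to keep the demo small.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_vec: 30 rows x 3 columns, each cell a 1-D array.
# The vector length is an assumption (5 instead of 7190) to keep the demo small.
n_rows, vec_len = 30, 5
rng = np.random.default_rng(0)
df_vec = pd.DataFrame({
    col: [rng.random(vec_len) for _ in range(n_rows)]
    for col in ["col1", "col2", "col3"]
})

# Each cell is a Python object, so NumPy sees a 2-D array of dtype=object.
# Reshaping it can only rearrange those 90 objects, not expand them.
cells = df_vec.to_numpy()
print(cells.dtype, cells.shape)    # object (30, 3)

# Flatten the cells column by column (order="F", the same order as
# df.unstack()), then stack the vectors into one 2-D float matrix.
matrix = np.stack(cells.ravel(order="F"))
print(matrix.dtype, matrix.shape)  # float64 (90, 5)
```

With the real data the result would have shape (90, 7190). A 2-D float array of shape (n_samples, n_features) like this is the kind of input that `sklearn.cluster.DBSCAN.fit` accepts as `X`.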