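I am trying to port some code from Python to Scala. Specifically, I want to reproduce the results generated by SpectralEmbedding. At the moment I am looking into LaplacianEigenmap.
The problem is that I cannot get consistent results from Python itself! Running the following code gives different results depending on the operating system.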
import numpy as np
import pandas as pd
from sklearn.manifold import SpectralEmbedding
data=np.array([
['',1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,'timewindow'],
['acai',5,4,13,8,6,2,6,8,2,6,1,9,7,8,6,9,8,5,5,11,1,7,3,8,1],
['acerola',3,0,0,1,0,0,1,1,0,1,0,1,0,1,0,1,0,1,0,0,0,1,0,1,1],
['adzuki beans',0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
['agar',2,0,1,2,0,0,1,3,2,1,4,2,0,1,2,4,5,8,7,2,4,3,5,24,1],
['agave',5,2,7,6,4,1,9,4,4,4,6,5,10,7,4,8,10,7,2,11,8,9,8,6,1],
['ajwain',0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1],
['alfalfa',1,1,2,1,0,1,1,1,0,1,2,4,3,3,1,0,0,2,2,1,0,1,2,0,1],
['algae',0,1,0,3,1,0,0,1,0,0,0,0,1,0,2,0,1,1,0,1,0,0,0,0,1],
['allspice',0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,2,2,2,1,1],
['almond',194,140,329,276,207,268,310,331,363,332,378,421,503,487,499,559,581,553,602,637,573,577,641,554,1],
['aloe vera',0,1,1,2,1,1,1,1,2,1,1,3,2,2,2,1,5,2,1,1,1,4,1,0,1],
['amaranth',2,0,2,0,0,1,1,2,0,0,0,2,0,3,1,2,1,0,2,2,3,1,3,0,1],
['amari',0,0,1,1,0,0,1,0,1,1,2,1,1,0,5,1,0,3,1,0,0,1,1,1,1],
['angostura',0,0,2,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1],
['angustifolia',0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1],
['anise',2,0,6,3,0,1,3,2,0,0,4,6,2,3,2,6,2,4,1,2,2,0,3,6,1],
['apple cider vinegar',0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0,1,2,0,0,1,0,0,0,1],
['apple',157,108,212,203,127,170,232,256,334,419,431,327,376,366,415,407,355,327,359,404,555,694,580,402,1],
['apricot',9,3,8,14,2,12,22,10,8,12,8,15,15,11,12,17,12,14,26,19,9,23,13,20,1],
['arjuna',0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1]
])
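columns = 24  # number of count columns to embed (excludes 'timewindow')
df = pd.DataFrame(data=data[1:, 1:],
                  index=data[1:, 0],
                  columns=data[0, 1:])
model = SpectralEmbedding(n_components=2, random_state=13, eigen_solver="arpack")
np.set_printoptions(suppress=True)
# .ix has been removed from pandas; .iloc gives the same positional slice.
# The mixed-type array stores everything as strings, so cast to float explicitly.
LE_model = model.fit_transform(df.iloc[:, 0:columns].astype(float))
df_reduced = pd.DataFrame(LE_model)
print(df_reduced)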
When I run this with python2 or python3 in a Jupyter notebook running on Docker (Linux), I get:
0 1
0 0.016043 -0.050914
1 0.048082 0.016466
2 0.192327 0.065864
3 0.210040 -0.043099
4 0.024065 -0.076371
5 0.192327 0.065864
6 0.840161 -0.172394
7 0.096164 0.032932
8 0.048082 0.016466
9 0.110486 0.687106
10 0.210040 -0.043099
11 0.420081 -0.086197
12 0.048082 0.016466
13 0.048082 0.016466
14 0.096164 0.032932
15 0.420081 -0.086197
16 0.096164 0.032932
17 0.110486 0.687106
18 0.008022 -0.025457
19 0.096164 0.032932
When I run it with python3 from IntelliJ on Windows, I get:
0 1
0 0.369416 -0.031546
1 -0.023761 -0.030871
2 -0.095044 -0.123483
3 0.100685 0.176579
4 0.554124 -0.047319
5 -0.095044 -0.123483
6 0.402739 0.706316
7 -0.047522 -0.061742
8 -0.023761 -0.030871
9 0.427355 -0.409445
10 0.100685 0.176579
11 0.201370 0.353158
12 -0.023761 -0.030871
13 -0.023761 -0.030871
14 -0.047522 -0.061742
15 0.201370 0.353158
16 -0.047522 -0.061742
17 0.427355 -0.409445
18 0.184708 -0.015773
19 -0.047522 -0.061742
If I cannot get consistent results from Python, I have no way of making sure my Scala code works as expected.
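Since eigenvectors are only determined up to sign, one sanity check for a port would be to compare two embeddings column by column while allowing each column to be negated. Below is a minimal sketch under that assumption. The helper name `same_up_to_sign` and the tolerance are illustrative choices of mine, not part of scikit-learn, and the sample rows are simply the first three rows of the two outputs above. It uses only the standard library so the same logic is easy to mirror in Scala.

```python
import math


def same_up_to_sign(a, b, tol=1e-6):
    """Return True if each column of `a` equals the same column of `b` or its negation.

    `a` and `b` are rectangular lists of rows (hypothetical helper, for illustration).
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return False
    if not a:
        return True
    for j in range(len(a[0])):
        col_a = [row[j] for row in a]
        col_b = [row[j] for row in b]
        same = all(math.isclose(x, y, abs_tol=tol) for x, y in zip(col_a, col_b))
        flipped = all(math.isclose(x, -y, abs_tol=tol) for x, y in zip(col_a, col_b))
        if not (same or flipped):
            return False
    return True


# First three rows of the Linux and Windows outputs shown above.
linux_head = [[0.016043, -0.050914], [0.048082, 0.016466], [0.192327, 0.065864]]
windows_head = [[0.369416, -0.031546], [-0.023761, -0.030871], [-0.095044, -0.123483]]

print(same_up_to_sign(linux_head, windows_head))  # False
```

The check fails for my two runs, so the discrepancy here is more than a sign flip.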
Is there a reason for the difference?
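Differing library versions between the two environments could be one factor, though that is only a guess on my part. A small sketch for recording the versions in each environment so they can be compared side by side; `collect_versions` is a hypothetical helper name, and the module list is just the set of packages used in the code above plus SciPy.

```python
import importlib
import platform
import sys


def collect_versions(modules=("numpy", "scipy", "sklearn", "pandas")):
    """Return a dict with the Python version, platform string, and module versions."""
    info = {"python": sys.version.split()[0], "platform": platform.platform()}
    for name in modules:
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            info[name] = "not installed"
    return info


for key, value in collect_versions().items():
    print(key, value)
```

Running this once in the Docker notebook and once on Windows should show whether the two setups actually use the same NumPy, SciPy, and scikit-learn releases.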
Is there a JVM-based implementation of SpectralEmbedding that is consistent with the Python version?