Running the BigDL example from https://bigdl-project.github.io/0.4.0/#ProgrammingGuide/optimization/ on a local PySpark node:
from bigdl.nn.layer import Linear
from bigdl.util.common import *  # provides SparkContext, Sample, create_spark_conf, init_engine
from bigdl.nn.criterion import MSECriterion
from bigdl.optim.optimizer import Optimizer, MaxIteration
import numpy as np

# Create the Spark context with BigDL's configuration, then initialize the BigDL engine.
sc = SparkContext(appName="simple", conf=create_spark_conf())
init_engine()

# A single linear layer: 2 inputs, 1 output.
model = Linear(2, 1)

# Six (features, label) training samples.
samples = [
    Sample.from_ndarray(np.array([5, 5]), np.array([2.0])),
    Sample.from_ndarray(np.array([-5, -5]), np.array([-2.0])),
    Sample.from_ndarray(np.array([-2, 5]), np.array([1.3])),
    Sample.from_ndarray(np.array([-5, 2]), np.array([0.1])),
    Sample.from_ndarray(np.array([5, -2]), np.array([-0.1])),
    Sample.from_ndarray(np.array([2, -5]), np.array([-1.3]))
]
train_data = sc.parallelize(samples, 1)

# Train with MSE loss for 100 iterations, batch size 4.
optimizer = Optimizer(model, train_data, MSECriterion(), MaxIteration(100), 4)
optimizer.optimize()
model.get_weights()[0]
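results in the exception below. Other BigDL tests do run in PySpark. Environment: openjdk version "1.8.0_141", Python 3.5.3 (default, Jan 19 2017, 14:11:04) [GCC 6.3.0 20170118] on linux.
Any ideas? Is BigDL a live, actively maintained project?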
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2018-02-28 22:40:20 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-02-28 22:40:20 WARN Utils:66 - Your hostname, dk resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface enp0s3)
2018-02-28 22:40:20 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-02-28 22:40:24 WARN SparkContext:66 - Using an existing SparkContext; some configuration may not take effect.
cls.getname: com.intel.analytics.bigdl.python.api.Sample
BigDLBasePickler registering: bigdl.util.common Sample
cls.getname: com.intel.analytics.bigdl.python.api.EvaluatedResult
BigDLBasePickler registering: bigdl.util.common EvaluatedResult
cls.getname: com.intel.analytics.bigdl.python.api.JTensor
BigDLBasePickler registering: bigdl.util.common JTensor
cls.getname: com.intel.analytics.bigdl.python.api.JActivity
BigDLBasePickler registering: bigdl.util.common JActivity
disableCheckSingleton is deprecated. Please use bigdl.check.singleton instead
/usr/local/lib/python3.5/dist-packages/bigdl/util/engine.py:41: UserWarning: Find both SPARK_HOME and pyspark. You may need to check whether they match with each other. SPARK_HOME environment variable is set to: /opt/spark, and pyspark is found in: /usr/local/lib/python3.5/dist-packages/pyspark/__init__.py. If they are unmatched, please use one source only to avoid conflict. For example, you can unset SPARK_HOME and use pyspark only.
warnings.warn(warning_msg)
Prepending /usr/local/lib/python3.5/dist-packages/bigdl/share/conf/spark-bigdl.conf to sys.path
creating: createLinear
creating: createMSECriterion
creating: createMaxIteration
creating: createDefault
creating: createSGD
creating: createDistriOptimizer
Traceback (most recent call last):
  File "simple.py", line 22, in <module>
    optimizer.optimize()
  File "/usr/local/lib/python3.5/dist-packages/bigdl/optim/optimizer.py", line 591, in optimize
    jmodel = callJavaFunc(get_spark_context(), self.value.optimize)
  File "/usr/local/lib/python3.5/dist-packages/bigdl/util/common.py", line 590, in callJavaFunc
    result = func(*args)
  File "/usr/local/lib/python3.5/dist-packages/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/local/lib/python3.5/dist-packages/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o48.optimize.
: java.lang.ExceptionInInitializerError
    at com.intel.analytics.bigdl.optim.DistriOptimizer.optimize(DistriOptimizer.scala:860)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:214)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException
    at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1314)
    at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1237)
    at java.util.concurrent.Executors.newFixedThreadPool(Executors.java:151)
    at com.intel.analytics.bigdl.parameters.AllReduceParameter$.<init>(AllReduceParameter.scala:47)
    at com.intel.analytics.bigdl.parameters.AllReduceParameter$.<clinit>(AllReduceParameter.scala)
    ... 12 more
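For reference, once the example does run, the weights printed by model.get_weights()[0] can be cross-checked without BigDL. The following is a minimal NumPy sketch and not part of the BigDL example: it solves the same least-squares problem in closed form with np.linalg.lstsq, using an appended column of ones as an assumed stand-in for the bias term of Linear(2, 1). It says nothing about how close SGD gets after 100 iterations.

```python
import numpy as np

# Same six samples as in the question: two features each, one label each.
X = np.array([[5, 5], [-5, -5], [-2, 5], [-5, 2], [5, -2], [2, -5]], dtype=float)
y = np.array([2.0, -2.0, 1.3, 0.1, -0.1, -1.3])

# Append a column of ones so the last coefficient plays the role of a bias.
A = np.hstack([X, np.ones((len(X), 1))])

# Closed-form least-squares fit (this is not how BigDL trains; it uses SGD).
solution, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
weights, bias = solution[:2], solution[2]

print("weights:", weights)
print("bias:", bias)
```

The six samples lie exactly on y = 0.1 * x1 + 0.3 * x2, so the fit should give weights close to [0.1, 0.3] and a bias close to 0.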
Answer 0 (score: 0)
Yes, BigDL is actively maintained. The correct way to define a BigDL model is to use either the sequential API or the functional API.
Sequential API:
model = Sequential()
model.add(Linear(...))
model.add(Sigmoid())
model.add(Softmax())
Functional API:
linear = Linear(...)()
sigmoid = Sigmoid()(linear)
softmax = Softmax()(sigmoid)
model = Model([linear], [softmax])
See here.
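Answer 1 (score: 0)
I have only just started using BigDL. I use PySpark, and I noticed that even calls with default arguments fail. I literally dug into the source code, read the documentation there, and changed how I made my calls based on what I read.
Doing the same may help you. Judging from the error you posted, it does not seem to like some argument being passed to it. This is not a "you" problem; it is a "the online documentation is out of sync with the code" problem.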
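One concrete thing to check, going only by the trace in the question: the innermost frames show the IllegalArgumentException coming from Executors.newFixedThreadPool, which the Java documentation says is thrown when the requested thread count is not positive. Which value BigDL passes there is not visible in the trace, so the cores // 2 expression below is purely a hypothetical illustration of how a pool size could end up as zero on a single-core machine, not a confirmed diagnosis. The sketch uses Python's concurrent.futures.ThreadPoolExecutor, which rejects a non-positive size in a similar way, and can_build_pool is a helper name made up for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def can_build_pool(n_threads):
    """Return True if a fixed-size thread pool of n_threads can be created."""
    try:
        with ThreadPoolExecutor(max_workers=n_threads):
            return True
    except ValueError:
        # Python raises ValueError for max_workers <= 0, analogous to
        # Java's IllegalArgumentException for a non-positive thread count.
        return False


cores = os.cpu_count() or 1
print("cores visible to Python:", cores)
# Hypothetical sizing rule, for illustration only: evaluates to 0 when cores == 1.
print("pool of cores // 2 threads can be built:", can_build_pool(cores // 2))
print("pool of 0 threads can be built:", can_build_pool(0))
```

If the machine or VM reports a single core, it may be worth comparing that against how the Spark master and BigDL engine are configured before digging further into the source.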