I am trying to follow the code from https://medium.com/linagora-engineering/making-image-classification-simple-with-spark-deep-learning-f654a8b876b8 , but with my own image set containing 2 classes.
I get an error when running the following code:
from pyspark.ml.classification import LogisticRegression
from pyspark.ml import Pipeline
from sparkdl import DeepImageFeaturizer
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3, labelCol="label")
p = Pipeline(stages=[featurizer, lr])
p_model = p.fit(train_df)
The error begins with:
Py4JJavaError Traceback (most recent call last)
<ipython-input-9-5b22e134f3f4> in <module>()
6 lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3, labelCol="label")
7 p = Pipeline(stages=[featurizer, lr])
----> 8 p_model = p.fit(train_df)
...
The error output also includes:
AttributeError: 'NoneType' object has no attribute 'mode'
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:452)
at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$1.read(PythonUDFRunner.scala:81)
at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$1.read(PythonUDFRunner.scala:64)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:406)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
...
I am not sure what the problem is. Can anyone point out what might be wrong, or where I should start investigating?