pio train fails with an error

Time: 2018-07-12 10:18:14

Tags: scala recommendation-engine predictionio

I am trying to build a simple recommendation engine with PredictionIO's Similar Product template, following this article.

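After importing the sample data and building the app as described in the article, I tried to train the model by running pio train, but it fails with the following error: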

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: RDD[Rating] in PreparedData cannot be empty. Please check if DataSource generates TrainingData and Preparator generates PreparedData correctly.
    at scala.Predef$.require(Predef.scala:224)
    at org.example.recommendation.ALSAlgorithm.train(ALSAlgorithm.scala:35)
    at org.example.recommendation.ALSAlgorithm.train(ALSAlgorithm.scala:21)
    at org.apache.predictionio.controller.PAlgorithm.trainBase(PAlgorithm.scala:50)
    at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
    at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.predictionio.controller.Engine$.train(Engine.scala:690)
    at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
    at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
    at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
    at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

If I run pio train --stop-after-read or pio train --stop-after-prepare, it completes without any errors. The exception above appears only when I run plain pio train.

To check whether I can actually fetch data from the data source, I ran the following snippet in pio shell:

import org.apache.predictionio.data.store.PEventStore
val eventsRDD = PEventStore.find(appName="MyUserRecommendation")(sc)
val c = eventsRDD.collect()

I get the following output:

[Stage 0:>                                                          (0 + 1) / 4]Thu Jul 12 15:45:05 IST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Thu Jul 12 15:45:06 IST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
[Stage 0:==============>                                            (1 + 1) / 4]Thu Jul 12 15:45:06 IST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Thu Jul 12 15:45:06 IST 2018 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
c: Array[org.apache.predictionio.data.storage.Event] = Array(Event(id=Some(01636823b4c54fa39130836903b5cd9c),event=view,eType=user,eId=u6,tType=Some(item),tId=Some(i46),p=DataMap(Map()),t=2018-07-12T06:18:58.000Z,tags=List(),pKey=None,ct=2018-07-12T06:18:58.000Z), Event(id=Some(037ffc6e80cd4997b507099ed8dcffa8),event=view,eType=user,eId=u5,tType=Some(item),tId=Some(i13),p=DataMap(Map()),t=2018-07-12T06:18:58.000Z,tags=List(),pKey=None,ct=2018-07-12T06:18:58.000Z), Event(id=Some(039bea7eea1b425c99f9d5cb6bde9022),event=view,eType=user,eId=u2,tType=Some(item),tId=Some(i14),p=DataMap(Map()),t=2018-07-12T06:18:57.000Z,tags=List(),pKey=None,ct=2018-07-12T06:18:57.000Z), Event(id=Some(071d0c909f1d499ca5e603edc088208a),event=view,eType=user,eId=u7,tType=Some(item),tId=Some(i43),p=DataMap(Map())...

So the event store does contain data, and I can fetch it successfully. Can anyone help me resolve this?
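One more check that might narrow this down is to compare the event names actually stored with the names the engine's DataSource asks for. The sketch below is only an illustration under assumptions: `eventNameCounts` and `matching` are hypothetical helpers, not part of PredictionIO, and the expected names `rate` and `buy` are an assumption that should be replaced with whatever `eventNames` the engine's DataSource.scala passes to `PEventStore.find`. The sample names mirror the `view` events visible in the output above.

```scala
// Hypothetical helper (not a PredictionIO API): count occurrences of each event name.
def eventNameCounts(names: Seq[String]): Map[String, Int] =
  names.groupBy(identity).map { case (name, group) => name -> group.size }

// Hypothetical helper: how many stored events carry a name the engine reads.
def matching(names: Seq[String], expected: Set[String]): Int =
  names.count(expected.contains)

// ASSUMPTION: the event names the engine's DataSource filters on.
// Check DataSource.scala in the engine directory for the real list.
val expected = Set("rate", "buy")

// In pio shell the names could be taken from the RDD fetched earlier:
//   val names = eventsRDD.map(_.event).collect().toSeq
// Here a small sample mirroring the "view" events in the output is used instead.
val names = Seq("view", "view", "view", "view")

println(eventNameCounts(names))
println(matching(names, expected))
```

If the second number is zero for the real data, the event store is populated but nothing in it matches the filter, which would be consistent with an empty RDD[Rating] at training time even though reading and preparing finish without errors.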

0 answers:

No answers