PredictionIO Universal Recommender: Elasticsearch error

Date: 2018-03-19 08:02:01

Tags: postgresql apache-spark elasticsearch predictionio recommender-systems

I am using the Universal Recommender that ships with PredictionIO, and I get the following error when running the ./examples/integration-test script (found here).

[INFO] [ContextHandler] Stopped o.s.j.s.ServletContextHandler@6ec63f8{/jobs,null,UNAVAILABLE,@Spark}
    Exception in thread "main" java.lang.IllegalStateException: No Elasticsearch client configuration detected, check your pio-env.sh forproper configuration settings
        at com.actionml.EsClient$$anonfun$client$2.apply(EsClient.scala:86)
        at com.actionml.EsClient$$anonfun$client$2.apply(EsClient.scala:86)
        at scala.Option.getOrElse(Option.scala:121)
        at com.actionml.EsClient$.client$lzycompute(EsClient.scala:85)
        at com.actionml.EsClient$.client(EsClient.scala:85)
        at com.actionml.EsClient$.createIndex(EsClient.scala:174)
        at com.actionml.EsClient$.hotSwap(EsClient.scala:271)
        at com.actionml.URModel.save(URModel.scala:82)
        at com.actionml.URAlgorithm.calcAll(URAlgorithm.scala:367)
        at com.actionml.URAlgorithm.train(URAlgorithm.scala:295)
        at com.actionml.URAlgorithm.train(URAlgorithm.scala:180)
        at org.apache.predictionio.controller.P2LAlgorithm.trainBase(P2LAlgorithm.scala:49)
        at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
        at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:690)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.immutable.List.map(List.scala:285)
        at org.apache.predictionio.controller.Engine$.train(Engine.scala:690)
        at org.apache.predictionio.controller.Engine.train(Engine.scala:176)
        at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
        at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:251)
        at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

My configuration (PredictionIO/conf/pio-env.sh) looks like this:

#!/usr/bin/env bash
#

# PredictionIO Main Configuration
#
# This section controls core behavior of PredictionIO. It is very likely that
# you need to change these to fit your site.

# SPARK_HOME: Apache Spark is a hard dependency and must be configured.
# SPARK_HOME=$PIO_HOME/vendors/spark-2.0.2-bin-hadoop2.7
SPARK_HOME=$PIO_HOME/vendors/spark-2.1.1-bin-hadoop2.6

POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.0.0.jar
MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar

# ES_CONF_DIR: You must configure this if you have advanced configuration for
#              your Elasticsearch setup.
# ES_CONF_DIR=/opt/elasticsearch

# HADOOP_CONF_DIR: You must configure this if you intend to run PredictionIO
#                  with Hadoop 2.
# HADOOP_CONF_DIR=/opt/hadoop

# HBASE_CONF_DIR: You must configure this if you intend to run PredictionIO
#                 with HBase on a remote cluster.
# HBASE_CONF_DIR=$PIO_HOME/vendors/hbase-1.0.0/conf

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
#
# This section controls programs that make use of PredictionIO's built-in
# storage facilities. Default values are shown below.
#
# For more information on storage configuration please refer to
# http://predictionio.apache.org/system/anotherdatastore/

# Storage Repositories

# Default is to use PostgreSQL
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL

# Storage Data Sources

# PostgreSQL Default Settings
# Please change "pio" to your database name in PIO_STORAGE_SOURCES_PGSQL_URL
# Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
# PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio

# MySQL Example
# PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
# PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
# PIO_STORAGE_SOURCES_MYSQL_USERNAME=pio
# PIO_STORAGE_SOURCES_MYSQL_PASSWORD=pio

# Elasticsearch Example
# PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
# PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.5.2
# Optional basic HTTP auth
# PIO_STORAGE_SOURCES_ELASTICSEARCH_USERNAME=my-name
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PASSWORD=my-secret
# Elasticsearch 1.x Example
# PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
# PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=<elasticsearch_cluster_name>
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.7.6

# Local File System Example
# PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
# PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models

# HBase Example
# PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
# PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.0.0

# AWS S3 Example
# PIO_STORAGE_SOURCES_S3_TYPE=s3
# PIO_STORAGE_SOURCES_S3_BUCKET_NAME=pio_bucket
# PIO_STORAGE_SOURCES_S3_BASE_PATH=pio_model

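I am trying to use PostgreSQL for all three storage types (metadata, event data, and model data), so I am not sure why I am getting an error about Elasticsearch.

Do I need to have Elasticsearch running somewhere?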


1 Answer:

Answer 0 (score: 1)

Feedback provided on the actionml-user group forum: https://spin.atomicobject.com/2016/06/18/vertically-center-floated-elements-flexbox/

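In short: although PredictionIO offers many data source options for its three "repositories", the Universal Recommender (UR) engine requires Elasticsearch as the metadata store. The event data repository is ideally set to HBase (though I think I saw someone post that they got it working with Postgres). The UR does not really use the model repository, so it can also be configured to use LOCALFS, which is the configuration I have used successfully.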
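
For illustration, the storage section of pio-env.sh for that setup might look roughly like the sketch below. The variable names come from the commented-out examples in the question's own file; the hosts, ports, and vendor paths are assumptions and need to match whatever Elasticsearch and HBase versions are actually installed (for Elasticsearch 1.x, the question's file shows a CLUSTERNAME setting and port 9300 instead).

```shell
# Sketch of the storage section of PredictionIO/conf/pio-env.sh for the UR.
# Variable names are taken from the commented examples in the question's file.
# Hosts, ports, and vendor paths are assumptions; adapt them to your install.

# Storage Repositories
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS

# Storage Data Sources

# Elasticsearch (metadata store, also the source the UR looks up at train time)
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.5.2

# HBase (event data)
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.0.0

# Local file system (model data, which the UR does not really use)
PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models
```

So yes, Elasticsearch has to be running and reachable at the configured host and port before training. After editing the file, running `pio status` is a quick way to confirm that PredictionIO can reach the configured storage backends.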