RandomForest model loading error

Asked: 2015-10-13 14:35:44

Tags: json hadoop elasticsearch apache-spark apache-spark-mllib

My Spark program reads data from ElasticSearch as a JavaPairRDD<String, Map<String, Object>>. The following inner class converts this JavaPairRDD into a JavaRDD<LabeledPoint>:

static class Transformer implements
        Function<Tuple2<String, Map<String, Object>>, LabeledPoint> {

    @Override
    public LabeledPoint call(Tuple2<String, Map<String, Object>> arg0)
            throws Exception {
        HashingTF tf = new HashingTF();
        Map<String, Object> map = (Map<String, Object>) arg0._2();

        // get values from Map
        Set<String> keys = map.keySet();
        List<Object> valuesList = new ArrayList<Object>();
        for (Iterator<String> i = keys.iterator(); i.hasNext();) {
            String key = (String) i.next();
            Object value = (Object) map.get(key);
            valuesList.add(value);
        }
        return new LabeledPoint(1d, tf.transform(valuesList));
    }
}

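The JavaRDD<LabeledPoint> data is used to train a RandomForest model, which is then saved to a specific location on the same machine. The model is saved successfully.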

However, loading the model throws the following exception:

  

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.json4s.package$MappingException: Did not find value which can be converted into java.lang.String
    at org.json4s.reflect.package$.fail(package.scala:96)
    at org.json4s.Extraction$.convert(Extraction.scala:554)
    at org.json4s.Extraction$.extract(Extraction.scala:331)
    at org.json4s.Extraction$.extract(Extraction.scala:42)
    at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21)
    at org.apache.spark.mllib.tree.model.DecisionTreeModel$.load(DecisionTreeModel.scala:326)
    at org.apache.spark.mllib.tree.model.DecisionTreeModel.load(DecisionTreeModel.scala)
    at co.nttd.integration.SparkESIntegration.main(SparkESIntegration.java:96)

The model is trained and saved with the following code:

JavaPairRDD<String, Map<String, Object>> esRDD = JavaEsSpark
        .esRDD(javaSparkContext);
// Transform the data into a JavaRDD<LabeledPoint>
JavaRDD<LabeledPoint> data = esRDD.map(new Transformer());

final RandomForestModel model = RandomForest.trainClassifier(
        inputTrainingData, numClasses, categoricalFeaturesInfo, numTrees,
        featureSubsetStrategy, impurity, maxDepth, maxBins, seed);

model.save(JavaSparkContext.toSparkContext(jsc),
        "myPath");

The model is loaded with:

DecisionTreeModel model = DecisionTreeModel.load(sc.sc(), "file:///myPath");
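
Since the model is saved as a RandomForestModel but loaded through DecisionTreeModel.load, one thing worth checking is which model class was actually recorded on disk. The sketch below is a plain-Java helper that reads the saved metadata and prints its "class" field. It rests on assumptions not stated in the question: that Spark MLlib writes a `metadata` directory of `part-*` text files under the model path, that each holds a JSON record with a "class" field, and that `myPath` is on the local filesystem. The class name `ModelMetadataInspector` and its methods are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: inspects a saved model directory.
class ModelMetadataInspector {

    // Matches "class":"<fully.qualified.Name>" inside a JSON metadata record.
    private static final Pattern CLASS_FIELD =
            Pattern.compile("\"class\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the value of the "class" field, or null if the field is absent.
    static String extractClassName(String metadataJson) {
        Matcher m = CLASS_FIELD.matcher(metadataJson);
        return m.find() ? m.group(1) : null;
    }

    // Scans <modelDir>/metadata/part-* (assumed layout) and returns the
    // first "class" value found, or null if no part file contains one.
    static String savedModelClass(Path modelDir) throws IOException {
        Path metadataDir = modelDir.resolve("metadata");
        try (DirectoryStream<Path> parts =
                     Files.newDirectoryStream(metadataDir, "part-*")) {
            for (Path part : parts) {
                String json = new String(Files.readAllBytes(part),
                        StandardCharsets.UTF_8);
                String className = extractClassName(json);
                if (className != null) {
                    return className;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to "myPath", the directory passed to model.save(...) above.
        Path modelDir = Paths.get(args.length > 0 ? args[0] : "myPath");
        System.out.println("Saved model class: " + savedModelClass(modelDir));
    }
}
```

If the reported class is org.apache.spark.mllib.tree.model.RandomForestModel, the loader does not match the saved type, and RandomForestModel.load(sc.sc(), "file:///myPath") would be the corresponding call to try in place of DecisionTreeModel.load.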

0 answers:

No answers yet.