org.apache.spark.ml.feature.IDF error

Time: 2015-12-01 09:34:14

Tags: scala apache-spark apache-spark-mllib

According to http://spark.apache.org/docs/latest/ml-features.html, the documented import is

import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}

but Spark shows:

scala> import org.apache.spark.ml.feature.IDF
<console>:13: error: object IDF is not a member of package org.apache.spark.ml.feature
       import org.apache.spark.ml.feature.IDF

However, import org.apache.spark.mllib.feature.IDF works fine.

What is the cause of this error? I am new to Spark and Scala.

2 answers:

Answer 0: (score: 1)

I cannot reproduce this in spark-1.4.1. Which version are you using?

scala> import org.apache.spark.ml.feature.IDF
import org.apache.spark.ml.feature.IDF

scala> import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
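To answer the version question from inside the shell, a minimal sketch along these lines may help. It assumes a spark-shell session; `classAvailable` is a hypothetical helper written for this illustration, not part of Spark. It only checks whether a class can be loaded from the current classpath, which is roughly what the failing import depends on.

```scala
import scala.util.Try

// Hypothetical helper: true if the named class can be loaded
// from the current classpath, false if it is missing.
def classAvailable(name: String): Boolean =
  Try(Class.forName(name)).isSuccess

// In spark-shell, `sc` is the predefined SparkContext, so the
// running version can also be printed directly:
// println(sc.version)

// Compare the two packages mentioned in the question.
println(classAvailable("org.apache.spark.ml.feature.IDF"))
println(classAvailable("org.apache.spark.mllib.feature.IDF"))
```

If the first check reports false while the second reports true, the Spark build in use most likely predates the spark-ml version of IDF.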

EDIT1

Spark 1.2.x only contains org.apache.spark.mllib.feature.IDF.

Try searching for IDF here: https://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.mllib.feature.IDF

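Answer 1: (score: 1)

The cause of the error is that the feature.IDF class was only introduced into spark-ml as of Spark 1.4, so earlier versions do not have it. Hence the "object IDF is not a member of package org.apache.spark.ml.feature" error.

On an older version, you can try the spark-mllib IDF class instead.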
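As a rough sketch of that workaround, the following computes TF-IDF with the RDD-based spark-mllib classes. It assumes a spark-shell session where `sc` is already defined; the three sample documents are made up for illustration.

```scala
import org.apache.spark.mllib.feature.{HashingTF, IDF}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Assumes spark-shell, where `sc` (the SparkContext) already exists.
// The sample documents below are invented for this example.
val documents: RDD[Seq[String]] = sc.parallelize(Seq(
  "spark is fast",
  "scala runs on the jvm",
  "spark is written in scala"
)).map(_.split(" ").toSeq)

// Hash each term to a vector index and count occurrences per document.
val hashingTF = new HashingTF()
val tf: RDD[Vector] = hashingTF.transform(documents)
tf.cache() // tf is used twice: once by fit, once by transform

// Fit inverse document frequencies, then rescale the term frequencies.
val idfModel = new IDF().fit(tf)
val tfidf: RDD[Vector] = idfModel.transform(tf)

tfidf.collect().foreach(println)
```

Unlike the spark-ml version, which is a pipeline stage operating on DataFrame columns, this variant works directly on an RDD of term sequences, so there is no Tokenizer step; the split on spaces stands in for it here.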