PySpark TypeError: object of type 'ParamGridBuilder' has no len()

Time: 2019-02-08 16:54:36

Tags: pyspark apache-spark-ml

I am trying to tune a model with PySpark on Databricks.

I get the following error: TypeError: object of type 'ParamGridBuilder' has no len()

My code is listed below.

from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

als = ALS(userCol="userId", itemCol="movieId", ratingCol="rating",
          coldStartStrategy="drop", nonnegative=True, implicitPrefs=False)

# Create a ParamGridBuilder and add the hyperparameters to search over
param_grid = ParamGridBuilder().addGrid(als.rank, [5, 10, 20, 40]).addGrid(als.maxIter, [5, 10, 15, 20]).addGrid(als.regParam, [0.01, 0.001, 0.0001, 0.02])

evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating", predictionCol="prediction")

# Create the cross-validator: the estimator to train, the grid to search, and the evaluator to score with
cv = CrossValidator(estimator=als,
                    estimatorParamMaps=param_grid,
                    evaluator=evaluator,
                    numFolds=5)

model = cv.fit(training)

1 answer:

Answer 0 (score: 0)

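It simply means that your object has no length (unlike a list). CrossValidator calls len() on estimatorParamMaps, and a ParamGridBuilder is only the builder, not the list of parameter maps it produces. So in your line

param_grid = (ParamGridBuilder()
    .addGrid(als.rank, [5, 10, 20, 40])
    .addGrid(als.maxIter, [5, 10, 15, 20])
    .addGrid(als.regParam, [0.01, 0.001, 0.0001, 0.02]))

you should add .build() at the end to actually build the grid. build() returns a list of parameter maps, which is what estimatorParamMaps expects.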
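
To illustrate why the builder has no length while the built grid does, here is a minimal plain-Python sketch. The class TinyGridBuilder is a hypothetical stand-in written for this example, not PySpark's implementation, and it uses string names where PySpark uses Param objects. It only mirrors the general pattern: addGrid records values, and build expands them into a list with one entry per combination.

```python
from itertools import product


class TinyGridBuilder:
    """Hypothetical stand-in for a grid builder; not PySpark's implementation."""

    def __init__(self):
        self._grid = {}

    def addGrid(self, name, values):
        # Record the candidate values and return self so calls can be chained.
        self._grid[name] = list(values)
        return self

    def build(self):
        # Expand the recorded values into one dict per parameter combination.
        names = list(self._grid)
        return [dict(zip(names, combo)) for combo in product(*self._grid.values())]


builder = (TinyGridBuilder()
           .addGrid("rank", [5, 10, 20, 40])
           .addGrid("maxIter", [5, 10, 15, 20])
           .addGrid("regParam", [0.01, 0.001, 0.0001, 0.02]))

try:
    len(builder)  # the builder defines no __len__, so this raises TypeError
    builder_has_len = True
except TypeError:
    builder_has_len = False

param_maps = builder.build()  # a plain list, so len() works
num_maps = len(param_maps)    # 4 * 4 * 4 combinations
print(builder_has_len, num_maps)  # prints: False 64
```

The same grid sizes as in the question give 64 combinations. With numFolds set to 5, that should mean 64 * 5 = 320 model fits during cross-validation, plus a final refit of the best model, so a smaller grid may be worth considering if the run is slow.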