Issue loading a saved Scala XGBoost model from AWS S3

Date: 2017-08-04 08:05:53

Tags: scala apache-spark amazon-s3 xgboost

I am having trouble loading a saved Scala XGBoost model from AWS S3. My code is below. I am able to save the Scala XGBoost model to AWS S3, but I cannot load it back from S3.

val trainingData = sqlContext.read.parquet(path1)

val testData = sqlContext.read.parquet(path2)

val OOTvalData = sqlContext.read.parquet(path3)

// number of iterations
val numRound = 200
val numWorkers = 4
// training parameters
val paramMap = List(
  "eta" -> 0.023f,
  "max_depth" -> 6,
  "min_child_weight" -> 3.0,
  "subsample" -> 1.0,
  "colsample_bytree" -> 0.82,
  "colsample_bylevel" -> 0.9,
  "base_score" -> 0.005,
  "eval_metric" -> "auc",
  "seed" -> 8,
  "silent" -> 1,
  "objective" -> "binary:logistic"
).toMap

println("Starting Xgboost ")

val xgBoostModelWithDF = XGBoost.trainWithDataFrame(trainingData, paramMap, round = numRound, nWorkers = numWorkers, useExternalMemory = true)

xgBoostModelWithDF.write.overwrite().save(path4)


#### I am getting an error at the step below when loading the model from the S3 location
val xgBoostModelWithDF1 = XGBoost.load(path4)

2 Answers:

Answer 0 (score: 0)

I am using Python, and I do two things that you don't:

  1. I read the object from S3, call .read() on it, and put the result into a Python bytearray.
  2. I also initialize a Booster.
  3. This is a Python example; hopefully you can translate it to Scala.
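A rough Scala translation of the two steps above might look like the following. This is only a sketch: it assumes the AWS SDK for Java (`AmazonS3ClientBuilder`, `getObject`), Apache Commons IO (`IOUtils.toByteArray`), and xgboost4j's `XGBoost.loadModel(InputStream)` are on the classpath, and the bucket and key names are placeholders, not from the original post.

```scala
import java.io.ByteArrayInputStream

import com.amazonaws.services.s3.AmazonS3ClientBuilder
import org.apache.commons.io.IOUtils
import ml.dmlc.xgboost4j.scala.{Booster, XGBoost}

// Hypothetical S3 location; replace with your own bucket and key.
val s3 = AmazonS3ClientBuilder.defaultClient()
val obj = s3.getObject("my-bucket", "models/xgboost.model")

// Step 1: read the raw model bytes (the Python .read() -> bytearray step).
val modelBytes: Array[Byte] = IOUtils.toByteArray(obj.getObjectContent)

// Step 2: initialize a Booster from those bytes.
val booster: Booster = XGBoost.loadModel(new ByteArrayInputStream(modelBytes))
```

Note this loads a low-level `Booster`, not the Spark ML wrapper, so it suits scoring single rows rather than DataFrame pipelines.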

Answer 1 (score: 0)

You should load the model with XGBoostModel, not XGBoost.
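In other words, pair the Spark ML `save` with the matching `load` on the model class. A minimal sketch, assuming the xgboost4j-spark version in use exposes an `XGBoostModel` companion object with Spark ML's `MLReadable.load`:

```scala
import ml.dmlc.xgboost4j.scala.spark.XGBoostModel

// The model was persisted with xgBoostModelWithDF.write.overwrite().save(path4).
// XGBoost is only the training entry point; the saved Spark ML model is read
// back through the model class itself.
val xgBoostModelWithDF1 = XGBoostModel.load(path4)
```

The same pattern holds throughout Spark ML persistence: the class that wrote the model directory is the class whose companion object reads it back.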