I am trying to load a CSV file using Scala and Apache Spark. However, as soon as I specify the schema with Spark's StructType to describe the CSV file's header, I run into this problem:
scala> import org.apache.spark
import org.apache.spark
scala> import org.apache.spark.sql
import org.apache.spark.sql
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> import org.apache.spark.sql.types
import org.apache.spark.sql.types
scala> import org.apache.spark.sql.functions
import org.apache.spark.sql.functions
scala> import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.clustering.KMeans
scala> import org.apache.spark.ml.evaluation.ClusteringEvaluator
import org.apache.spark.ml.evaluation.ClusteringEvaluator
scala> import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.feature.VectorAssembler
scala> val sqlContext = new SQLContext(sc)
warning: there was one deprecation warning; re-run with -deprecation for details
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f24a84
scala> import sqlContext.implicits
import sqlContext.implicits
scala> import sqlContext
| val schema = StructType(Array(StructField("ID_CALLE",IntegerType,true),StructField("TIPO", IntegerType, true),StructField("CALLE",IntegerType,true),StructField("NUMERO",IntegerType,true), StructField("LONGITUD",DoubleType,true),StructField("LATITUD",DoubleType,true),StructField("TITULO",IntegerType,true)))
<console>:2: error: '.' expected but ';' found.
val schema = StructType(Array(StructField("ID_CALLE",IntegerType,true),StructField("TIPO", IntegerType, true),StructField("CALLE",IntegerType,true),StructField("NUMERO",IntegerType,true), StructField("LONGITUD",DoubleType,true),StructField("LATITUD",DoubleType,true),StructField("TITULO",IntegerType,true)))
Answer 0: (score: 0)
There is a small typo in your code. If you look at it closely, you will find the error here:
scala> import sqlContext
| val schema = StructType(Array(StructField("ID_CALLE",IntegerType,true),StructField("TIPO", IntegerType, true),StructField("CALLE",IntegerType,true),StructField("NUMERO",IntegerType,true), StructField("LONGITUD",DoubleType,true),StructField("LATITUD",DoubleType,true),StructField("TITULO",IntegerType,true)))
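Everywhere else you typed each new line of code after the scala> prompt, but in the code above you typed it after the | prompt. The | is the REPL's continuation prompt: `import sqlContext` is not a complete import statement (Scala expects a '.' and a selector after the name), so the REPL treated your next line as part of the same statement. That is what the error "'.' expected but ';' found" is reporting.
So just type the following code: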
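scala> import org.apache.spark.sql.types._  // needed for StructType, StructField, IntegerType, DoubleType
scala> import sqlContext._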
scala> val schema = StructType(Array(StructField("ID_CALLE",IntegerType,true),StructField("TIPO", IntegerType, true),StructField("CALLE",IntegerType,true),StructField("NUMERO",IntegerType,true), StructField("LONGITUD",DoubleType,true),StructField("LATITUD",DoubleType,true),StructField("TITULO",IntegerType,true)))
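For completeness, here is a minimal sketch of the whole flow, from imports to reading the CSV with the explicit schema. Note that `import org.apache.spark.sql.types` on its own only brings the package name `types` into scope; the wildcard form `types._` is what makes `StructType`, `StructField`, `IntegerType` and `DoubleType` usable by their short names. The sketch is meant to be pasted into spark-shell. The temporary file, its single sample row, and the app name are hypothetical and exist only so the example is self-contained; substitute the path to your own CSV. The column types are copied unchanged from the question.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
// Wildcard import: brings StructType, StructField, IntegerType, DoubleType into scope
import org.apache.spark.sql.types._

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder()
  .appName("csv-schema-sketch") // hypothetical name
  .master("local[*]")
  .getOrCreate()

// Same schema as in the question, one field per line for readability
val schema = StructType(Array(
  StructField("ID_CALLE", IntegerType, nullable = true),
  StructField("TIPO", IntegerType, nullable = true),
  StructField("CALLE", IntegerType, nullable = true),
  StructField("NUMERO", IntegerType, nullable = true),
  StructField("LONGITUD", DoubleType, nullable = true),
  StructField("LATITUD", DoubleType, nullable = true),
  StructField("TITULO", IntegerType, nullable = true)
))

// Hypothetical sample file so the sketch runs on its own; use your real CSV path instead
val path = Files.createTempFile("calles", ".csv")
Files.write(
  path,
  "ID_CALLE,TIPO,CALLE,NUMERO,LONGITUD,LATITUD,TITULO\n1,2,3,4,-3.7,40.4,5\n".getBytes("UTF-8")
)

// header=true skips the first line; schema(...) applies the types declared above
val df = spark.read
  .option("header", "true")
  .schema(schema)
  .csv(path.toString)

df.printSchema()
```

Passing the schema explicitly avoids the extra pass over the data that `inferSchema` would need, and it keeps the column types fixed regardless of what the file contains.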