Spark Scala year/month function error: not found

Asked: 2018-03-27 05:21:15

Tags: scala apache-spark

My code produces the following errors; can you tell me why?


notebook:28: error: not found: value month
  retail_df = retail_df.withColumn("Month", month(retail_df("Date")))


notebook:29: error: not found: value year
  retail_df = retail_df.withColumn("Year", year(retail_df("Date")))

import org.apache.spark.sql.types._

// Make custom schema
var schema = StructType(Array(
       StructField("Store", IntegerType, true),
       StructField("DayOfWeek", IntegerType, true),
       StructField("Date", DateType, true),
       StructField("Sales", IntegerType, true),
       StructField("Customers", IntegerType, true),
       StructField("Open", IntegerType, true),
       StructField("Promo", IntegerType, true),
       StructField("StateHoliday", StringType, true),
       StructField("SchoolHoliday", StringType, true)))

val retail_dfr = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").schema(schema)
var retail_df = retail_dfr.load("/FileStore/tables/Rossman/train.csv")

val sales_custs_df = retail_df.select( "Store", "Sales", "Customers" )
val retails_open_df = retail_df.where( retail_df("Open") > 0)
val holidays_df = retail_df.filter(($"StateHoliday" === 1) && ($"SchoolHoliday" === 1))
val store_ids = retail_df.select(retail_df("Store")).distinct()
var weekday_promos = retail_df.stat.crosstab( "DayOfWeek" , "Promo" )

weekday_promos = weekday_promos.withColumnRenamed( "DayOfWeek_Promo", "DayOfWeek" )
                             .withColumnRenamed( "0", "NoPromo" )
                             .withColumnRenamed( "1","Promo" )

retail_df = retail_df.withColumn("Month", month(retail_df("Date")))
retail_df = retail_df.withColumn("Year", year(retail_df("Date")))

retail_df.show(5)

1 answer:

Answer 0: (score: 1)

You need to import month and year before they can be used. To import just those two functions, use

import org.apache.spark.sql.functions.{month, year}

or

import org.apache.spark.sql.functions._

which imports all of the available SQL functions. More information on what is available can be found here.
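To show the fix in isolation, here is a minimal sketch that builds a tiny DataFrame inline instead of reading the Rossmann CSV; the local SparkSession, the sample dates, and the DateStr column name are illustrative, not part of the original question:

```scala
import org.apache.spark.sql.SparkSession
// The import that the question's code was missing:
import org.apache.spark.sql.functions.{month, year, to_date}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("month-year-demo")
  .getOrCreate()
import spark.implicits._

// Small inline dataset standing in for the CSV's Date column.
var df = Seq("2015-07-31", "2014-01-15").toDF("DateStr")
  .withColumn("Date", to_date($"DateStr", "yyyy-MM-dd"))

// With month and year imported, the original withColumn calls compile and run.
df = df.withColumn("Month", month($"Date"))
       .withColumn("Year", year($"Date"))

df.show()
```

The same two withColumn lines from the question work unchanged once the import is in scope; only the missing import caused the "not found: value month" and "not found: value year" errors.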