My code produces the following errors. Can you tell me why?
notebook:28: error: not found: value month retail_df = retail_df.withColumn("Month", month(retail_df("Date")))
notebook:29: error: not found: value year retail_df = retail_df.withColumn("Year", year(retail_df("Date")))
import org.apache.spark.sql.types._
// Define a custom schema
var schema = StructType(Array(
StructField("Store", IntegerType, true),
StructField("DayOfWeek", IntegerType, true),
StructField("Date", DateType, true),
StructField("Sales", IntegerType, true),
StructField("Customers", IntegerType, true),
StructField("Open", IntegerType, true),
StructField("Promo", IntegerType, true),
StructField("StateHoliday", StringType, true),
StructField("SchoolHoliday", StringType, true)))
val retail_dfr = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").schema(schema)
var retail_df = retail_dfr.load("/FileStore/tables/Rossman/train.csv")
val sales_custs_df = retail_df.select( "Store", "Sales", "Customers" )
val retails_open_df = retail_df.where( retail_df("Open") > 0)
val holidays_df = retail_df.filter(($"StateHoliday" === 1) && ($"SchoolHoliday" === 1))
val store_ids = retail_df.select(retail_df("Store")).distinct()
var weekday_promos = retail_df.stat.crosstab( "DayOfWeek" , "Promo" )
weekday_promos = weekday_promos.withColumnRenamed( "DayOfWeek_Promo", "DayOfWeek" )
.withColumnRenamed( "0", "NoPromo" )
.withColumnRenamed( "1","Promo" )
retail_df = retail_df.withColumn("Month", month(retail_df("Date")))
retail_df = retail_df.withColumn("Year", year(retail_df("Date")))
retail_df.show(5)
Answer 0 (score: 1)
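The month and year functions must be imported before they can be used. The compiler reports "not found: value month" because no function with that name is in scope. To import them, use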
import org.apache.spark.sql.functions.{month, year}
or
import org.apache.spark.sql.functions._
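to import all available SQL functions. More information on what is available can be found in the Spark API documentation for org.apache.spark.sql.functions.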
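To illustrate the fix in isolation, here is a minimal sketch. It assumes Spark 2.x or later (so it uses SparkSession rather than the sqlContext from the question) and a local master. The object name MonthYearExample, the helper withMonthAndYear, and the two sample rows are hypothetical and made up for illustration; they are not part of the original notebook.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
// Without this import, month and year are "not found" at compile time.
import org.apache.spark.sql.functions.{col, month, to_date, year}

object MonthYearExample {
  // Hypothetical helper: adds "Month" and "Year" columns derived from
  // an existing DateType column named "Date".
  def withMonthAndYear(df: DataFrame): DataFrame =
    df.withColumn("Month", month(col("Date")))
      .withColumn("Year", year(col("Date")))

  def main(args: Array[String]): Unit = {
    // Assumption: running locally; in a Databricks notebook a session
    // named `spark` already exists and this builder call is unnecessary.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("month-year-example")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows; the string column is parsed into a DateType column.
    val sample = Seq(("2015-07-31", 5263), ("2014-12-24", 4610))
      .toDF("Date", "Sales")
      .withColumn("Date", to_date(col("Date")))

    withMonthAndYear(sample).show()
    spark.stop()
  }
}
```

In the notebook from the question, adding either import line at the top of the cell should be enough; the two withColumn calls can stay as they are.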