My input data looks like this:
Customer_ID,General,General
Channel,Nominal,Character
WeekDateSunday,Discrete,Numeric
RevenueWeekN01,Continuous,Numeric
RevenueWeekN02,Continuous,Numeric
RevenueWeekN03,Continuous,Numeric
RevenueWeekN04,Continuous,Numeric
RevenueWeekN05,Continuous,Numeric
RevenueWeekN06,Continuous,Numeric
RevenueWeekN07,Continuous,Numeric
RevenueWeekN08,Continuous,Numeric
I need the data below, which only adds one column (a StructField derived from the third column):
Customer_ID,General,General, StructFieldType
Channel,Nominal,Character, StructField(Channel,StringType(), True)
WeekDateSunday,Discrete,Numeric, StructField(WeekDateSunday,DoubleType(), True)
RevenueWeekN01,Continuous,Numeric, StructField(RevenueWeekN01,DoubleType(), True)
RevenueWeekN02,Continuous,Numeric, StructField(RevenueWeekN02,DoubleType(), True)
RevenueWeekN03,Continuous,Numeric, StructField(RevenueWeekN03,DoubleType(), True)
RevenueWeekN04,Continuous,Numeric, StructField(RevenueWeekN04,DoubleType(), True)
RevenueWeekN05,Continuous,Numeric, StructField(RevenueWeekN05,DoubleType(), True)
RevenueWeekN06,Continuous,Numeric, StructField(RevenueWeekN06,DoubleType(), True)
RevenueWeekN07,Continuous,Numeric, StructField(RevenueWeekN07,DoubleType(), True)
RevenueWeekN08,Continuous,Numeric, StructField(RevenueWeekN08,DoubleType(), True)
This is the code I am using. Is it correct?
data_type.withColumn('structformat',when(col("Description") == 'Numeric', StructField(col("Field_Name"),DoubleType(), True)).otherwise(StructField(col("Field_Name"),StringType(), True))).show()
When I run it, it throws the following error:
AssertionError: field name should be string
Answer 0 (score: -1)
The error may be caused by the single quotes. Change them to double quotes and the error should go away:
data_type.withColumn("structformat",when(col("Description") == "Numeric", StructField(col("Field_Name"),DoubleType(), True)).otherwise(StructField(col("Field_Name"),StringType(), True))).show()
If you still run into any problem, please leave a comment, and if this helped, please accept the answer.
Edit:
Customer_ID,General,General, StructFieldType
Channel,Nominal,Character, StructField("Channel",StringType(), True)
WeekDateSunday,Discrete,Numeric, StructField("WeekDateSunday",DoubleType(), True)
RevenueWeekN01,Continuous,Numeric, StructField("RevenueWeekN01",DoubleType(), True)
RevenueWeekN02,Continuous,Numeric, StructField("RevenueWeekN02",DoubleType(), True)
RevenueWeekN03,Continuous,Numeric, StructField("RevenueWeekN03",DoubleType(), True)
RevenueWeekN04,Continuous,Numeric, StructField("RevenueWeekN04",DoubleType(), True)
RevenueWeekN05,Continuous,Numeric, StructField("RevenueWeekN05",DoubleType(), True)
RevenueWeekN06,Continuous,Numeric, StructField("RevenueWeekN06",DoubleType(), True)
RevenueWeekN07,Continuous,Numeric, StructField("RevenueWeekN07",DoubleType(), True)
RevenueWeekN08,Continuous,Numeric, StructField("RevenueWeekN08",DoubleType(), True)
Give it a try.
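If switching the quotes makes no difference (Python treats single and double quotes the same), the assertion is most likely raised by StructField itself: its first argument has to be a plain Python string, while col("Field_Name") is a Column object. StructField is also a schema object built on the driver rather than a per-row value, so it probably cannot serve as the value in when(...) either. Since the desired column is just text, one option is to build that text per row. Below is a minimal plain-Python sketch of that mapping, not a definitive implementation; the names struct_field_text, add_struct_field_column and the sample string are made up for illustration, and the fallback to StringType is an assumption that mirrors the otherwise(...) branch in the question.

```python
import csv
import io

# Assumption: only "Numeric" maps to DoubleType; everything else falls back
# to StringType, like the otherwise(...) branch in the question.
SPARK_TYPE_BY_DESCRIPTION = {"Numeric": "DoubleType()"}


def struct_field_text(field_name, description):
    """Return the StructField(...) text for one metadata row (hypothetical helper)."""
    spark_type = SPARK_TYPE_BY_DESCRIPTION.get(description.strip(), "StringType()")
    return f"StructField({field_name},{spark_type}, True)"


def add_struct_field_column(csv_text):
    """Append the StructField text as a fourth column to every non-empty row."""
    rows = csv.reader(io.StringIO(csv_text))
    return [row + [struct_field_text(row[0], row[2])] for row in rows if row]


# Two rows taken from the question's input, used here as sample data.
sample = "Channel,Nominal,Character\nWeekDateSunday,Discrete,Numeric\n"
for row in add_struct_field_column(sample):
    print(",".join(row))
```

In Spark itself the same string could presumably be assembled with concat and lit inside when/otherwise. If a real schema is needed rather than text, the rows would have to be collected first and StructField(name, DoubleType(), True) called with plain strings on the driver.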