Schema:
root
|-- col_a: struct (nullable = true)
| |-- $numberLong: string (nullable = true)
|-- col_b: string (nullable = true)
|-- col_c: struct (nullable = true)
| |-- $numberLong: string (nullable = true)
Code that flattens the col_a struct:
df = df.select($"col_a.*",$"col_b",$"col_c")
df.printSchema()
Output:
|-- $numberLong: string (nullable = true)
|-- col_b: string (nullable = true)
|-- col_c: struct (nullable = true)
| |-- $numberLong: string (nullable = true)
Now, when I try to select only the first column ("$numberLong") and rename it:
df = df.select($"$numberLong".as("test"))
I get the following error:
error: not found: value numberLong
df = df.select($"$numberLong")
^
I can't understand the reason for the error when the column clearly exists.
Answer 0 (score: 0)
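If a column name starts with $, you cannot reference it directly with $"colName", even if you wrap colName in backticks. The reason is that $"..." is a Scala string interpolator, so $numberLong inside the quotes is parsed as a reference to a Scala value named numberLong, which is why the compiler reports "not found: value numberLong". Use col("colName") instead, as shown below: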
case class A(`$numberLong`: String)
val df = Seq(
(A("x1"), "d1", A("y1")),
(A("x2"), "d2", A("y2")),
(A("x3"), "d3", A("y3"))
).toDF("col_a", "col_b", "col_c")
val df2 = df.select($"col_a.*", $"col_b", $"col_c")
df2.printSchema
// root
// |-- $numberLong: string (nullable = true)
// |-- col_b: string (nullable = true)
// |-- col_c: struct (nullable = true)
// | |-- $numberLong: string (nullable = true)
df2.select(col("$numberLong").as("test")).printSchema
// root
// |-- test: string (nullable = true)
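For comparison, here is a minimal sketch of a few other ways to reach the same column. The object name DollarColumnDemo, the helper variants, and the sample rows are hypothetical and exist only for illustration; the sketch assumes Spark SQL is on the classpath. The last variant relies on Scala's `$$` escape inside interpolated strings, which should yield a literal `$` in the column name, but treat that as an assumption to verify on your Spark and Scala versions.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object DollarColumnDemo {
  // Builds the same single-column selection ("test") in four different ways.
  def variants(spark: SparkSession): Seq[DataFrame] = {
    import spark.implicits._

    // Hypothetical sample data standing in for the flattened DataFrame.
    val flat = Seq(("x1", "d1"), ("x2", "d2")).toDF("$numberLong", "col_b")

    Seq(
      // 1. col(...) takes a plain string, so no interpolation happens.
      flat.select(col("$numberLong").as("test")),
      // 2. Dataset.apply also takes a plain string.
      flat.select(flat("$numberLong").as("test")),
      // 3. Rename first, then select by the new name.
      flat.withColumnRenamed("$numberLong", "test").select("test"),
      // 4. Escape the dollar sign inside the interpolator with "$$".
      flat.select($"$$numberLong".as("test"))
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dollar-column-demo")
      .getOrCreate()
    variants(spark).foreach(_.printSchema())
    spark.stop()
  }
}
```

Variants 1 to 3 avoid the interpolator entirely, which keeps the code readable when column names come from JSON exports with keys such as $numberLong.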