"not found: value" error in Spark Scala

Time: 2018-08-13 23:02:17

Tags: scala apache-spark

Schema:

root
 |-- col_a: struct (nullable = true)
 |    |-- $numberLong: string (nullable = true)
 |-- col_b: string (nullable = true)
 |-- col_c: struct (nullable = true)
 |    |-- $numberLong: string (nullable = true)

Code to flatten the col_a struct:

df = df.select($"col_a.*",$"col_b",$"col_c")
df.printSchema()

Output:

|-- $numberLong: string (nullable = true)
|-- col_b: string (nullable = true)
|-- col_c: struct (nullable = true)
|    |-- $numberLong: string (nullable = true)

Now, when I try to select only the first column ("$numberLong") and rename it:

df = df.select($"$numberLong".as("test"))

I get the following error:

error: not found: value numberLong
df = df.select($"$numberLong")
                  ^

I can't understand the reason for the error when the column clearly exists.

1 Answer:

Answer 0 (score: 0)

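If a column name starts with $, you cannot refer to the column by writing $"colName" as-is, even if you wrap colName in backticks. The $"..." syntax is a Scala string interpolator, so a $ inside the quotes is read as the start of an interpolated Scala value (here, a value named numberLong that does not exist). Use col("colName") instead, as shown below: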

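// Assumes an active SparkSession named `spark` (as in spark-shell)
import org.apache.spark.sql.functions.col
import spark.implicits._

case class A(`$numberLong`: String)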

val df = Seq(
  (A("x1"), "d1", A("y1")),
  (A("x2"), "d2", A("y2")),
  (A("x3"), "d3", A("y3"))
).toDF("col_a", "col_b", "col_c")

val df2 = df.select($"col_a.*", $"col_b", $"col_c")

df2.printSchema
// root
//  |-- $numberLong: string (nullable = true)
//  |-- col_b: string (nullable = true)
//  |-- col_c: struct (nullable = true)
//  |    |-- $numberLong: string (nullable = true)

df2.select(col("$numberLong").as("test")).printSchema
// root
//  |-- test: string (nullable = true)
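
To illustrate why the compiler reports "not found: value numberLong", here is a minimal sketch in plain Scala with no Spark dependency. ColName and StringToColName are hypothetical stand-ins, written only to mimic the general shape of a $"..." interpolator; they are not Spark's actual classes. The sketch also shows that doubling the dollar sign ($$) is the standard Scala way to get a literal $ inside an interpolated string.

```scala
// Hypothetical stand-in for a column reference; NOT Spark's class.
final case class ColName(name: String)

object DollarInterpolatorSketch {
  // `$"..."` desugars to StringContext(...).$(...), so anything after a
  // single `$` inside the quotes is parsed as a Scala expression.
  implicit class StringToColName(val sc: StringContext) extends AnyVal {
    def $(args: Any*): ColName = ColName(sc.s(args: _*))
  }

  def main(args: Array[String]): Unit = {
    // No `$` inside the quotes: the text is taken literally.
    val plain = $"col_b"

    // `$$` is the interpolator escape for one literal `$`.
    val escaped = $"$$numberLong"

    println(plain.name)   // prints: col_b
    println(escaped.name) // prints: $numberLong

    // A single `$` here would be a compile error, because the compiler
    // looks for a Scala value called numberLong:
    //   error: not found: value numberLong
  }
}
```

Under the same reasoning, writing $"$$numberLong" in Spark should resolve to the column named $numberLong, although col("$numberLong") is arguably easier to read and avoids the escaping question altogether.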