Question

My Spark DataFrame has a column that looks like this:

"drop":{"dropPath":"https://dropserv.content25.ec2.st-av.net/drop?source_id: string (nullable = true)

I need to run a select query on it. I tried the following command, but got an error:

df.select('"drop":{"dropPath":"https://dropserv.content25.ec2.st-av.net/drop?source_id').show(10)     

error: unclosed character literal

My DataFrame schema is:

scala> df.printSchema()

root
 |-- metadata: struct (nullable = true)
 |    |-- "drop":{"dropPath":"https://dropserv.content25.ec2.st-av.net/drop?source_id: string (nullable = true)
 |-- url: string (nullable = true)

I also tried the following, but got the same error:

  df.select(('`"drop":{"dropPath":"https://mediaserv.media27.ec2.st-av.net/drop?source_id`').show()

Answer 1

You can use backticks (`) for this.

df.select('drop.`dropPath`.`https://dropserv.content25.ec2.st-av.net/drop?source_id`').show(10)
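As a hedged illustration in Scala (the shell used in the question), the sketch below builds the backtick-quoted column path as a plain string. In Scala, single quotes delimit Char literals, which is what triggers "unclosed character literal", so the path goes in a double-quoted string with the inner double quotes escaped. The object name ColumnPaths and its helpers quote and path are hypothetical names for this example. The field is assumed to sit under the metadata struct, as the printSchema output shows. Doubling an embedded backtick follows Spark SQL's identifier quoting convention, which this particular name does not need.

```scala
object ColumnPaths {
  // Wrap one name part in backticks; a literal backtick inside the name is doubled.
  def quote(part: String): String = "`" + part.replace("`", "``") + "`"

  // Quote every part of a nested path and join the parts with dots.
  def path(parts: String*): String = parts.map(quote).mkString(".")

  // Field name copied from the printSchema output; inner double quotes are escaped.
  val fieldName: String =
    "\"drop\":{\"dropPath\":\"https://dropserv.content25.ec2.st-av.net/drop?source_id"

  // Full path: the struct column first, then the oddly named field.
  val dropColumn: String = path("metadata", fieldName)
}
```

In spark-shell, after import org.apache.spark.sql.functions.col, the result would be used as df.select(col(ColumnPaths.dropColumn)).show(10). An alternative that avoids parsing the name at all is df.select(col("metadata").getField(ColumnPaths.fieldName)).show(10). Neither call has been run against the asker's data.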

Answer 2

OK, so the problem is the dots (.) in the column name. If you remove all the dots, you will find it works fine.

You can select the column by name like this:

#Add ` in the start and end of string while selecting.
df.select('`"drop":{"dropPath":"https://dropserv.content25.ec2.st-av.net/drop?source_id`')
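If renaming is an option, the point about removing the dots can be taken one step further by deriving a name with no special characters at all. The following is a minimal Scala sketch under that assumption. ColumnNames and sanitize are hypothetical names, and the replacement rule (underscores for anything other than letters, digits, and underscores) is only one possible choice.

```scala
object ColumnNames {
  // Replace each run of characters other than ASCII letters, digits, or
  // underscores with a single underscore, then trim underscores at both ends.
  def sanitize(name: String): String =
    name.replaceAll("[^A-Za-z0-9_]+", "_").replaceAll("^_+|_+$", "")

  // Field name copied from the printSchema output; inner double quotes are escaped.
  val fieldName: String =
    "\"drop\":{\"dropPath\":\"https://dropserv.content25.ec2.st-av.net/drop?source_id"

  // A name that needs no backticks or escaping when selected later.
  val cleanName: String = sanitize(fieldName)
}
```

The sanitized name could then be attached when the column is first selected, for example by calling .as(ColumnNames.cleanName) on the selected Column, so that later queries refer to a plain identifier.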

Selecting a Spark column whose name contains double quotes
