Spark-CSV sqlContext

Date: 2016-05-18 11:16:22

Tags: csv apache-spark

I am running the following query in spark-shell.

sqlContext.sql("select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('0' as varchar(3)),case when x0 = '' then cast(null as float) else cast(x0 as float) end from tasmaxload UNION ALL
select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('1' as varchar(3)),case when x1 = '' then cast(null as float) else cast(x1 as float) end from tasmaxload").registerTempTable("testcast");

This throws an "unclosed string literal" error at the line breaks.

I then realized that if the query is written on a single line, as below, there is no error and it runs fine.

sqlContext.sql("select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('0' as varchar(3)),case when x0 = '' then cast(null as float) else cast(x0 as float) end from tasmaxload UNION ALL select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('1' as varchar(3)),case when x1 = '' then cast(null as float) else cast(x1 as float) end from tasmaxload").registerTempTable("testcast");

However, is there a way to handle this without putting everything on one line?

I ask because the original query spans more than 150 lines, and I cannot keep converting it to a single line.

Can someone help me with this?

FYI: I also tried using :paste mode.

Thanks in advance.

1 answer:

Answer 0 (score: 0)

OK, my mistake.

This question was already answered on the forum.

It was solved by changing the single double quotes to triple double quotes, as shown below:

sqlContext.sql("""select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('0' as varchar(3)),case when x0 = '' then cast(null as float) else cast(x0 as float) end from tasmaxload UNION ALL
select cast(ts_time as varchar(10)),cast(y as varchar(10)),cast('1' as varchar(3)),case when x1 = '' then cast(null as float) else cast(x1 as float) end from tasmaxload""").registerTempTable("testcast");
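The reason this works: in Scala, an ordinary "..." string literal cannot contain a raw line break, but a triple-quoted """...""" literal can span any number of lines, and Spark's SQL parser treats the embedded newlines as ordinary whitespace. A minimal pure-Scala sketch of the behavior (no Spark needed; the shortened query here is just for illustration):

```scala
object TripleQuoteDemo {
  def main(args: Array[String]): Unit = {
    // A regular "..." literal would be a syntax error if it were split
    // across lines; a """...""" literal keeps the line breaks as part
    // of the string value.
    val query = """select cast(ts_time as varchar(10)) from tasmaxload
UNION ALL
select cast(ts_time as varchar(10)) from tasmaxload"""

    // The literal spans three source lines, and the string value
    // contains the two embedded newlines.
    assert(query.split("\n").length == 3)
    println(query)
  }
}
```

So a 150-line query can be pasted between triple quotes unchanged and passed to sqlContext.sql as one string.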

Hope this helps others too. Thanks.