Question

I have a column of type array<string> in my Spark tables. I am using SQL to query these Spark tables. I want to convert the array<string> to a string.

When I use the following syntax:

select cast(rate_plan_code as string) as new_rate_plan from customer_activity_searches group by rate_plan_code

the rate_plan_code column contains the following values:

["AAA","RACK","SMOBIX","SMOBPX"]
["LPCT","RACK"]
["LFTIN","RACK","SMOBIX","SMOBPX"]
["LTGD","RACK"]
["RACK","LEARLI","NHDP","LADV","LADV2"]

and the new_rate_plan column is populated with the following:

org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@e4273d9f
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@c1ade2ff
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@4f378397
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@d1c81377
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData@552f3317

Cast seems to work when I convert between numeric types (for example, decimal to int), but not in this case. I am curious why the cast does not work here. Any help is much appreciated.

Answer 1

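In Spark 2.1+, to concatenate the values in a single array column, you can use any of the following:

concat_ws standard function
map operator
a user-defined function (UDF)

concat_ws standard function

Use the concat_ws function.

concat_ws(sep: String, exprs: Column*): Column
Concatenates multiple input string columns together into a single string column, using the given separator.

// spark-shell session; outside the shell, also import spark.implicits._
import org.apache.spark.sql.functions.concat_ws
// Assumed sample data: one array<string> column named "words"
val words = Seq(Seq("hello", "world")).toDF("words")
val solution = words.withColumn("codes", concat_ws(" ", $"words"))
scala> solution.show
+--------------+-----------+
|         words|      codes|
+--------------+-----------+
|[hello, world]|hello world|
+--------------+-----------+

map operator

Use the map operator to have full control over what is transformed and how.

map[U](func: (T) ⇒ U): Dataset[U]
Returns a new Dataset that contains the result of applying func to each element.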

Answer 2

You can cast the array to a string when you create this df, rather than at output time:

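from pyspark.sql import functions as F

# Collect 'bbb' per group and cast the resulting array to a string right away
newdf = (
    df.groupBy('aaa')
      .agg(F.collect_list('bbb').cast('string').alias('ccc'))
)

outputdf = newdf.select(
    F.concat_ws(', ', newdf.aaa, F.format_string('xxxxx(%s)', newdf.ccc)))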

Answer 3

In Spark 2.1+, you can use concat_ws directly to convert (concatenate with a separator) a string or array<string> into a string.

select concat_ws(',',rate_plan_code) as new_rate_plan  from
customer_activity_searches group by rate_plan_code

This will give you output like:

AAA,RACK,SMOBIX,SMOBPX 
LPCT,RACK
LFTIN,RACK,SMOBIX,SMOBPX
LTGD,RACK 
RACK,LEARLI,NHDP,LADV,LADV2

PS: concat_ws does not work on types like array<long>; for those, a UDF or map would be the only option, as Jacek said.
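For arrays whose elements are not strings, the UDF route could look roughly like the sketch below. It is only an illustration: the names join_array, register_array_to_string and the SQL function name array_to_string are hypothetical and not part of Spark, and the registration step assumes an existing SparkSession with pyspark installed. The joining logic itself is plain Python, so it can be tried without Spark.

```python
def join_array(values, sep=","):
    """Join the elements of a list into one string, converting each with str().

    Hypothetical helper intended to be used as the body of a Spark UDF.
    """
    if values is None:  # a SQL NULL array arrives in Python as None
        return None
    # Skip NULL elements, which is also what concat_ws does for string arrays
    return sep.join(str(v) for v in values if v is not None)


def register_array_to_string(spark):
    """Register join_array as a SQL function on an existing SparkSession.

    Assumption: `spark` is a SparkSession; the function name is made up here.
    """
    from pyspark.sql.types import StringType  # imported lazily; needs pyspark
    spark.udf.register("array_to_string", join_array, StringType())


# The joining logic can be checked without a Spark cluster
print(join_array([101, 202, 303]))
print(join_array(["LPCT", "RACK"]))
```

After calling register_array_to_string(spark), a query such as select array_to_string(rate_plan_code) as new_rate_plan from customer_activity_searches should then be possible. A Python UDF is usually slower than a built-in function, so concat_ws is still the better choice where it applies.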

Answer 4

The way to do what you want in SQL is to use the built-in SQL function string():

select string(rate_plan_code) as new_rate_plan  from
customer_activity_searches group by rate_plan_code

How do I convert a column of string arrays to a string?
