How to concatenate two string columns into one column in Spark (Python)

Time: 2016-04-23 20:53:44

Tags: python apache-spark dataframe

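I want to concatenate two columns of a DataFrame into a single column. Specifically, I want to merge nameFirst and nameLast into a column named FULL Name.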

+---------+---------+--------+
| playerID|nameFirst|nameLast|
+---------+---------+--------+
|aardsda01|    David| Aardsma|
|aaronha01|     Hank|   Aaron|
|aaronto01|   Tommie|   Aaron|
| aasedo01|      Don|    Aase|
+---------+---------+--------+

I am trying this code:

sqlContext.sql("SELECT playerID,(nameFirst+nameLast) as full_name FROM Master")

But it returns:

+---------+---------+
| playerID|full_name|
+---------+---------+
|aardsda01|     null|
|aaronha01|     null|
|aaronto01|     null|
| aasedo01|     null|
+---------+---------+

Any help would be appreciated.

1 Answer:

Answer 0: (score: 2)

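Just use the concat function (in Spark SQL, + is numeric addition, so non-numeric strings such as these names are cast to null, which is why the query returns null):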

sqlContext.sql("SELECT playerID, concat(nameFirst, nameLast) as full_name FROM Master")