How to automatically create a primary key in Spark Scala?

Asked: 2017-11-21 08:52:58

Tags: scala apache-spark

+-------+----------+-------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|Country|       DOB|     app_name|             contact|               email|              friend|                name|               phone|                 UID|
+-------+----------+-------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+
|  India|25/03/1995|        [IMT]|[[India,123,Wrapp...|[romaahire11@gmai...|[Kashmira G, Bhar...|[Roma Ahire, Roma A]|[9028512546, 7276...|059fe797-c296-46d...|
|  India|22/05/1978|[IMT, ozmott]|[[India,595,Wrapp...|[azeem@yahoo.com,...|[Prjakta W, Praja...|[Azeem, Azeem Seikh]|        [9785213564]|454bc185-5de0-427...|
|  India|22/05/1978|[IMT, ozmott]|[[USA,789,Wrapped...|[praj@yahoo.com, ...|[Gouri Abhyankar,...|[Prajakta W, Praj...|        [9785213564]|91897109-9fd2-4f3...|
+-------+----------+-------------+--------------------+--------------------+--------------------+--------------------+--------------------+--------------------+

The output above is from the DataFrame I registered with registerTempTable.

1 Answer:

Answer 0: (score: 0)

You can use monotonically_increasing_id to assign each row a unique ID.

import org.apache.spark.sql.functions.monotonically_increasing_id

// monotonically_increasing_id() is the Spark 2.x name for the deprecated
// monotonicallyIncreasingId; it assigns each row a unique 64-bit ID
val dfWithKeys = df.withColumn("Keys", monotonically_increasing_id())
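
Note that monotonically_increasing_id() guarantees uniqueness but not consecutive values: the ID encodes the partition number in its upper bits, so the numbers jump between partitions. If you need strictly sequential keys (1, 2, 3, ...), a common alternative is row_number over a window. Below is a minimal sketch; since the window has no partitioning clause, Spark pulls all rows into a single partition, so this only suits small-to-medium DataFrames:

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{monotonically_increasing_id, row_number}

// Order by the monotonic ID to keep the existing row order, then
// number the rows 1..N; the unpartitioned window runs on one partition
val w = Window.orderBy(monotonically_increasing_id())
val dfSequential = df.withColumn("Keys", row_number().over(w))

The keyed DataFrame can then be registered for SQL queries as before, e.g. dfWithKeys.createOrReplaceTempView("users") (the view name "users" is just an example; createOrReplaceTempView supersedes registerTempTable in Spark 2.x).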