Reflect changes from the source Spark DataFrame in the target Spark DataFrame

Time: 2017-08-23 05:20:36

Tags: apache-spark dataframe formatting

Please help me solve the following problem.

Source Spark DataFrame:

+---+-------+-------+
|key|   city|flag   |
+---+-------+-------+
|  1|  Noida|active |
|  2|Gurgaon|active |
|  3|  Delhi|active |
+---+-------+-------+

Target Spark DataFrame:

+---+-------+------------+-------------+
|key|   city|sarogate_key|flag         |
+---+-------+------------+-------------+
|  1|  Noida|           1|active       |
|  3|  Delhi|           3|active       |
|  2|Gurgaon|           2|active       |
+---+-------+------------+-------------+

In the source Spark DataFrame, the city for key 1 changes from Noida to Mumbai:

+---+-------+-------+
|key|   city|flag   |
+---+-------+-------+
|  1| Mumbai|active |
|  2|Gurgaon|active |
|  3|  Delhi|active |
+---+-------+-------+

Now we need to change the target Spark DataFrame to the following form, keeping the history:

+---+-------+------------+-------------+
|key|   city|sarogate_key|flag         |
+---+-------+------------+-------------+
|  1|  Noida|           1|de-active    |
|  3|  Delhi|           3|active       |
|  2|Gurgaon|           2|active       |
|  4| Mumbai|           4|active       |
+---+-------+------------+-------------+
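
The update described above (mark the old row `de-active`, append the changed row as a new `active` row) can be sketched in plain Python to make the logic explicit. This is only an illustration under stated assumptions, not Spark code: rows are modelled as dictionaries, `apply_changes` is a hypothetical helper name, and the new surrogate key is assumed to be the current maximum plus one. The sketch also assumes the new row keeps its natural key (1) and only gets a new `sarogate_key` (4); the table above shows 4 in the `key` column as well, so adjust that if it is really intended.

```python
def apply_changes(target, source):
    """Return a new target: changed rows are de-activated, new versions appended.

    Hypothetical helper for illustration; rows are plain dicts, not Spark Rows.
    """
    # Index the currently active target rows by natural key.
    active = {row["key"]: row for row in target if row["flag"] == "active"}

    # Assumption: surrogate keys continue from the current maximum.
    next_sk = max((row["sarogate_key"] for row in target), default=0) + 1

    changed_keys = set()
    new_rows = []
    for src in source:
        current = active.get(src["key"])
        # A row is new if the key is unknown, changed if the city differs.
        if current is None or current["city"] != src["city"]:
            if current is not None:
                changed_keys.add(src["key"])
            new_rows.append({
                "key": src["key"],
                "city": src["city"],
                "sarogate_key": next_sk,
                "flag": "active",
            })
            next_sk += 1

    # Copy the old rows, flipping the flag on the versions that were superseded.
    updated = []
    for row in target:
        copy = dict(row)
        if row["flag"] == "active" and row["key"] in changed_keys:
            copy["flag"] = "de-active"
        updated.append(copy)

    return updated + new_rows


target = [
    {"key": 1, "city": "Noida", "sarogate_key": 1, "flag": "active"},
    {"key": 3, "city": "Delhi", "sarogate_key": 3, "flag": "active"},
    {"key": 2, "city": "Gurgaon", "sarogate_key": 2, "flag": "active"},
]
source = [
    {"key": 1, "city": "Mumbai", "flag": "active"},
    {"key": 2, "city": "Gurgaon", "flag": "active"},
    {"key": 3, "city": "Delhi", "flag": "active"},
]

result = apply_changes(target, source)
for row in result:
    print(row)
```

In Spark, the same three steps would typically map to a join of source and target on `key`, a `when(...).otherwise(...)` expression on `flag`, and a `union` of the updated rows with the new ones; that translation is not shown here.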

0 answers:

No answers