Question

I have data in a JSON DataFrame, as shown below.

{"nm": 1233, "date": "2017-01-23", "name": [],"id": "9253194"}
{"nm": 1234, "date": "2017-01-23", "name": [],"id": "9253196"}
{"nm": 1235, "date": "2017-01-23", "name": [],"id": "9253195"}

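How can I add a new row carrying the index metadata before each record, so that the data can be inserted into Elasticsearch from Scala? The desired output is: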

{"create": {"_type": "usd", "_id": "92531964", "_index": "amount"}}
{"nm": 1233, "date": "2017-01-23", "name": [],"id": "9253194"}
{"create": {"_type": "usd", "_id": "92531966", "_index": "amount"}}
{"nm": 1234, "date": "2017-01-23", "name": [],"id": "9253196"}
{"create": {"_type": "usd", "_id": "92531965", "_index": "amount"}}
{"nm": 1235, "date": "2017-01-23", "name": [],"id": "9253195"}

Here _id is derived from an existing column, while _type and _index are constants.

Answer 1

Use flatMap:

input.flatMap { x => Seq(transform(x), x) }

Because the two kinds of records have different schemas, you will probably have to output them simply as strings.
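
To make the flatMap idea concrete, here is a minimal sketch on plain Scala collections. It assumes each row has already been parsed into a small case class; Record, BulkLines, deriveId, actionLine, sourceLine and toBulk are hypothetical names introduced only for illustration, and the question does not say exactly how _id is computed, so deriveId simply reuses the id column. The action line is emitted before the record, matching the desired output in the question.

```scala
// Hypothetical record type standing in for one row of the input data.
// The "name" field is always an empty array in the sample, so it is not modelled here.
final case class Record(nm: Int, date: String, id: String)

object BulkLines {
  // Constants taken from the question.
  val IndexName = "amount"
  val TypeName  = "usd"

  // Assumption: the exact derivation of _id is not given in the question,
  // so this placeholder just reuses the existing id column.
  def deriveId(r: Record): String = r.id

  // The metadata ("create") line that precedes each record.
  def actionLine(r: Record): String =
    s"""{"create": {"_type": "$TypeName", "_id": "${deriveId(r)}", "_index": "$IndexName"}}"""

  // The record itself, rendered as a JSON string.
  // Hand-built JSON does no escaping; a JSON library would be safer for arbitrary values.
  def sourceLine(r: Record): String =
    s"""{"nm": ${r.nm}, "date": "${r.date}", "name": [],"id": "${r.id}"}"""

  // flatMap turns every record into two output strings: action line, then record.
  def toBulk(records: Seq[Record]): Seq[String] =
    records.flatMap(r => Seq(actionLine(r), sourceLine(r)))
}
```

For example, BulkLines.toBulk(Seq(Record(1233, "2017-01-23", "9253194"))).foreach(println) prints one create line followed by the record. The same flatMap shape should carry over to a Spark Dataset of strings, although that part is not shown here and would additionally need an encoder in scope.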

