I have a table with more than 3 million records.
I ran a SQL query against the table, and it returned the following top 10 records.
SQL query:
SELECT top 10 ACCOUNTNO, VEHICLENUMBER, CUSTOMERID FROM [ISSUER].[HISTORY].[TP_CUSTOMER_PREPAIDACCOUNTS] GROUP BY ACCOUNTNO, VEHICLENUMBER, CUSTOMERID ORDER BY ACCOUNTNO
ACCOUNTNO VEHICLENUMBER CUSTOMERID
10003014 MH43AJ411 20000000
10003014 MH43AJ411 20000001
10003015 MH12GZ3392 20000002
10003016 GJ15Z8173 20000003
10003018 MH05AM902 20000004
10003019 GJ15CB727 20000008
10003019 GJ15CD7387 20029961
10003019 GJ15CD7477 20001690
10003019 GJ15CD7657 20001866
10003019 MH02DG7774 20000933
I need to design and export a JSON file that should look like this:
{
"ACCOUNTNO":10003014,
"VEHICLE": [
{ "VEHICLENUMBER":"MH43AJ411", "CUSTOMERID":20000000},
{ "VEHICLENUMBER":"MH43AJ411", "CUSTOMERID":20000001}
],
"ACCOUNTNO":10003015,
"VEHICLE": [
{ "VEHICLENUMBER":"MH12GZ3392", "CUSTOMERID":20000002}
]
}
I ran the following code in my Spark program:
jdbcDF.registerTempTable("tp_customer_account")
val res00 = sqlContext.sql("SELECT ACCOUNTNO, collect_list(struct(`VEHICLENUMBER`, `CUSTOMERID`)) as VEHICLE FROM tp_customer_account GROUP BY ACCOUNTNO ORDER BY ACCOUNTNO")
res00.coalesce(1).write.json("D:/res06")
This is the result I get from the code above:
{"ACCOUNTNO":10003014,"VEHICLE":[{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000001},{"VEHICLENUMBER":"MH43AJ411","CUSTOMERID":20000000}]}
{"ACCOUNTNO":10003015,"VEHICLE":[{"VEHICLENUMBER":"MH12GZ3392","CUSTOMERID":20000002}]}
{"ACCOUNTNO":10003016,"VEHICLE":[{"VEHICLENUMBER":"GJ15Z8173","CUSTOMERID":20000003},{"VEHICLENUMBER":"GJ15Z8173","CUSTOMERID":20000003},{"VEHICLENUMBER":"GJ15Z8173","CUSTOMERID":20000003},{"VEHICLENUMBER":"GJ15Z8173","CUSTOMERID":20000003},{"VEHICLENUMBER":"GJ15Z8173","CUSTOMERID":20000003}]}
{"ACCOUNTNO":10003018,"VEHICLE":[{"VEHICLENUMBER":"MH05AM902","CUSTOMERID":20000004},{"VEHICLENUMBER":"MH05AM902","CUSTOMERID":20000004},{"VEHICLENUMBER":"MH05AM902","CUSTOMERID":20000004},{"VEHICLENUMBER":"MH05AM902","CUSTOMERID":20000004},{"VEHICLENUMBER":"MH05AM902","CUSTOMERID":20000004}]}
{"ACCOUNTNO":10003019,"VEHICLE":[{"VEHICLENUMBER":"GJ15CF7747","CUSTOMERID":20009020},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CA7837","CUSTOMERID":20001223},{"VEHICLENUMBER":"MH02DG7774","CUSTOMERID":20000933},{"VEHICLENUMBER":"GJ15CD7387","CUSTOMERID":20029961},{"VEHICLENUMBER":"GJ15CF7747","CUSTOMERID":20009020},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"MH02DG7774","CUSTOMERID":20000933},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"GJ15CD7387","CUSTOMERID":20029961},{"VEHICLENUMBER":"GJ15CD7387","CUSTOMERID":20029961},{"VEHICLENUMBER":"GJ15CD7387","CUSTOMERID":20029961},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CF7747","CUSTOMERID":20009020},{"VEHICLENUMBER":"GJ15CF7747","CUSTOMERID":20009020},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CD7657","CUSTOMERID":20001866},{"VEHICLENUMBER":"GJ15CA7387","CUSTOMERID":20001865},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CD7477","CUSTOMERID":20001690},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CB9601","CUSTOMERID":20001557},{"VEHICLENUMBER":"GJ15CA7837","CUSTOMERID":20001223},{"VEHICLENUMBER":"GJ15CA7837","CUSTOMERID":20001223},{"VEHICLENUMBER":"MH02DG7774","CUSTOMERID":20000933},{"VEHICLENUMBER":"GJ15CB727","CUSTOMERID":20000008},{"VEHICLENUMBER":"MH02BY7774","CUSTOMERID":20000005}]}
{"ACCOUNTNO":10003020,"VEHICLE":[{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006},{"VEHICLENUMBER":"MH01AX5658","CUSTOMERID":20000006}]}
{"ACCOUNTNO":10003021,"VEHICLE":[{"VEHICLENUMBER":"GJ15AD727","CUSTOMERID":20000007}]}
As we can see, the same VEHICLENUMBER appears multiple times in the list.
How can I remove these duplicate values from the list?
Please help! Thank you.
In the input table: ACCOUNTNO is unique, the same ACCOUNTNO may have more than one VEHICLENUMBER, and for each vehicle there may be a unique CUSTOMERID associated with that VEHICLENUMBER.
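For reference, one possible way to keep each (VEHICLENUMBER, CUSTOMERID) pair only once per account is to drop the repeated rows before aggregating. The following is only a minimal sketch, not a confirmed fix: it assumes Spark 2.x or later (`SparkSession`), the object name `DedupVehicles` and the helper `nestVehicles` are hypothetical, and the sample rows are copied from the table above with a few repeats added to mimic the duplicated source data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{collect_list, struct}

object DedupVehicles {
  // Hypothetical helper (not part of the original program): removes repeated
  // (ACCOUNTNO, VEHICLENUMBER, CUSTOMERID) rows, then nests the remaining
  // pairs into one VEHICLE array per account.
  def nestVehicles(df: DataFrame): DataFrame =
    df.select("ACCOUNTNO", "VEHICLENUMBER", "CUSTOMERID")
      .distinct() // deduplicate before grouping
      .groupBy("ACCOUNTNO")
      .agg(collect_list(struct("VEHICLENUMBER", "CUSTOMERID")).as("VEHICLE"))
      .orderBy("ACCOUNTNO")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration only.
    val spark = SparkSession.builder()
      .appName("dedup-vehicles")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the table above; repeats added on purpose.
    val sample = Seq(
      (10003014L, "MH43AJ411", 20000000L),
      (10003014L, "MH43AJ411", 20000001L),
      (10003014L, "MH43AJ411", 20000001L),
      (10003015L, "MH12GZ3392", 20000002L),
      (10003016L, "GJ15Z8173", 20000003L),
      (10003016L, "GJ15Z8173", 20000003L)
    ).toDF("ACCOUNTNO", "VEHICLENUMBER", "CUSTOMERID")

    nestVehicles(sample).show(truncate = false)
    spark.stop()
  }
}
```

With the DataFrame from the program above, this would be used as `nestVehicles(jdbcDF).coalesce(1).write.json("D:/res06")`. The same idea can be expressed in the SQL string by grouping over a `SELECT DISTINCT ACCOUNTNO, VEHICLENUMBER, CUSTOMERID` subquery instead of the raw table. Note that `write.json` emits one JSON object per line rather than a single enclosing object.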