tx = 'a,b,c,"[""d"", ""e""]""'
with open('temp.csv', 'wt') as f:
    f.write(tx)
sparkSession.read.csv('temp.csv', quote='"').show()
+---+---+---+-------+---------+
|_c0|_c1|_c2|    _c3|      _c4|
+---+---+---+-------+---------+
|  a|  b|  c|"[""d""| ""e""]""|
+---+---+---+-------+---------+
Desired output:
+---+---+---+-------------------+
|_c0|_c1|_c2|        _c3        |
+---+---+---+-------------------+
|  a|  b|  c|"[""d"", ""e""]""  |
+---+---+---+-------------------+
Answer 0 (score: 0)
I'm not familiar with PySpark, but the quoting looks wrong (there is one quote too many). It should be:
'a,b,c,"[""d"", ""e""]"'
The output should then be:
+---+---+---+-------------------+
|_c0|_c1|_c2|        _c3        |
+---+---+---+-------------------+
|  a|  b|  c|["d", "e"]         |
+---+---+---+-------------------+
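As a quick sanity check outside Spark, the corrected line can be run through Python's standard `csv` module, which treats a doubled quote inside a quoted field as one literal quote. This is only a sketch of the quoting rule, not of Spark's own parser, and the variable name `fields` is just for illustration:

```python
import csv
import io

# Corrected line: the quoted field is closed by a single '"'.
tx = 'a,b,c,"[""d"", ""e""]"'

# csv.reader collapses each "" inside a quoted field into one literal ".
fields = next(csv.reader(io.StringIO(tx), quotechar='"'))
print(fields)  # ['a', 'b', 'c', '["d", "e"]']
```

Spark's CSV reader may behave differently here: its default escape character is a backslash rather than a quote, so passing `escape='"'` alongside `quote='"'` to `read.csv` may be needed to get the same four columns. I have not verified that against the file above.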