have_df = pd.DataFrame({'User':['101','101','101'],'json_text':["""{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}""","""{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}""",'blank']})
I want to export this DataFrame to a pipe-delimited file and tried the following:
have_df.to_csv('have_df.csv',sep="|")
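When I open the pipe-delimited file, the JSON text values are wrapped in an extra pair of double quotes, and every embedded double quote is doubled: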
"{""president"":{""name"": ""Zaphod Beeblebrox"",""species"": ""Betelgeusian""}}"
How can I remove these extra double quotes programmatically? Thanks.
Answer 0 (score: 1)
Your json_text column holds JSON strings. Converting them with json.loads before writing the file should solve your problem.
For example:
import json
import pandas as pd

def converttojson(val):
    """Parse val as JSON; return it unchanged if it is not valid JSON."""
    try:
        return json.loads(val)
    except (TypeError, ValueError):
        # 'blank' is not valid JSON, so keep the original value
        return val

have_df = pd.DataFrame({'User':['101','101','101'],'json_text':["""{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}""","""{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}""",'blank']})
have_df["json_text"] = have_df["json_text"].apply(converttojson)
have_df.to_csv('have_df.csv', sep="|")
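You could also call have_df["json_text"].apply(json.loads) directly, but I use converttojson because your sample data contains blank, which is not valid JSON and would make json.loads raise an error.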
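To make the effect of this conversion concrete, here is a minimal standard-library sketch of what happens to one cell. It assumes that the CSV writer falls back to str() for non-string cells such as dicts; the variable names are illustrative only and not part of the answer above.

```python
import json

raw = '{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}'

# json.loads turns the JSON text into a Python dict.
parsed = json.loads(raw)

# Assumption: a dict cell is written via str(), which gives the Python
# repr with single quotes. No double quotes remain, so nothing needs
# CSV quoting, but the text is no longer valid JSON.
as_repr = str(parsed)

# json.dumps converts the dict back to valid JSON (double quotes again).
as_json = json.dumps(parsed)

print(as_repr)
print(as_json)
```

So the extra quotes disappear because the double quotes themselves are gone. If the file must still contain valid JSON, this approach may not be what you want.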
Answer 1 (score: 1)
First evaluate your JSON data, then save it to CSV:
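import ast

# ast.literal_eval parses Python literals; that works for this data,
# but JSON containing true, false or null would fail to parse.
(have_df.json_text
 .replace('blank', "None")
 .apply(ast.literal_eval)
 # header=False keeps recent pandas versions from adding a header row
 .to_csv('file.csv', sep='|', header=False)
)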
Contents of file.csv:
0|{'president': {'name': 'Zaphod Beeblebrox', 'species': 'Betelgeusian'}}
1|{'president': {'name': 'Zaphod Beeblebrox', 'species': 'Betelgeusian'}}
2|
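For comparison, the sketch below uses only Python's csv module to show where the doubled quotes come from and one way to write the JSON text unchanged. It is an illustration under assumptions, not a tested pandas recipe: the helper to_pipe_text and the sample rows are made up for this example. pandas' to_csv accepts a quoting argument that takes the same csv constants, so quoting=csv.QUOTE_NONE should behave similarly there, but that is not verified here.

```python
import csv
import io

rows = [
    ["101", '{"president":{"name": "Zaphod Beeblebrox","species": "Betelgeusian"}}'],
    ["101", "blank"],
]


def to_pipe_text(rows, **writer_kwargs):
    """Write rows as pipe-delimited text and return the result as a string."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="|", lineterminator="\n", **writer_kwargs)
    writer.writerows(rows)
    return buf.getvalue()


# Default quoting (QUOTE_MINIMAL): a field containing a double quote is
# wrapped in quotes and each embedded quote is doubled.
quoted = to_pipe_text(rows)

# QUOTE_NONE without a quote character: fields are written verbatim.
# This is only safe while no field contains the delimiter or a newline.
verbatim = to_pipe_text(rows, quoting=csv.QUOTE_NONE, quotechar=None)

# The default quoting is lossless: a CSV parser restores the original text.
round_trip = list(csv.reader(io.StringIO(quoted), delimiter="|"))

print(quoted)
print(verbatim)
```

The doubled quotes are standard CSV escaping, so a consumer that reads the file with a CSV parser gets the original JSON back. Disabling quoting only makes sense when the consumer splits lines on the pipe character without CSV parsing.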