I have a dataframe like the one below, in which one column contains a nested list of dictionaries:
import pandas as pd
data = {'First': ['First value', 'Second value'],
'Second': ['First value', 'Second value'],
'third': ['First value', 'Second value'],
'forth': ['[{"values": "","entity": "datetime","","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]','[{"values": "","entity": "datetime","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]'],
}
df = pd.DataFrame (data, columns = ['First','second','third','forth'])
I want to convert it to the following JSON format and save it:
[
{
"first": "",
"second": "",
"third": "",
"forth": [
{
"values": "",
"entity": "",
"TIMEX3": [
{
"expression": "",
"tid": "",
"type": "",
"value": "",
"mod": "",
"anchorTimeID": "",
"beginPoint": "",
"endPoint": ""
}
]
}
]
},...
I tried the following, but the output is too messy and does not look like the output I want to save:
my_json = (df.groupby(['text','intent','domain'], as_index=False)
.apply(lambda x: x[['entities']].to_dict('r'))
.reset_index()
.to_json(orient='records',indent= 2))
Answer 0 (score: 1)
I believe you are not far from the format you want. The only problem is that the column 'forth' holds its dictionaries as strings. One possible approach is to convert everything back to dictionaries, using eval to turn each string back into a list of dictionaries, and then pretty-print the result with the json module:
import pandas as pd
import json
data = {'First': ['First value', 'Second value'],
'Second': ['First value', 'Second value'],
'third': ['First value', 'Second value'],
'forth': ['[{"values": "","entity": "datetime","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]','[{"values": "","entity": "datetime","Turn": [{"expression": "","tid": "","type": "", "value": "","mod": "","anchor": "","beginPoint": "","endPoint": ""}]}]'],
}
df = pd.DataFrame (data, columns = ['First','Second','third','forth'])
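# Convert each row to a plain dict so the nested column can be replaced.
my_dict = df.to_dict(orient='records')
for row in my_dict:
    # eval turns the string back into a list of dicts; use it only on trusted input.
    row['forth'] = eval(row['forth'])
my_json = json.dumps(my_dict, indent=2)
print(my_json)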
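The code above includes two small corrections to the original: the capitalization of the key 'Second' in the columns list, and the invalid entry , "", removed from the string under the key 'forth'.
Here is a copy of my output:
[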
{
"First": "First value",
"Second": "First value",
"third": "First value",
"forth": [
{
"values": "",
"entity": "datetime",
"Turn": [
{
"expression": "",
"tid": "",
"type": "",
"value": "",
"mod": "",
"anchor": "",
"beginPoint": "",
"endPoint": ""
}
]
}
]
}, ...
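As a variation on the eval step, the strings in 'forth' are valid JSON once the stray entry is removed, so json.loads can parse them without evaluating arbitrary code. The following is only a minimal sketch of that substitution, not the method from the answer; the nested string is shortened to a few fields for brevity, and the names nested and records are made up for this example.

```python
import json
import pandas as pd

# Shortened version of the corrected 'forth' string (an assumption made for brevity).
nested = '[{"values": "", "entity": "datetime", "Turn": [{"expression": "", "tid": ""}]}]'

df = pd.DataFrame({
    'First': ['First value', 'Second value'],
    'Second': ['First value', 'Second value'],
    'third': ['First value', 'Second value'],
    'forth': [nested, nested],
})

# One plain dict per row, so the string column can be replaced in place.
records = df.to_dict(orient='records')
for row in records:
    # json.loads parses the JSON text into Python lists and dicts.
    row['forth'] = json.loads(row['forth'])

my_json = json.dumps(records, indent=2)
print(my_json)
```

json.loads raises json.JSONDecodeError on malformed text such as the original , "", entry, which makes bad rows easy to spot.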
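If the column 'forth' already held dictionaries in the dataframe, you could call to_json directly and the format would be what you need. For example, you can try converting the corrected my_dict back to a dataframe and calling to_json on it.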
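The code that originally followed this sentence is not available, so what follows is only a sketch of what such a round trip might look like. It assumes my_dict has the shape produced earlier (shortened here to a few keys) and a pandas version whose to_json accepts indent; the name df2 and the file name in the final comment are hypothetical.

```python
import json
import pandas as pd

# Stand-in for the corrected my_dict, shortened to a few keys (an assumption).
my_dict = [
    {'First': 'First value',
     'forth': [{'values': '', 'entity': 'datetime', 'Turn': [{'expression': '', 'tid': ''}]}]},
    {'First': 'Second value',
     'forth': [{'values': '', 'entity': 'datetime', 'Turn': [{'expression': '', 'tid': ''}]}]},
]

# The 'forth' column now holds real lists of dicts rather than strings.
df2 = pd.DataFrame(my_dict)

# With real nested objects in the column, to_json emits nested JSON directly.
my_json = df2.to_json(orient='records', indent=2)
print(my_json)

# Passing a path as the first argument would write the same text to a file, e.g.
# df2.to_json('output.json', orient='records', indent=2)  # hypothetical file name
```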