I have a fairly large data file in .json format that I want to process. Its format is as follows, essentially many JSON objects together:
[
{
"_id" : "...",
"idSession" : "...",
"createdAt" : "1526894989268",
"status" : "COMPLETE",
"raw" : "Bobsguide,Marketing Assistant,Sales / Marketing79642,Baitshepi,,etc",
"updatedAt" : "...",
"graphResults" : [
[
"lastName",
"stock"
],
[
"country",
"Botswana"
],
[
"location",
"Botswana "
],
[
"city",
"-"
],
[
"state",
"-"
],
[
"school",
"Heriot-Watt University"
],
[
"skills",
"Budgeting,Business Process Improvement,Business Planning"
],
],
"eid" : {
"###" : "12020653-1889-35be-8009-b1c9d43768ac"
}
}
{
"_id" : "...",
"idSession" : "...",
"createdAt" : "1526894989268",
"status" : "COMPLETE",
"raw" : "Bobsguide,79619,Steven,example,steven.jones@example.com,Marketing Assistant,Sales,,etc",
"updatedAt" : "...",
"graphResults" : [
[
"country",
"United Kingdom"
],
[
"location",
"United Kingdom London London"
],
[
"city",
"London"
],
[
"state",
"London"
],
[
"skills",
"Solvency II,Liquidity Risk,Screening,etc"
]
],
"eid" : {
"###" : "..."
}
}
...
]
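Is there a straightforward way to read this into a Python script for manipulation/analysis? The main parts of interest are under the graphResults and raw keys. I have no experience with this kind of raw data format, so any help is much appreciated.
Answer 0 (score: 0)
First, the data you posted is not valid JSON: the objects in the array are not separated by commas, and there is a trailing comma after the last graphResults pair of the first object. It should look something like the following, and to access the elements you mentioned you can try the code below.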
{
"test":[
{
"_id" : "...",
"idSession" : "...",
"createdAt" : "1526894989268",
"status" : "COMPLETE",
"raw" : "Bobsguide,Marketing Assistant,Sales / Marketing79642,Baitshepi,,etc",
"updatedAt" : "...",
"graphResults" : [
[
"lastName",
"stock"
],
[
"country",
"Botswana"
],
[
"location",
"Botswana "
],
[
"city",
"-"
],
[
"state",
"-"
],
[
"school",
"Heriot-Watt University"
],
[
"skills",
"Budgeting,Business Process Improvement,Business Planning"
]
],
"eid" : {
"###" : "12020653-1889-35be-8009-b1c9d43768ac"
}
},
{
"_id" : "...",
"idSession" : "...",
"createdAt" : "1526894989268",
"status" : "COMPLETE",
"raw" : "Bobsguide,79619,Steven,example,steven.jones@example.com,Marketing Assistant,Sales,,etc",
"updatedAt" : "...",
"graphResults" : [
[
"country",
"United Kingdom"
],
[
"location",
"United Kingdom London London"
],
[
"city",
"London"
],
[
"state",
"London"
],
[
"skills",
"Solvency II,Liquidity Risk,Screening,etc"
]
],
"eid" : {
"###" : "..."
}
}
]
}
# Answer
import json

# Open the file and parse it; json.load returns ordinary Python dicts and lists
with open('data.json', 'r') as data_file:
    information = json.load(data_file)

# Pick element 1 of the 'test' array, then print the value under the 'raw' key
print(information['test'][1]['raw'])
# Pick element 1 of the 'test' array, then print the value under the 'graphResults' key
print(information['test'][1]['graphResults'])
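Since the question asks about working with the raw and graphResults values rather than just printing them, here is a minimal sketch of one way to make each record easier to analyze. It assumes the corrected structure above (a top-level "test" key holding the list) and that every graphResults entry is a two-item [name, value] pair. The inline `sample` string and the `summarize` helper are hypothetical names introduced only for illustration; in practice you would load the real file with json.load instead.

```python
import json

# Hypothetical, shortened sample mirroring the corrected structure above.
sample = '''
{"test": [
  {"raw": "Bobsguide,Marketing Assistant,Sales",
   "graphResults": [["country", "Botswana"], ["city", "-"]]},
  {"raw": "Bobsguide,79619,Steven",
   "graphResults": [["country", "United Kingdom"], ["city", "London"]]}
]}
'''


def summarize(record):
    """Return (raw fields as a list, graphResults as a dict) for one record."""
    # Naive split: assumes no individual field contains a comma itself.
    raw_fields = record.get("raw", "").split(",")
    # Each graphResults entry is assumed to be a [name, value] pair;
    # if a name repeats, the later value overwrites the earlier one.
    graph = {name: value for name, value in record.get("graphResults", [])}
    return raw_fields, graph


information = json.loads(sample)
for record in information["test"]:
    raw_fields, graph = summarize(record)
    print(raw_fields[0], graph.get("country"))
```

Turning the pairs into a dict lets you look up a field by name (for example graph.get("school")) without caring about the order of the pairs, and .get returns None for records that lack that field. Note that json.load also accepts a file whose top level is a plain array, so if the file is kept as a list without the "test" wrapper, you would loop over `information` directly.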