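I would like to know how to read this JSON file into a Pandas DataFrame and set new headers, because my source has no headers at all. I am trying to get date, street and suburb as the column headers.
As an example, Kent Street is the street and Karawara is the suburb.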
{
    "25 March 2019": {
        "Albany Highway": ["Maddington", "Cannington"],
        "Kent Street": ["Karawara"],
        "Kitchener Road": ["Alfred Cove"],
        "Alexander Road": ["Rivervale"],
        "Kwinana Freeway": ["Wellard"],
    },
    "26 March 2019": {
        "Great Eastern Highway": ["Sawyers Valley", "Redcliffe"],
        "South Western Highway": ["Armadale", "Wungong"],
        "Great Northern Highway": ["Muchea", "Baskerville"],
        "St Thomas Primary": ["Claremont"],
        "Stirling Highway": ["Claremont"],
        "Grovelands Primary": ["Camillo"],
        "Swan View Senior High": ["Swan View"],
    }
}
The expected output is something like:
[
    {
        "date": "25 March 2019",
        "street": "Kent Street",
        "suburb": "Karawara"
    }, {
        "date": "26 March 2019",
        "street": "St Thomas Primary",
        "suburb": "Claremont"
    }
]
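Rules: the first value is always the street, and the second value is the suburb. In some cases there are two suburbs. Conceptually that would give two rows, but if that is not possible I am fine with keeping it as one row.
I found similar questions such as "Pandas read nested json", but could not find any example where the JSON file has no headers at all.
Answer 0: (score: 3)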
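If I understand correctly, you need the following.
First, read the JSON file and convert it to a dictionary:
import json
# json.load returns a nested dict: {date: {street: [suburb, ...]}}
with open('<yourFile>.json', 'r') as f:
    json_dict = json.load(f)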
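One caveat: the sample as posted has a trailing comma before some closing braces, which strict JSON does not allow, so `json.load` would raise `json.JSONDecodeError` on it. The following is only a sketch of one workaround, assuming the file cannot be fixed at the source. The helper name `load_lenient` and the regex are illustrative assumptions, and the regex would also alter any string value that happens to contain `,}` or `,]`.

```python
import json
import re


def load_lenient(text):
    """Parse JSON text after dropping commas that directly precede } or ]."""
    # Assumption: no string value in the data contains ",}" or ",]".
    cleaned = re.sub(r',\s*([}\]])', r'\1', text)
    return json.loads(cleaned)


# Shortened version of the sample, keeping the trailing commas.
sample = '''
{
  "25 March 2019": {
    "Kent Street": ["Karawara"],
    "Albany Highway": ["Maddington", "Cannington"],
  },
}
'''

data = load_lenient(sample)
print(data["25 March 2019"]["Kent Street"])
```

If the file is already valid JSON, the plain `json.load` call is enough and this step can be skipped.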
Then I assume you have this, referred to as x below (that is, x = json_dict after loading the file):
x = {
    "25 March 2019": {
        "Albany Highway": ["Maddington", "Cannington"],
        "Kent Street": ["Karawara"],
        "Kitchener Road": ["Alfred Cove"],
        "Alexander Road": ["Rivervale"],
        "Kwinana Freeway": ["Wellard"],
    },
    "26 March 2019": {
        "Great Eastern Highway": ["Sawyers Valley", "Redcliffe"],
        "South Western Highway": ["Armadale", "Wungong"],
        "Great Northern Highway": ["Muchea", "Baskerville"],
        "St Thomas Primary": ["Claremont"],
        "Stirling Highway": ["Claremont"],
        "Grovelands Primary": ["Camillo"],
        "Swan View Senior High": ["Swan View"],
    }
}
You can do this, iterating over x.items() so that each date is paired only with its own streets:
import pandas as pd
rows = [(d, subs, st) for d, inner in x.items() for st, subs in inner.items()]
df = pd.DataFrame(rows, columns=['Date', 'suburb', 'street'])
print(df)
             Date                       suburb                  street
0   25 March 2019     [Maddington, Cannington]          Albany Highway
1   25 March 2019                   [Karawara]             Kent Street
2   25 March 2019                [Alfred Cove]          Kitchener Road
3   25 March 2019                  [Rivervale]          Alexander Road
4   25 March 2019                    [Wellard]         Kwinana Freeway
5   26 March 2019  [Sawyers Valley, Redcliffe]   Great Eastern Highway
6   26 March 2019          [Armadale, Wungong]   South Western Highway
7   26 March 2019        [Muchea, Baskerville]  Great Northern Highway
8   26 March 2019                  [Claremont]       St Thomas Primary
9   26 March 2019                  [Claremont]        Stirling Highway
10  26 March 2019                    [Camillo]      Grovelands Primary
11  26 March 2019                  [Swan View]   Swan View Senior High
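The important detail is that each date must be paired only with the streets nested under it. As a rough sketch of the same flattening with explicit loops, assuming the `{date: {street: [suburbs]}}` shape shown above, the version below also joins the suburb list into plain text so that each street stays on one row, as in the desired output. The function name `to_rows` and the shortened sample are illustrative, not part of the original data or answer.

```python
import pandas as pd


def to_rows(data):
    """Flatten {date: {street: [suburb, ...]}} into one dict per street."""
    rows = []
    for date, streets in data.items():
        for street, suburbs in streets.items():
            # Join the list so the cell holds text, e.g. "Maddington, Cannington".
            rows.append({"date": date, "street": street, "suburb": ", ".join(suburbs)})
    return rows


# Shortened, illustrative sample in the same shape as the question's data.
sample = {
    "25 March 2019": {
        "Albany Highway": ["Maddington", "Cannington"],
        "Kent Street": ["Karawara"],
    },
    "26 March 2019": {"St Thomas Primary": ["Claremont"]},
}

df = pd.DataFrame(to_rows(sample))
print(df)
```

Joining keeps one row per street at the cost of losing the list structure; keep the lists instead if you plan to split them into separate rows later.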
Alternatively, you can build a list of dictionaries first:
dic = [{'date': d, 'street': st, 'suburb': subs} for d, inner in x.items() for st, subs in inner.items()]
dic
[{'date': '25 March 2019',
  'street': 'Albany Highway',
  'suburb': ['Maddington', 'Cannington']},
 {'date': '25 March 2019', 'street': 'Kent Street', 'suburb': ['Karawara']},
 {'date': '25 March 2019',
  'street': 'Kitchener Road',
  'suburb': ['Alfred Cove']},
 {'date': '25 March 2019',
  'street': 'Alexander Road',
  'suburb': ['Rivervale']},
 {'date': '25 March 2019', 'street': 'Kwinana Freeway', 'suburb': ['Wellard']},
 {'date': '26 March 2019',
  'street': 'Great Eastern Highway',
  'suburb': ['Sawyers Valley', 'Redcliffe']},
 ...
This gives a list of dictionaries. Now you can convert it to a DataFrame like this:
df = pd.DataFrame(dic)
df
             date                  street                       suburb
0   25 March 2019          Albany Highway     [Maddington, Cannington]
1   25 March 2019             Kent Street                   [Karawara]
2   25 March 2019          Kitchener Road                [Alfred Cove]
3   25 March 2019          Alexander Road                  [Rivervale]
4   25 March 2019         Kwinana Freeway                    [Wellard]
5   26 March 2019   Great Eastern Highway  [Sawyers Valley, Redcliffe]
6   26 March 2019   South Western Highway          [Armadale, Wungong]
7   26 March 2019  Great Northern Highway        [Muchea, Baskerville]
8   26 March 2019       St Thomas Primary                  [Claremont]
9   26 March 2019        Stirling Highway                  [Claremont]
10  26 March 2019      Grovelands Primary                    [Camillo]
11  26 March 2019   Swan View Senior High                  [Swan View]
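The question notes that a street with two suburbs would conceptually become two rows. A minimal sketch of that step, assuming pandas 0.25 or later (where `DataFrame.explode` is available) and a column that holds lists of suburbs; the small frame and the name `long_df` are illustrative only.

```python
import pandas as pd

# Illustrative frame: one row per street, with the suburbs held in a list.
df = pd.DataFrame({
    "date": ["25 March 2019", "25 March 2019"],
    "street": ["Albany Highway", "Kent Street"],
    "suburb": [["Maddington", "Cannington"], ["Karawara"]],
})

# explode() repeats date and street once for every element of the suburb list.
long_df = df.explode("suburb").reset_index(drop=True)
print(long_df)
```

Here Albany Highway ends up on two rows, one per suburb, while Kent Street stays on a single row.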