I am trying to read a JSON file with pandas.
import pandas as pd
df = pd.read_json('https://data.gov.in/node/305681/datastore/export/json')
I get a ValueError:
ValueError: arrays must all be same length
Some other JSON pages give this error instead:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
How can I read the values anyway? I am not particularly concerned about data validity.
Answer 0 (score: 9)
Looking at the JSON, it is valid, but the content is nested under two top-level keys, data and fields:
import json
import requests
In [11]: d = json.loads(requests.get('https://data.gov.in/node/305681/datastore/export/json').text)
In [12]: list(d.keys())
Out[12]: ['data', 'fields']
You want data as the content and the labels in fields as the column names:
In [13]: pd.DataFrame(d["data"], columns=[x["label"] for x in d["fields"]])
Out[13]:
S. No. States/UTs 2008-09 2009-10 2010-11 2011-12 2012-13
0 1 Andhra Pradesh 183446.36 193958.45 201277.09 212103.27 222973.83
1 2 Arunachal Pradesh 360.5 380.15 407.42 419 438.69
2 3 Assam 4658.93 4671.22 4707.31 4705 4709.58
3 4 Bihar 10740.43 11001.77 7446.08 7552 8371.86
4 5 Chhattisgarh 9737.92 10520.01 12454.34 12984.44 13704.06
5 6 Goa 148.61 148 149 149.45 457.87
6 7 Gujarat 12675.35 12761.98 13269.23 14269.19 14558.39
7 8 Haryana 38149.81 38453.06 39644.17 41141.91 42342.66
8 9 Himachal Pradesh 977.3 1000.26 1020.62 1049.66 1069.39
9 10 Jammu and Kashmir 7208.26 7242.01 7725.19 6519.8 6715.41
10 11 Jharkhand 3994.77 3924.73 4153.16 4313.22 4238.95
11 12 Karnataka 23687.61 29094.3 30674.18 34698.77 36773.33
12 13 Kerala 15094.54 16329.52 16856.02 17048.89 22375.28
13 14 Madhya Pradesh 6712.6 7075.48 7577.23 7971.53 8710.78
14 15 Maharashtra 35502.28 38640.12 42245.1 43860.99 45661.07
15 16 Manipur 1105.25 1119 1137.05 1149.17 1162.19
16 17 Meghalaya 994.52 999.47 1010.77 1021.14 1028.18
17 18 Mizoram 411.14 370.92 387.32 349.33 352.02
18 19 Nagaland 831.92 833.5 802.03 703.65 617.98
19 20 Odisha 19940.15 23193.01 23570.78 23006.87 23229.84
20 21 Punjab 36789.7 32828.13 35449.01 36030 37911.01
21 22 Rajasthan 6449.17 6713.38 6696.92 9605.43 10334.9
22 23 Sikkim 136.51 136.07 139.83 146.24 146
23 24 Tamil Nadu 88097.59 108475.73 115137.14 118518.45 119333.55
24 25 Tripura 1388.41 1442.39 1569.45 1650 1565.17
25 26 Uttar Pradesh 10139.8 10596.17 10990.72 16075.42 17073.67
26 27 Uttarakhand 1961.81 2535.77 2613.81 2711.96 3079.14
27 28 West Bengal 33055.7 36977.96 39939.32 43432.71 47114.91
28 29 Andaman and Nicobar Islands 617.58 657.44 671.78 780 741.32
29 30 Chandigarh 272.88 248.53 180.06 180.56 170.27
30 31 Dadra and Nagar Haveli 70.66 70.71 70.28 73 73
31 32 Daman and Diu 18.83 18.9 18.81 19.67 20
32 33 Delhi 1.17 1.17 1.17 1.23 NA
33 34 Lakshadweep 134.64 138.22 137.98 139.86 139.99
34 35 Puducherry 111.69 112.84 113.53 116 112.89
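The same construction can be tried offline. The sketch below uses a small hypothetical payload (the two rows and four columns are made up to mirror the data/fields shape above, and it assumes the values arrive as strings) and then converts the year columns to numbers, so that entries such as NA become NaN:

```python
import pandas as pd

# Hypothetical payload with the same two top-level keys as the real response.
# Assumption: every cell is delivered as a string.
payload = {
    "fields": [
        {"label": "S. No."},
        {"label": "States/UTs"},
        {"label": "2011-12"},
        {"label": "2012-13"},
    ],
    "data": [
        ["1", "Andhra Pradesh", "212103.27", "222973.83"],
        ["33", "Delhi", "1.23", "NA"],
    ],
}

# Column names come from the "label" entry of each field.
labels = [field["label"] for field in payload["fields"]]
df = pd.DataFrame(payload["data"], columns=labels)

# Convert the year columns to floats; unparseable cells ("NA") become NaN.
year_cols = labels[2:]
df[year_cols] = df[year_cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
```

Using errors="coerce" matches the question's stated indifference to data validity; drop it if bad cells should raise instead.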
For more complex JSON-to-DataFrame extraction, see json_normalize.
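As a rough illustration of what json_normalize does, here is a minimal sketch on hypothetical records (the id and owner fields are invented for the example). It assumes pandas 1.0 or later, where the function is available as pd.json_normalize:

```python
import pandas as pd

# Hypothetical nested records; each "owner" value is itself a dict.
records = [
    {"id": 1, "owner": {"login": "alice", "type": "User"}},
    {"id": 2, "owner": {"login": "bob", "type": "Organization"}},
]

# Nested keys are flattened into dotted column names.
flat = pd.json_normalize(records)

print(flat.columns.tolist())  # ['id', 'owner.login', 'owner.type']
```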
Answer 1 (score: 2)
The following lists the key/value pairs of the response:
import json
import pandas as pd
import requests
# Parse the JSON response body into a plain dict
repo = json.loads(requests.get('https://api.github.com/repos/akkhil2012/MachineLearning').text)
# orient='index' turns each key into a row label
data = pd.DataFrame.from_dict(repo, orient='index')
print(data)
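To see the resulting shape without a network call, here is a small sketch on a hypothetical flat dict (the keys and values are invented and contain only scalars, unlike a real API response, which may hold nested objects):

```python
import pandas as pd

# Hypothetical flat key/value pairs standing in for a parsed JSON object.
repo = {"name": "example-repo", "private": False, "forks": 3}

# orient="index" makes each key a row; columns= names the single value column.
data = pd.DataFrame.from_dict(repo, orient="index", columns=["value"])

print(data)
```

Each key becomes one row, which is convenient for inspecting a single JSON object but less so for tabular data.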
Answer 2 (score: 1)
In this case, we can create the DataFrame by running:
import pandas as pd
# assumes data already holds the parsed JSON dict
df = pd.DataFrame(data["data"])