Python - Extracting multiple JSON values that share the same key

Time: 2019-07-12 13:48:51

Tags: python json

I am loading a JSON file that contains proxies.

My JSON object:

 {  
   "http":{  
      "http://":"64.90.50.38:45876/",
      "http://":"89.250.220.40:54687/",
      "http://":"89.207.92.146:37766/",
      "http://":"89.23.194.174:8080/",
      "http://":"82.208.111.100:52480/"
   }
}

I want to access every proxy, but I keep getting only the last one: "http://": "82.208.111.100:52480/"

My code:

import json

# Open the file in a context manager so it is closed after loading.
with open('proxy.json', 'r') as x:
    data = json.load(x)
print(data['http'])

My question is: how can I access these values when they all share the same key?

4 answers:

Answer 0: (score: 1)

Python dictionaries cannot have duplicate keys. If a dictionary definition contains duplicate keys, the last key:value pair is used.

json.load converts the JSON object into a Python dictionary, which is why only the last proxy is left.
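The overwrite behaviour is easy to reproduce. The following is a minimal sketch using two of the proxy entries from the question; the variable names `d` and `parsed` are only illustrative.

```python
import json

# A dict literal with a repeated key keeps only the last value.
d = {"http://": "64.90.50.38:45876/", "http://": "82.208.111.100:52480/"}
print(d)       # {'http://': '82.208.111.100:52480/'}
print(len(d))  # 1

# json.loads builds a dict the same way, so earlier values are overwritten.
parsed = json.loads(
    '{"http://": "64.90.50.38:45876/", "http://": "82.208.111.100:52480/"}'
)
print(parsed == d)  # True
```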

Answer 1: (score: 1)

The usual approach is to use a data structure that supports holding multiple values. For example:

{  
   "http": [
      "64.90.50.38:45876",
      "89.250.220.40:54687",
      "89.207.92.146:37766",
      "89.23.194.174:8080",
      "82.208.111.100:52480"
   ]
}

Your code would then print ["64.90.50.38:45876", "89.250.220.40:54687", ...]

As another example, Django has MultiValueDict, which you could load the data into to get a nicer API. Source here

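Answer 2: (score: 1)

As per the documentation:

The RFC specifies that the names within a JSON object should be unique, but does not mandate how repeated names in JSON objects should be handled. By default, this module does not raise an exception; instead, it ignores all but the last name-value pair for a given name: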

>>> weird_json = '{"x": 1, "x": 2, "x": 3}'
>>> json.loads(weird_json)
{'x': 3}

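The object_pairs_hook parameter can be used to alter this behavior.

The documentation also states that object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, object_pairs_hook takes priority.

For example: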

Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 19:29:22) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> from collections import defaultdict
>>> from pprint import pprint
>>>
>>> s = """
... {
...    "http":{
...       "http://":"64.90.50.38:45876/",
...       "http://":"89.250.220.40:54687/",
...       "http://":"89.207.92.146:37766/",
...       "http://":"89.23.194.174:8080/",
...       "http://":"82.208.111.100:52480/"
...    }
... }
... """
>>>
>>> def custom_hook(obj):
...     # Identify dictionary with duplicate keys...
...     # If found create a separate dict with single key and val and as list.
...     if len(obj) > 1 and len(set(i for i, j in obj)) == 1:
...         data_dict = defaultdict(list)
...         for i, j in obj:
...             data_dict[i].append(j)
...         return dict(data_dict)
...     return dict(obj)
...
>>> data = json.loads(s, object_pairs_hook=custom_hook)
>>> pprint(data)
{'http': {'http://': ['64.90.50.38:45876/',
                      '89.250.220.40:54687/',
                      '89.207.92.146:37766/',
                      '89.23.194.174:8080/',
                      '82.208.111.100:52480/']}}
>>>
>>> pprint(data['http'])
{'http://': ['64.90.50.38:45876/',
             '89.250.220.40:54687/',
             '89.207.92.146:37766/',
             '89.23.194.174:8080/',
             '82.208.111.100:52480/']}
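If only the values are needed, a lighter variant is to pass the built-in `list` as the hook, so that every JSON object is returned as a list of (key, value) tuples and no pair is dropped. This is a sketch rather than part of the original answer: the JSON is shortened to two entries, and the names `pairs`, `http_pairs` and `proxies` are illustrative.

```python
import json

# Shortened version of the question's JSON, with a repeated key.
s = '{"http": {"http://": "64.90.50.38:45876/", "http://": "89.250.220.40:54687/"}}'

# object_pairs_hook=list keeps every object as a list of (key, value) tuples.
pairs = json.loads(s, object_pairs_hook=list)
print(pairs)
# [('http', [('http://', '64.90.50.38:45876/'), ('http://', '89.250.220.40:54687/')])]

# Look up the "http" entry and collect every value stored under it.
http_pairs = dict(pairs)["http"]
proxies = [value for _key, value in http_pairs]
print(proxies)  # ['64.90.50.38:45876/', '89.250.220.40:54687/']
```

The trade-off is that nested objects are no longer dicts, so lookups by key need an explicit conversion, as in the `dict(pairs)` step.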

Answer 3: (score: 0)

I was too slow, but if you can change the JSON, why not make it a list?

{  
   "http": [
       "64.90.50.38:45876",
       "89.250.220.40:54687",
       "89.207.92.146:37766",
       "89.23.194.174:8080",
       "82.208.111.100:52480"
   ]
}

Then you can parse the data as follows:

import json

with open('proxy.json', 'r') as json_data:
    proxy_data = json.load(json_data)

And access it:

proxy_data['http'][0]

>>> '64.90.50.38:45876'
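With the list form, every proxy can be reached by iterating rather than indexing one at a time. The sketch below inlines a shortened copy of the JSON so it runs without a file. The `http://` prefix is an assumption taken from the keys in the question, and `proxy_urls` is an illustrative name.

```python
import json

# List-form JSON from this answer, inlined so the sketch runs without a file.
text = '{"http": ["64.90.50.38:45876", "89.250.220.40:54687", "89.207.92.146:37766"]}'
proxy_data = json.loads(text)

# Prepend the scheme to each address; the "http://" prefix is an assumption
# based on the keys used in the question.
proxy_urls = ["http://" + address for address in proxy_data["http"]]
for url in proxy_urls:
    print(url)
```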