用python解析特殊的json格式

时间:2015-02-15 00:21:19

标签: python json

我想获得GPSLatitude和GPSLongitude值,但我不能使用python位置因为位置非常随机。我按标签的值获取值,我该怎么做?

jsonFlickrApi({ "photo": { "id": "8566959299", "secret": "141af38562", "server": "8233", "farm": 9, "camera": "Apple iPhone 4S", 
    "exif": [
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "JFIFVersion", "label": "JFIFVersion", 
        "raw": { "_content": 1.01 } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "ResolutionUnit", "label": "Resolution Unit", 
        "raw": { "_content": "inches" } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "XResolution", "label": "X-Resolution", 
        "raw": { "_content": 72 }, 
        "clean": { "_content": "72 dpi" } },
      { "tagspace": "JFIF", "tagspaceid": 0, "tag": "YResolution", "label": "Y-Resolution", 
        "raw": { "_content": 72 }, 
        "clean": { "_content": "72 dpi" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitudeRef", "label": "GPS Latitude Ref", 
        "raw": { "_content": "North" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitude", "label": "GPS Latitude", 
        "raw": { "_content": "39 deg 56' 44.40\"" }, 
        "clean": { "_content": "39 deg 56' 44.40\" N" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitudeRef", "label": "GPS Longitude Ref", 
        "raw": { "_content": "East" } },
      { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitude", "label": "GPS Longitude", 
        "raw": { "_content": "116 deg 16' 10.20\"" }, 
        "clean": { "_content": "116 deg 16' 10.20\" E" } },
    ] }, "stat": "ok" })

3 个答案:

答案 0 :(得分:1)

您没有说您是否使用其中一个Flickr API;我假设不是因为如果您使用flickrapi之类的API,处理JSON响应是微不足道的。

import flickrapi

api_key = '88341066e8f0a40516599d28d8170627'   # from flickr's API explorer
secret = 'sssshhhh'
flickr = flickrapi.FlickrAPI(api_key, secret, format='parsed-json')
response = flickr.photos.getExif(photo_id='8566959299')
lat_long = {exif['tag']: exif['clean']['_content']
                    for exif in response['photo']['exif']
                        if exif['tag'] in (u'GPSLongitude', u'GPSLatitude')}

>>> from pprint import pprint
>>> pprint(lat_long)
{u'GPSLatitude': u'39 deg 56\' 44.40" N',
 u'GPSLongitude': u'116 deg 16\' 10.20" E'}

但继续假设您没有使用API​​,您看到的响应格式实际上是JSONP,它更适合Javascript而不是Python。但是,您可以在JSON表示中请求没有封闭jsonFlickrApi()函数包装器的响应。通过在请求的查询参数中指定format=json&nojsoncallback=1来执行此操作。使用requests库可以轻松地请求和解析JSON响应,但如果您不能使用urllib2.urlopen(),那么这与json.loads()结合requests的效果相同,例如< / p>

import requests

params = {'api_key': '88341066e8f0a40516599d28d8170627',
          'api_sig': '7b2dcfb2cd3a747179c2ed0fdc492699',
          'format': 'json',
          'method': 'flickr.photos.getExif',
          'nojsoncallback': '1',
          'photo_id': '8566959299',
          'secret': 'sssshhhh'}    
response = requests.get('https://api.flickr.com/services/rest/', params=params)
data = response.json()
lat_long = {exif['tag']: exif['clean']['_content']
                for exif in data['photo']['exif']
                    if exif['tag'] in (u'GPSLongitude', u'GPSLatitude')}

>>> from pprint import pprint
>>> pprint(lat_long)
{u'GPSLatitude': u'39 deg 56\' 44.40" N',
 u'GPSLongitude': u'116 deg 16\' 10.20" E'}

答案 1 :(得分:0)

如果将整个字符串视为jsonFlickrApi(XXX)XXX是标准的JSON字符串。使用json库,XXX可以转换为python字典,然后轻松解析。

答案 2 :(得分:0)

除了在右括号]之前的最后一个逗号之外,FlickrAPI返回的整个对象是valid json

假设该逗号只是一个复制粘贴错误(example evidence建议就是这种情况),那么内置json module仍然无法按原样使用。这是因为即使像"116 deg 16' 10.20\" E"之类的字符串是valid json,python的json模块也会抱怨ValueError,因为双引号"没有被充分引用:

>>> import json
>>> json.loads('{"a": "2"}')
{u'a': u'2'}
>>> json.loads('{"a": "2\""}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 10 (char 9)

解决方案是添加另一个转义反斜杠:

>>> json.loads('{"a": "2\\""}')
{u'a': u'2"'}

对于完整的jsonFlickrApi响应,您可以使用re module添加这些额外的反斜杠:

>>> import re
>>> response = """jsonFlickrApi({ "photo": { "id": "8566959299", "secret": "141af38562", "server": "8233", "farm": 9, "camera": "Apple iPhone 4S", 
...     "exif": [
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "JFIFVersion", "label": "JFIFVersion", 
...         "raw": { "_content": 1.01 } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "ResolutionUnit", "label": "Resolution Unit", 
...         "raw": { "_content": "inches" } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "XResolution", "label": "X-Resolution", 
...         "raw": { "_content": 72 }, 
...         "clean": { "_content": "72 dpi" } },
...       { "tagspace": "JFIF", "tagspaceid": 0, "tag": "YResolution", "label": "Y-Resolution", 
...         "raw": { "_content": 72 }, 
...         "clean": { "_content": "72 dpi" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitudeRef", "label": "GPS Latitude Ref", 
...         "raw": { "_content": "North" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLatitude", "label": "GPS Latitude", 
...         "raw": { "_content": "39 deg 56' 44.40\"" }, 
...         "clean": { "_content": "39 deg 56' 44.40\" N" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitudeRef", "label": "GPS Longitude Ref", 
...         "raw": { "_content": "East" } },
...       { "tagspace": "GPS", "tagspaceid": 0, "tag": "GPSLongitude", "label": "GPS Longitude", 
...         "raw": { "_content": "116 deg 16' 10.20\"" }, 
...         "clean": { "_content": "116 deg 16' 10.20\" E" } }
...     ] }, "stat": "ok" })"""
>>> quoted_resp = re.sub('deg ([^"]+)"', r'deg \1\\"', response[14:-1])

然后可以在调用json.loads时使用引用的响应,然后您可以轻松地在新生成的字典结构中访问所需的数据:

>>> photodict = json.loads(quoted_resp)
>>> for meta in photodict['photo']['exif']:                                                                                                               
...     if meta["tagspace"] == "GPS" and meta["tag"] == "GPSLongitude":
...         print(meta["clean"]["_content"])
... 
116 deg 16' 10.20" E