My code is:
import json
import urllib2
# bit.ly data path
bitly_data_path = 'http://bitly.measuredvoice.com/bitly_archive/usagov_bitly_data2012-10-31-1351662469'
file = urllib2.urlopen(bitly_data_path)
records = [json.dumps(json.loads(line)) for line in file]
type(records[0])
When I run it, I get:
Out[32]: str
When I change the last line to
type(dict(records[0]))
I get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/Users/hhimanshu/code/p/python/notebooks/<ipython-input-33-eb9c026d2b46> in <module>()
8 file = urllib2.urlopen(bitly_data_path)
9 records = [json.dumps(json.loads(line)) for line in file]
---> 10 type(dict(records[0]))
ValueError: dictionary update sequence element #0 has length 1; 2 is required
The data at the URL looks like this:
{ "a": "Mozilla\/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident\/4.0)", "c": "US", "nk": 1, "tz": "America\/Chicago", "gr": "MO", "g": "f8zxQx", "h": "OYt09h", "l": "fhah03269", "hh": "bit.ly", "r": "direct", "u": "http:\/\/portal.hud.gov\/hudportal\/HUD?src=\/i_want_to\/talk_to_a_housing_counselor", "t": 1351662469, "hc": 1350056673, "kw": "HUDcounsel", "cy": "Saint Louis", "ll": [ 38.639900, -90.183998 ] }
{ "a": "Mozilla\/5.0 (Windows NT 6.1; WOW64; rv:14.0) Gecko\/20100101 Firefox\/14.0.1", "c": "US", "nk": 1, "tz": "America\/Los_Angeles", "gr": "WA", "g": "YlMtB7", "h": "YlMtB6", "l": "twitterfeed", "al": "en-us,en;q=0.5", "hh": "1.usa.gov", "r": "http:\/\/www.facebook.com\/l.php?u=http%3A%2F%2F1.usa.gov%2FYlMtB6&h=1AQFYTwLaAQFVlA4rmQgAET0ZHeNpBYEtVPYb18UJmHGjPQ&s=1", "u": "http:\/\/alerts.weather.gov\/cap\/wwacapget.php?x=OR124CCAE88F2C.WindAdvisory.124CCAE9CD24OR.MFRNPWMFR.b3351cd23df7ee2759f052f670a174df&utm_medium=facebook&utm_source=twitterfeed", "t": 1351662473, "hc": 1351661843, "cy": "Poulsbo", "ll": [ 47.753700, -122.612297 ] }
{ "a": "Mozilla\/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "c": "US", "nk": 0, "tz": "America\/Indianapolis", "gr": "IN", "g": "rsE6tG", "h": "oeow2l", "l": "addthis", "hh": "bit.ly", "r": "direct", "u": "http:\/\/www.gsa.gov\/portal\/content\/104109#.TjjjzkOlPCo.twitter", "t": 1351662478, "hc": 1312336782, "cy": "Indianapolis", "ll": [ 39.806198, -86.140701 ] }
{ "a": "Mozilla\/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident\/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)", "c": "RU", "nk": 1, "tz": "Europe\/Moscow", "gr": "48", "g": "Q33nV2", "h": "TQWLo1", "l": "jerrybrown2010", "al": "ru", "hh": "bit.ly", "r": "http:\/\/yandex.ru\/yandsearch?text=bit.ly&lr=213", "u": "http:\/\/gov.ca.gov\/news.php?id=17800", "t": 1351662480, "hc": 1351565210, "cy": "Moscow", "ll": [ 55.752201, 37.615601 ] }
{"_heartbeat_":1351662481}
...
How can I convert my content into JSON records?
Answer 0 (score: 2)
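You are trying to build a dictionary from a string. Drop the json.dumps call that you make last. dumps converts a dictionary to a string; loads converts a string to a dictionary.
So this is your line: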
records = [json.loads(line) for line in file]
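To see why dict() complained, here is a minimal sketch that runs without network access. The sample_lines list is a hypothetical stand-in for the lines read from the URL, not the real data. Passing a string to dict() makes it iterate over single characters, and each character has length 1 rather than being a (key, value) pair, which is what the ValueError in the question is reporting.

```python
import json

# Hypothetical stand-in for the lines read from the bit.ly URL.
sample_lines = [
    '{"c": "US", "nk": 1, "cy": "Saint Louis"}',
    '{"_heartbeat_": 1351662481}',
]

# loads: JSON text -> Python dict
records = [json.loads(line) for line in sample_lines]

# dumps: Python dict -> JSON text (this is what the original code did last)
round_tripped = json.dumps(records[0])

print(type(records[0]))
print(type(round_tripped))

# dict() on a string iterates over its characters, so it fails
# the same way as in the question.
try:
    dict(round_tripped)
    dict_from_str_failed = False
except ValueError:
    dict_from_str_failed = True
```

With only json.loads in the comprehension, each element of records is already a dict, so no further dict() call is needed.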
Answer 1 (score: 1)
json_data = urllib2.urlopen(bitly_data_path)
data = json.loads(json_data.read())
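One caveat, going by the sample in the question: the URL appears to serve one JSON object per line rather than a single JSON document, so parsing the whole body in one json.loads call would likely raise a ValueError. The sketch below illustrates this on a hypothetical in-memory body (an assumption standing in for json_data.read()) and parses it line by line instead. Filtering out the _heartbeat_ entries is an optional extra step, not something the question asked for.

```python
import json

# Hypothetical stand-in for json_data.read(): one JSON object per line.
body = (
    '{"c": "US", "nk": 1}\n'
    '{"c": "RU", "nk": 1}\n'
    '{"_heartbeat_": 1351662481}\n'
)

# Parsing the whole body at once fails because it holds several documents.
try:
    json.loads(body)
    whole_body_ok = True
except ValueError:
    whole_body_ok = False

# Parse each non-empty line separately.
records = [json.loads(line) for line in body.splitlines() if line.strip()]

# Optionally drop the heartbeat entries, which carry no click data.
clicks = [r for r in records if "_heartbeat_" not in r]
```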