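Python: Converting JSON (returned by a URL) to a list

Time: 2011-01-08 14:04:08

Tags: python json

I'm requesting YouTube search suggestions to feed a jQuery autocomplete, but I'm having trouble converting the URL response into the right format.

In my (Django/Python) view I do: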

data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.www.suggest.handleResponse&q=jum&cp=3')

(For simplicity I've hard-coded the search term, 'jum'.)

If I call data2.read() I get what I think is JSON (pasting the URL into a browser returns the same thing):

window.yt.www.suggest.handleResponse(["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}])

I need to return this in a format that jQuery autocomplete can read. I know it works if I can get it into a list, e.g. mylist = ['jumpstyle', 'jump', 'jump around', ...]

and then convert it back to JSON before returning:

json.dumps(mylist)

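(This works if I define mylist directly, as above.)

But I can't get from the data returned by the URL to either a simple list (which I would then convert back to JSON) or some form of JSON that I can return directly for the autocomplete to use.

I've tried, among other things,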

j2 = json.loads(data2)

j2 = json.loads(data2.read())

Hope someone can help!

4 Answers:

Answer 0: (score: 13)

Remove the &jsonp=window.yt.www.suggest.handleResponse part:

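import json
import urllib2

# Without the jsonp parameter the server returns plain JSON.
data = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3')

response = json.load(data)
# response[1] is a list of [suggestion, "", index] triples; keep only the suggestion.
suggestions = [item[0] for item in response[1]]
result = json.dumps(suggestions)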
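
A minimal offline sketch of the same parsing step, runnable on Python 3. SAMPLE_BODY is a shortened copy of the response shown in the question, and extract_suggestions is a hypothetical helper name, not part of any library.

```python
import json

# Shortened sample of the plain-JSON response quoted in the question.
SAMPLE_BODY = (
    '["jum",[["jumpstyle","","0"],["jump","","1"],'
    '["jump around","","2"]],"","","","","",{}]'
)


def extract_suggestions(body):
    """Return only the suggestion strings from a plain-JSON response body."""
    response = json.loads(body)
    # response[0] echoes the query; response[1] holds [text, "", index] triples.
    return [entry[0] for entry in response[1]]


# JSON array of strings, the shape the question says the autocomplete accepts.
autocomplete_json = json.dumps(extract_suggestions(SAMPLE_BODY))
```

In a live view the body would come from the HTTP response (urllib.request.urlopen in Python 3) rather than from a constant.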

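Answer 1: (score: 3)

You are making a JSON-P request, which wraps the JSON in a JavaScript callback function; you actually specified that function yourself in the request :)

Remove the JSON-P parameter from your request and you will get plain JSON back, with no extra Python processing needed.

This should be your request: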

http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3

It returns:

["jum",[["jumpstyle","","0"],["jump","","1"],["jump around","","2"],["jump on it","","3"],["jumper","","4"],["jump around house of pain","","5"],["jumper third eye blind","","6"],["jumbafund","","7"],["jump then fall taylor swift","","8"],["jumpstyle music","","9"]],"","","","","",{}]
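
If the URL is assembled elsewhere, the jsonp parameter can also be dropped programmatically instead of by hand. This is only a sketch for Python 3 using the standard urllib.parse module; drop_query_param is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The JSON-P URL from the question.
JSONP_URL = (
    "http://suggestqueries.google.com/complete/search?"
    "hl=en&ds=yt&client=youtube&hjson=t"
    "&jsonp=window.yt.www.suggest.handleResponse&q=jum&cp=3"
)


def drop_query_param(url, name):
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    # Rebuild the URL with the filtered query string, leaving the rest untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


plain_url = drop_query_param(JSONP_URL, "jsonp")
```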

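Answer 2: (score: 0)

That isn't JSON, it's JavaScript. If you want to use it as JSON, you have to strip the JavaScript part:

j2 = json.loads(data2.read()[37:-1])

But you can change the URL (remove the 'jsonp=window.yt.www.suggest.handleResponse' part) to get pure JSON output: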

>>> data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?hl=en&ds=yt&client=youtube&hjson=t&q=jum&cp=3')
>>> json.loads(data2.read())
[u'jum', [[u'jumpstyle', '', u'0'], [u'jump', '', u'1'], [u'jump around', '', u'2'], [u'jump on it', '', u'3'], [u'jumper', '', u'4'], [u'jump around house of pain', '', u'5'], [u'jumper third eye blind', '', u'6'], [u'jumbafund', '', u'7'], [u'jump then fall taylor swift', '', u'8'], [u'jumpstyle music', '', u'9']], '', '', '', '', '', {}]
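
For the slicing approach, the offset 37 is the length of the callback name plus the opening parenthesis. A sketch (Python 3) that derives the offset from the callback name instead of hard-coding it; strip_jsonp_by_slicing is a hypothetical helper and the wrapped sample is shortened from the question.

```python
import json

# Callback name taken from the jsonp parameter in the question's URL.
CALLBACK = "window.yt.www.suggest.handleResponse"


def strip_jsonp_by_slicing(body, callback=CALLBACK):
    """Parse a JSON-P body by slicing off 'callback(' and the final ')'."""
    body = body.strip()  # tolerate a trailing newline
    prefix = callback + "("
    if not (body.startswith(prefix) and body.endswith(")")):
        raise ValueError("body is not wrapped in the expected callback")
    return json.loads(body[len(prefix):-1])


# Shortened sample of the wrapped response from the question.
wrapped = (
    'window.yt.www.suggest.handleResponse(["jum",[["jumpstyle","","0"],'
    '["jump","","1"]],"","","","","",{}])\n'
)
parsed = strip_jsonp_by_slicing(wrapped)
```

Checking the prefix first means a changed callback name raises an error rather than silently producing a bad slice.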

Answer 3: (score: 0)

The page's output is not valid JSON-encoded data. You need to remove the JS function call that wraps it.

Do it like this:

import urllib2
import re
import json

data2 = urllib2.urlopen('http://suggestqueries.google.com/complete/search?' +
   'hl=en&ds=yt&client=youtube&hjson=t&jsonp=window.yt.' +
   'www.suggest.handleResponse&q=jum&cp=3')

# Strip the leading "callback(" and the trailing ")" to leave plain JSON.
data = re.compile(r'^[^\(]+\(|\)$').sub('', data2.read())
parsedData = json.loads(data)

parsedData is now a Python list.
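
On Python 3 the response body arrives as bytes, so it has to be decoded before a str regex can be applied. A rough sketch of the same unwrap-then-parse idea under that assumption; parse_jsonp and JSONP_RE are hypothetical names, and UTF-8 is an assumed encoding, not something the question confirms.

```python
import json
import re

# Matches "<callback>( ... )" with an optional trailing semicolon and
# captures everything between the outermost parentheses.
JSONP_RE = re.compile(r'^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(raw):
    """Unwrap a JSON-P payload (bytes or str) and return the parsed JSON."""
    # Assumption: the body is UTF-8 encoded.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    match = JSONP_RE.match(text)
    if match is None:
        raise ValueError("payload does not look like a JSON-P call")
    return json.loads(match.group(1))


# Shortened sample of the wrapped response from the question, as bytes.
sample = (
    b'window.yt.www.suggest.handleResponse(["jum",[["jumpstyle","","0"],'
    b'["jump","","1"]],"","","","","",{}])'
)
suggestions = [entry[0] for entry in parse_jsonp(sample)[1]]
```

Unlike the sub() call, this version captures the payload with a group, so input that is not wrapped in a callback is rejected explicitly.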