Question

I have a JSON URL and I'm trying to extract data from the response. Here is my code:

url = urllib2.urlopen("https://i1.adis.ws/s/foo/M0011126_001_SET.js?func=app.mjiProduct.handleJSON&protocol=https")
content = url.read()
soup = BeautifulSoup(content, "html.parser")
print(soup.prettify())
print(soup.items)
newDictionary=json.loads(str(soup))

Here is response.content:

app.mjiProduct.handleJSON({"name":"M0011126_001_SET","items":[{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_MAIN","width":3200,"height":4800,"format":"TIFF","opaque":"true"},{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_ALT1","width":3200,"height":4800,"format":"TIFF","opaque":"true"},{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_ALT2","width":3200,"height":4800,"format":"TIFF","opaque":"true"}]});

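I'm new to JSON and can't make sense of the response. I also need to parse it as JSON, or in some other form, to extract the image sources. But the code above gives me this error:

No JSON object could be decoded

Can someone guide me? Thanks.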

Answer 1

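First, your URL does not return the data; it currently returns app.mjiProduct.handleJSON({"status":"error","errorMsg":"Failed to get set"});

Second, you don't need to pass the content to BeautifulSoup. You can pass it directly to json, as I do in the code below, which has no BeautifulSoup object.

I used httpbin for testing, but this should work with your URL once it returns plain JSON. I used Python 3, though.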

from urllib.request import urlopen
import json
url = urlopen("http://httpbin.org/get")
content = url.read()
newDictionary=json.loads(content)
print(newDictionary)

Output: {'args': {}, 'headers': {'Accept-Encoding': 'identity', 'Connection': 'close', 'Host': 'httpbin.org', 'User-Agent': 'Python-urllib/3.6'}, 'origin': '', 'url': 'http://httpbin.org/get'}

Answer 2

The following code worked for me.

array = "1,2,3,4".split(',');

Actually, I found that json_data was a string, and I couldn't decode it because of its format, which was

app.mjiProduct.handleJSON(REQUIRED JSON)

So I first filtered the string down to the JSON part and then loaded it with json, which solved the problem.
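
The filtering step described above might look something like the sketch below. The helper name strip_jsonp is hypothetical (it is not from the original answer), and the sample string is the error response quoted in Answer 1. It assumes the response always has the shape callback(...); and simply keeps everything between the first "(" and the last ")".

```python
import json


def strip_jsonp(text):
    """Return the JSON text inside a wrapper of the form callback(...);

    strip_jsonp is a hypothetical helper name used for illustration.
    """
    start = text.find("(")   # first "(" opens the callback's argument list
    end = text.rfind(")")    # last ")" closes it
    if start == -1 or end == -1 or end < start:
        raise ValueError("response does not look like callback(...)")
    return text[start + 1:end]


# Sample taken from the error response quoted in Answer 1.
raw = 'app.mjiProduct.handleJSON({"status":"error","errorMsg":"Failed to get set"});'
data = json.loads(strip_jsonp(raw))
print(data["errorMsg"])  # prints: Failed to get set
```

Searching for the parentheses instead of hard-coding a length keeps this working if the callback name in the func= query parameter changes.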

Answer 3

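The response does not contain valid JSON. It looks like executable code (probably JavaScript). However, the {"name":"M0011126_001_SET","items":[...]} part is valid JSON. So if you are sure the response always has this format, you can strip the function call like this: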

content = url.read()[26:-2] # Cut first 26 characters and last two
newDictionary=json.loads(str(content))
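
Once the wrapper is cut off, the image sources the question asks about can be read from the "items" list. The following is a minimal Python 3 sketch, not a definitive implementation: it uses an abridged copy of the response from the question as a literal string instead of fetching the URL, and the names raw, prefix, payload and sources are illustrative.

```python
import json

# Abridged copy of the response shown in the question (format/opaque fields omitted).
raw = (
    'app.mjiProduct.handleJSON({"name":"M0011126_001_SET","items":['
    '{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_MAIN","width":3200,"height":4800},'
    '{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_ALT1","width":3200,"height":4800},'
    '{"type":"img","src":"https://i1.adis.ws/i/foo/M0011126_001_ALT2","width":3200,"height":4800}'
    ']});'
)

prefix = "app.mjiProduct.handleJSON("   # 26 characters, matching the slice above
payload = json.loads(raw[len(prefix):-2])  # -2 drops the trailing ");"

# Collect the "src" of every item whose type is "img".
sources = [item["src"] for item in payload["items"] if item.get("type") == "img"]
for src in sources:
    print(src)
```

With a live response in Python 3, url.read() returns bytes, so the content would need to be decoded (for example with .decode("utf-8")) before slicing against a str prefix.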

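I don't know Beautiful Soup well, but as far as I can tell it is a library for working with HTML files, and your response is not HTML, so I don't think you should use it here.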

Unable to understand and parse JSON URL response