I am using this code to call the NY Times API and fetch JSON data for articles matching a chosen search query.
import urllib
import re
import json
htmltext = urllib.urlopen('******http://call******')
data = json.load(htmltext)
print data
It prints a result with the following structure:
{u'status': u'OK', u'response': {u'docs': [{u'type_of_material': u'Article', u'blog': [], u'news_desk': None, u'lead_paragraph': u'Hon. PRESTON KING, one of the ablest, most upright, and most influential members of the Democratic party of this State, thus expression his opinion in regard to political prospects in a letter to a friend: OGDENSBURG, Saturday, Sept. 16, 1854.', u'headline': {u'main': u'POLITICAL.; New-York Politics--Letter from Preston King.'}, u'abstract': u'Letter to Jerry Rescue Celebration', u'print_page': u'8', u'word_count': 1526, u'_id': u'4fbfd3e945c1498b0d00ddca', u'snippet': u'Hon. PRESTON KING, one of the ablest, most upright, and most influential members of the Democratic party of this State, thus expression his opinion in regard to political prospects in a letter to a friend: OGDENSBURG, Saturday, Sept. 16, 1854.', u'source': u'The New York Times', u'web_url': u'http://query.nytimes.com/gst/abstract.html?res=950CE6DE1238EE3BBC4B53DFB667838F649FDE', u'multimedia': [], u'subsection_name': None, u'keywords': [{u'name': u'persons', u'value': u'KING, PRESTON'}, {u'name': u'persons', u'value': u'SUMNER CHARLES'}, {u'name': u'persons', u'value': u'BEECHER, HENRY WARD'}], u'byline': None, u'document_type': u'article', u'pub_date': u'1854-10-03T00:03:58Z', u'section_name': None}, {u'type_of_material': u'Article', u'blog': [], u'news_desk': None, u'lead_paragraph': u'MISSISSIPPI LAWS IN WANT OF REMODELING. GOVERNOR McWILLIE, of Mississippl, has summened an extra session of the State Legislature, to assemble on the first Monday in November next, In this State, as in others which have adopted the system of biennial sessions of their Legislatures. the plan has not been found to be the best for the interests of the people.', u'headline': {u'main': u'Article 1 -- No Title', u'kicker': u'1'}, u'abstract': None, u'print_page': u'3', u'word_count': 334, u'_id': u'4fbfe29945c1498b0d04bed8', u'snippet': u'MISSISSIPPI LAWS IN WANT OF REMODELING. 
GOVERNOR McWILLIE, of Mississippl, has summened an extra session of the State Legislature, to assemble on the first Monday in November next, In this State, as in others which have adopted the system of biennial...', u'source': u'The New York Times', u'web_url': u'http://query.nytimes.com/gst/abstract.html?res=9F06E7D61331EE34BC4952DFBE668383649FDE', u'multimedia': [], u'subsection_name': None, u'keywords': [], u'byline': None, u'document_type': u'article', u'pub_date': u'1858-08-11T00:03:58Z', u'section_name': None}, ... u'meta': {u'hits': 150, u'offset': 0, u'time': 38}}, u'copyright': u'Copyright (c) 2013 The New York Times Company. All Rights Reserved.'}
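For the sake of the example, I have pasted the data for only 2 articles here (the call actually returns data for 10 articles).
Now I want to parse that data and extract all of the 'web_url' attributes. How can I do that?
I tried this code: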
import urllib
import re
import json
htmltext = urllib.urlopen('******http://call******')
data = json.load(htmltext)
print data['web_url']
But it gives me this error:
Traceback (most recent call last):
  File "json_trying.py", line 10, in <module>
    print data["web_url"]
KeyError: 'web_url'
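Answer 0 (score: 3)
Take some time to look at the structure of the response. Each 'web_url' lives inside an entry of the 'docs' list, which itself sits under 'response':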
{
    u'status': u'OK',
    u'response': {
        u'docs': [
            {
                ...
                u'web_url': u'http://query.nytimes.com/...',
                ...
            },
            {
                ...
            }
        ]
    }
}
# Loop over every article in the 'docs' list and print its URL.
for doc in data['response']['docs']:
    print doc['web_url']
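If you want the URLs collected in a list rather than printed one by one, a list comprehension over the same path should work. The sketch below is only an illustration: the `data` dictionary is a trimmed-down, made-up stand-in for the parsed response (the example.com URLs are placeholders, not real API output), and it uses `print(...)` with a single argument so it runs on both Python 2 and Python 3.

```python
# Hypothetical, trimmed-down stand-in for the parsed API response.
# The real response has the same nesting: response -> docs -> web_url.
data = {
    "status": "OK",
    "response": {
        "docs": [
            {"web_url": "http://example.com/article-1", "word_count": 1526},
            {"web_url": "http://example.com/article-2", "word_count": 334},
        ],
        "meta": {"hits": 150, "offset": 0},
    },
}

# Walk down to the list of articles, then pull one field out of each entry.
web_urls = [doc["web_url"] for doc in data["response"]["docs"]]

for url in web_urls:
    print(url)
```

With the real response, `web_urls` would hold one entry per returned article.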
Answer 1 (score: 1)
Answer 2 (score: 0)
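I think the web_url key sits under the response key, inside each entry of the docs list, so you should be able to access the first one as data["response"]["docs"][0]["web_url"].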
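Judging by the response printed in the question, 'web_url' is not a direct child of 'response'; it appears inside each entry of the 'docs' list. If you are not sure every response has that shape (for example, when the call fails), a defensive version using `dict.get` may help. This is a minimal sketch: `extract_web_urls` is a hypothetical helper name, and the sample dictionaries are made up for illustration.

```python
def extract_web_urls(data):
    """Return the web_url of every doc, skipping docs that lack one."""
    # Fall back to empty containers if 'response' or 'docs' is missing,
    # so a malformed response yields [] instead of raising KeyError.
    docs = data.get("response", {}).get("docs", [])
    return [doc["web_url"] for doc in docs if doc.get("web_url")]


# Hypothetical sample: one doc with a URL, one without.
sample = {
    "status": "OK",
    "response": {
        "docs": [
            {"web_url": "http://example.com/article-1"},
            {"headline": {"main": "No URL here"}},
        ]
    },
}

urls = extract_web_urls(sample)
print(urls)

# A response without the expected keys gives an empty list.
empty = extract_web_urls({"status": "ERROR"})
print(empty)
```

The trade-off is that silent fallbacks can hide a failed request, so checking the 'status' field first may be worthwhile.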