Question

Hacker News has released an API. How do I use it in Python?

I want to get all the top stories. I tried using urllib, but I think I am doing something wrong.

Here is my code:

import urllib2
response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
html = response.read()
print response.read()

It only prints an empty string:

''

I had left out a line; I have updated my code.

Answer 1

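As @jonrsharpe explained, read() is a one-time operation. So if you print html instead, you will get the list of all the story IDs. To go through that list, you have to make a separate request for each ID to fetch its story.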

First, you have to convert the data you received into a Python list, then loop over it:

import json
import urllib2

base_url = 'https://hacker-news.firebaseio.com/v0/item/{}.json?print=pretty'
# html is the string read from the topstories endpoint in the question
top_story_ids = json.loads(html)
for story in top_story_ids:
    response = urllib2.urlopen(base_url.format(story))
    print response.read()
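The snippet above is Python 2. On Python 3, where urllib2 no longer exists, the same loop might look roughly like the sketch below. The names fetch_json and top_stories, and the limit and fetch parameters, are made up for this illustration and are not part of any library; only the two endpoint URLs come from the API itself.

```python
import json
from urllib.request import urlopen

API_ROOT = 'https://hacker-news.firebaseio.com/v0'


def fetch_json(url):
    """Download a URL and decode its JSON body (hypothetical helper)."""
    with urlopen(url) as response:
        # read() can only be called once, so decode its result right away
        return json.loads(response.read().decode('utf-8'))


def top_stories(limit=10, fetch=fetch_json):
    """Return the first `limit` top stories as dictionaries.

    `fetch` is a hypothetical hook so the loop can be tried without
    network access by passing in a stand-in function.
    """
    story_ids = fetch(API_ROOT + '/topstories.json')
    return [fetch('{}/item/{}.json'.format(API_ROOT, story_id))
            for story_id in story_ids[:limit]]
```

Calling top_stories(limit=5) would make six requests: one for the ID list and one per story. The limit is there because fetching every top story means one request per ID.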

Instead of all this, you can use haxor, a Python wrapper for the Hacker News API. The following code fetches the IDs of all the top stories:

from hackernews import HackerNews
hn = HackerNews()
top_story_ids = hn.top_stories()
# >>> top_story_ids
# [8432709, 8432616, 8433237, ...]

You can then loop over those IDs and print each story, for example:

for story in top_story_ids:
    print hn.get_item(story)

Disclaimer: I wrote haxor.

Answer 2

You should

print html

instead of

print response.read()

Why? Because read is a one-time operation; once you have called it, you cannot repeat it:

>>> import urllib2
>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty')
>>> response.read()
'[ 8445087, 8444739, 8444603, 8443981, 8444976, 8443902, 8444252, 8444634, 8444931, 8444272, 8444025, 8441939, 8444510, 8444640, 8443830, 8445076, 8443470, 8444785, 8443028, 8444077, 8444832, 8443841, 8443467, 8443309, 8443187, 8443896, 8444971, 8443360, 8444601, 8443287, 8441095, 8441681, 8441055, 8442712, 8444909, 8443621, 8442596, 8443836, 8442266, 8443298, 8445122, 8443096, 8441699, 8442119, 8442965, 8440486, 8442093, 8443393, 8442067, 8444989, 8440985, 8444622, 8438728, 8442555, 8444880, 8442004, 8443185, 8444370, 8436210, 8437671, 8439641, 8443727, 8441702, 8436309, 8441041, 8437367, 8422087, 8441711, 8438063, 8444212, 8439408, 8442049, 8440989, 8439367, 8438515, 8437403, 8435278, 8442486, 8442730, 8428522, 8438904, 8443450, 8432703, 8430412, 8422928, 8443635, 8439267, 8440191, 8439560, 8437230, 8442556, 8439977, 8444140, 8441682, 8443776, 8441209, 8428632, 8441388, 8422599, 8439547 ]\n'
>>> response.read()
''

However, in your case you have assigned the string returned by read to the name html, so you still have access to that string.
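If you want to see the one-shot behaviour without touching the network, here is a small Python 3 sketch. It assumes that io.BytesIO is a fair stand-in for the response object, since both are file-like streams that are consumed as they are read; the three IDs are just the first few from the output above.

```python
import io

# BytesIO stands in for the urlopen() response (an assumption made so the
# example runs offline); both are file-like objects read from start to end.
response = io.BytesIO(b'[ 8445087, 8444739, 8444603 ]\n')

html = response.read()    # the first read consumes the whole body
again = response.read()   # the stream is now exhausted

print(repr(html))
print(repr(again))
```

The second read returns an empty value, which is why keeping the first result in a name such as html is what matters.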

Once you have the story IDs, you can access each story through '.../v0/item/{item number}.json?print=pretty':

>>> response = urllib2.urlopen('https://hacker-news.firebaseio.com/v0/item/8445087.json?print=pretty')
>>> print response.read()
{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "text" : "",
  "time" : 1413116430,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}

Before going any further, you should read through the API documentation. It is also worth getting to grips with the json module.
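As a starting point with the json module, the sketch below (Python 3) parses the item body shown above and pulls out a few fields. It assumes the time field is a Unix timestamp in seconds, and the variable names raw, item, posted and summary are only illustrative.

```python
import json
from datetime import datetime, timezone

# The item body from the example above, as the string read() would return
raw = '''{
  "by" : "lalmachado",
  "id" : 8445087,
  "kids" : [ 8445205, 8445195, 8445173, 8445103 ],
  "score" : 21,
  "text" : "",
  "time" : 1413116430,
  "title" : "Show HN: Powerful ASCII art editor designed for the Mac",
  "type" : "story",
  "url" : "http://monodraw.helftone.com/"
}'''

# json.loads turns the JSON object into an ordinary Python dict
item = json.loads(raw)

# Assumption: "time" holds seconds since the Unix epoch
posted = datetime.fromtimestamp(item['time'], tz=timezone.utc)

summary = '{} ({} points, by {})'.format(
    item['title'], item['score'], item['by'])
print(summary)
print(posted.isoformat())
```

The kids list holds further item IDs, so the same item URL pattern can be used to fetch each of them in turn.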
