Question

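I have a small script that watches the RSS feed for new questions tagged python, specifically on SO. On the first iteration of the loop it stores the feed in a variable, then keeps checking the feed against the stored copy. If the feed changes, it updates the variable, prints the newest entry to the console, and plays a sound file to alert me that there is a new question. All in all it is quite handy, since I don't have to keep an eye on anything. However, there is a lag between a new question actually being posted and my script detecting the feed update. The lag varies in length, but it is usually not instant, and the alert tends to arrive only after enough has happened on the question that it has already been dealt with. Not always, but generally. Is there a way to get quicker updates/alerts? Or is this as good as it gets? (It occurred to me that this particular feed might only update when there is actual activity on a question... does anyone know whether that is the case?)

Have I misunderstood how RSS actually works?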

import urllib2
import mp3play
import time
from xml.dom import minidom



def SO_notify():
    """ play alarm when rss is updated """

    rss = ''
    filename = "path_to_soundfile"
    mp3 = mp3play.load(filename)
    mp3.volume(25)

    while True:
        html = urllib2.urlopen("http://stackoverflow.com/feeds/tag?tagnames=python&sort=newest")
        new_rss = html.read()
        if new_rss != rss:
            # feed text changed: remember it and announce the newest entry
            rss = new_rss
            feed = minidom.parseString(rss)
            new_entry = feed.getElementsByTagName('entry')[0]
            title = new_entry.getElementsByTagName('title')[0].childNodes[0].nodeValue
            print(title)
            mp3.play()
        # sleep on every pass, including when nothing changed
        time.sleep(30) #Edit - thanks to all who suggested this

SO_notify()

Answer 1

Something like this:

import requests
import mp3play
import time

curr_ids = []
filename = "path_to_soundfile"
mp3 = mp3play.load(filename)
mp3.volume(25)

while True:
    api_json = requests.get("http://api.stackoverflow.com/1.1/questions/unanswered?order=desc&tagged=python").json()
    new_questions = []
    all_questions = []
    for q in api_json["questions"]:
        all_questions.append(q["question_id"])
        if q["question_id"] not in curr_ids:
            new_questions.append(q["question_id"])
    if new_questions:
        print(new_questions)
        mp3.play()
    curr_ids = all_questions
    time.sleep(30)

I used the requests package here because urllib gave me some encoding trouble.
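The bookkeeping inside that loop can be pulled out into a small pure function, which makes it easy to check without touching the network. This is only a sketch: it assumes the response keeps the shape used above (a "questions" list of dicts with a "question_id" key), and split_new_questions is a made-up helper name, not part of requests or of the API.

```python
def split_new_questions(questions, seen_ids):
    """Return (new_ids, all_ids) for one polling round.

    questions: list of dicts with a "question_id" key (assumed shape,
               mirroring the JSON used in the loop above).
    seen_ids:  ids returned by the previous poll.
    """
    seen = set(seen_ids)  # set membership avoids rescanning a list each time
    all_ids = [q["question_id"] for q in questions]
    new_ids = [qid for qid in all_ids if qid not in seen]
    return new_ids, all_ids


# Example with made-up ids: 3 was not in the previous poll.
sample = [{"question_id": 3}, {"question_id": 2}, {"question_id": 1}]
new_ids, all_ids = split_new_questions(sample, [1, 2])
print(new_ids)  # [3]
```

In the loop, the two return values would take the place of new_questions and all_questions.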

Answer 2

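IMHO, there are two possible solutions, depending on the approach you prefer:

Use JSON - this gives you a nice dictionary with all the entries.
Use RSS (XML). In that case you need something like feedparser to handle the XML.

Either way, the code should look something like this: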

    import time
    import mp3play

    # ids seen in the last batch; a set keeps lookups fast
    curr_ids = set()

    filename = "path_to_soundfile"
    mp3 = mp3play.load(filename)
    mp3.volume(25)

    # Loop
    while True:
        # Get the list of entries in objects
        # (get_list_of_entries is a placeholder, see the notes below)
        entries = get_list_of_entries()

        new_ids = []

        for entry in entries:
            # Check if we reached the most recent entry
            if entry.id in curr_ids:
                # Force loop end if we did
                break

            new_ids.append(entry.id)

            # Do whatever operations
            print(entry.title)

        if len(new_ids) > 0:
            mp3.play()
            curr_ids = set(new_ids)
        else:
            # No updates in the meantime
            pass

        time.sleep(30)

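A few notes:

I sort by "oldest" so that the printed entries read like a stream, with the most recent one printed last.
The purpose of new_ids is to keep the list of ids to a minimum. Otherwise lookups get slower over time.
get_list_of_entries() stands in for whatever fetches the entries from the source (objects from XML or dicts from JSON). Depending on the approach you choose, you reference them differently (but the principle is the same).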
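For the XML route, here is one way the placeholder could be filled in using only the standard library instead of feedparser. Treat it as a sketch under assumptions: it assumes the feed is Atom (the question's code reads entry elements, which suggests so), that entries arrive newest first (which the early break relies on), and the names parse_entries, collect_new and Entry are invented for illustration. Fetching the feed text is left out so the example runs offline.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

ATOM_NS = "{http://www.w3.org/2005/Atom}"  # namespace used by Atom feeds

Entry = namedtuple("Entry", ["id", "title"])


def parse_entries(atom_xml):
    """Turn an Atom document into a list of Entry objects, in feed order."""
    root = ET.fromstring(atom_xml)
    entries = []
    for node in root.findall(ATOM_NS + "entry"):
        entries.append(Entry(
            id=node.findtext(ATOM_NS + "id"),
            title=node.findtext(ATOM_NS + "title"),
        ))
    return entries


def collect_new(entries, curr_ids):
    """Walk the entries in order and stop at the first id already seen."""
    new_entries = []
    for entry in entries:
        if entry.id in curr_ids:
            break
        new_entries.append(entry)
    return new_entries


# Made-up feed with two entries; "q/2" was seen on the previous poll.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>q/3</id><title>Third</title></entry>
  <entry><id>q/2</id><title>Second</title></entry>
</feed>"""

fresh = collect_new(parse_entries(SAMPLE), {"q/2"})
print([e.title for e in fresh])  # ['Third']
```

With the JSON route the parsing step would be replaced by reading the ids and titles out of the decoded dict, and collect_new would stay the same.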

RSS does not update immediately
