How do I fix Python 3 KeyError: '"id_str"'?

Time: 2017-05-05 16:05:21

Tags: python json python-3.x twitter runtime-error

I am scraping tweets with the script below. When I run it I get the following two errors, but the KeyError is the one I care about most:

  

Traceback (most recent call last):
  File "Exporter.py", line 76, in main
    got3.manager.TweetManager.getTweets(tweetCriteria, receiveBuffer)
  File "/Users/MacbookPro/PycharmProjects/cs446/trend_map-master/getoldtweets/got3/manager/TweetManager.py", line 40, in getTweets
    json_to_use = ((json['items_html']).format('div.js-stream-tweet'))
KeyError: '"id_str"'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Exporter.py", line 85, in <module>
    main(sys.argv[1:])
  File "Exporter.py", line 78, in main
    except arg:
TypeError: catching classes that do not inherit from BaseException is not allowed

The code is as follows:

@staticmethod
def getTweets(tweetCriteria, receiveBuffer=None, bufferLength=100):
    refreshCursor = ''

    results = []
    resultsAux = []
    cookieJar = http.cookiejar.CookieJar()

    active = True

    while active:
        json = TweetManager.getJsonReponse(tweetCriteria, refreshCursor, cookieJar)
        if len(json['items_html'].strip()) == 0:
            break

        refreshCursor = json['min_position']
        # print(json['items_html']('div.js-stream-tweet'))
        json_to_use = ((json['items_html']).format('div.js-stream-tweet'))
        tweets = PyQuery(json_to_use)
        # print(tweets)

        # print(len(tweets))

        if len(tweets) == 0:
            break

        for tweetHTML in tweets:
            tweetPQ = PyQuery(tweetHTML)
            tweet = models.Tweet()

            usernameTweet = tweetPQ("span.username.js-action-profile-name b").text()
            txt = re.sub(
                r"\s+", " ", tweetPQ("p.js-tweet-text").text().replace('# ', '#').replace('@ ', '@'))
            retweets = int(tweetPQ("span.ProfileTweet-action--retweet span.ProfileTweet-actionCount").attr(
                "data-tweet-stat-count").replace(",", ""))
            favorites = int(tweetPQ("span.ProfileTweet-action--favorite span.ProfileTweet-actionCount").attr(
                "data-tweet-stat-count").replace(",", ""))
            dateSec = int(tweetPQ("small.time span.js-short-timestamp").attr("data-time"))
            id = tweetPQ.attr("data-tweet-id")
            permalink = tweetPQ.attr("data-permalink-path")
            user_id = int(tweetPQ("a.js-user-profile-link").attr("data-user-id"))

            geo = TweetManager.findLocation(usernameTweet)
            urls = []
            for link in tweetPQ("a"):
                try:
                    urls.append((link.attrib["data-expanded-url"]))
                except KeyError:
                    pass
            tweet.id = id
            tweet.permalink = 'https://twitter.com' + permalink
            tweet.username = usernameTweet

            tweet.text = txt
            tweet.date = datetime.datetime.fromtimestamp(dateSec)
            tweet.formatted_date = datetime.datetime.fromtimestamp(
                dateSec).strftime("%a %b %d %X +0000 %Y")
            tweet.retweets = retweets
            tweet.favorites = favorites
            tweet.mentions = " ".join(re.compile('(@\\w*)').findall(tweet.text))
            tweet.hashtags = " ".join(re.compile('(#\\w*)').findall(tweet.text))
            tweet.geo = geo
            tweet.urls = ",".join(urls)
            tweet.author_id = user_id
            tweet.tweetPQ = tweetPQ
            tweet.rawhtml = tweetHTML
            tweet.tweets = tweets
            tweet.alljson = json

            results.append(tweet)
            resultsAux.append(tweet)

            if receiveBuffer and len(resultsAux) >= bufferLength:
                receiveBuffer(resultsAux)
                resultsAux = []

            if tweetCriteria.maxTweets > 0 and len(results) >= tweetCriteria.maxTweets:
                active = False
                break

    if receiveBuffer and len(resultsAux) > 0:
        receiveBuffer(resultsAux)

    return results

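I'm not sure what the problem is. I tried looking it up but found nothing useful; I suspect it mainly comes down to how I format or make the request. Any help would be greatly appreciated!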
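For reference, both errors in the traceback can probably be reproduced in isolation. The sketch below is a guess at the cause, not a confirmed diagnosis. It assumes the `items_html` payload contains literal braces such as `{"id_str": ...}` and that `arg` in `except arg:` is bound to something that is not an exception class. The sample string, the value of `arg`, and all function names are made up for illustration.

```python
# Hypothetical stand-in for json['items_html']; the real payload is not shown.
sample_html = '<div class="js-stream-tweet">{"id_str":"123"}</div>'


def reproduce_key_error(html):
    """Call str.format on text containing braces and return the missing key."""
    try:
        # str.format treats {"id_str":...} as a replacement field named "id_str"
        html.format('div.js-stream-tweet')
    except KeyError as exc:
        return exc.args[0]
    return None


def escape_braces(text):
    """Double the braces so str.format leaves them as literal characters."""
    return text.replace('{', '{{').replace('}', '}}')


def reproduce_type_error():
    """Use a non-exception object in an except clause and return the message."""
    arg = "some command-line value"  # assumption: arg is a plain string
    try:
        try:
            raise KeyError('"id_str"')
        except arg:  # Python 3 rejects this when it tries to match the exception
            pass
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_key_error(sample_html))
    print(reproduce_type_error())
    print(escape_braces(sample_html).format('unused') == sample_html)
```

If the intent of line 40 was to select the `div.js-stream-tweet` elements rather than to format a string, then `PyQuery(json['items_html'])('div.js-stream-tweet')` would avoid `str.format` entirely. The commented-out line just above it hints at that, but it is an assumption about the intent. Likewise, `except arg:` would need to name an exception class (for example `except Exception as err:`) for the handler to be valid.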

0 answers:

No answers