Question

I would like to retrieve tweets from a specific date based on their hashtags. For this I am using tweepy and the following code:

results = api.search('#brexit OR #EUref', since="2016-06-24",
until="2016-06-30", monitor_rate_limit=True,wait_on_rate_limit=True)

with open('24june_bx.txt', 'w') as f:
    for tweet in results:
        try:
            f.write('{}\n'.format(tweet.text.decode('utf-8')))
        except BaseException as e:
            print 'ascii codec can\'t encode characters'
            continue

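As you can see, I am trying to get all tweets with the hashtag '#brexit' or '#EUref' from the day after the vote and store them in the file '24june_bx.txt'. It kind of works... but I only get 10 tweets in the file. The terminal also reports the exception 7 times and prints 'ascii codec ...'.

What do you think the problem could be?

Sorry for the noobish question.

Thanks a lot.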

Answer 1

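If you use the regular open, you need to encode to utf-8 yourself (tweet.text.encode('utf-8') rather than decode), because tweet.text is already a unicode string.

Alternatively, use the io library and set the encoding to utf-8 so that the file object handles the encoding for you: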

import io

with io.open('24june_bx.txt', 'w', encoding="utf-8") as f:
    for tweet in results:
        try:
            f.write(u'{}\n'.format(tweet.text))
        except UnicodeEncodeError as e:
            print(e)

Answer 2

You can use Tweepy's Cursor together with api.search to fetch as many tweets as you need.

import tweepy

def search_tweets_from_twitter_home(query, max_tweets, from_date, to_date):
    """Search using the Twitter search API.

    result_type="mixed" means both 'recent' and 'popular' tweets will be
    returned in the search results. Returns a generator (for memory
    efficiency).
    """
    # `api` is an authenticated tweepy.API instance created elsewhere.
    # count is the page size; the standard search API caps it at 100.
    searched_tweets = (status._json for status in tweepy.Cursor(
        api.search, q=query, count=100, since=from_date, until=to_date,
        result_type="mixed", lang="en").items(max_tweets))
    return searched_tweets

This returns up to max_tweets tweets, assuming that many are available.

You can then iterate over the generator and write the results to a file.
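
As a rough sketch of that last step (written for Python 3), the generator can be consumed one item at a time and written out as one JSON object per line. The names write_tweets and fake_tweets are hypothetical, and fake_tweets only stands in for the generator returned above so that the example runs without Twitter credentials.

```python
import io
import json
import os
import tempfile

def write_tweets(tweets, path):
    """Write each tweet dict as one JSON line; return how many were written."""
    written = 0
    with io.open(path, 'w', encoding='utf-8') as f:
        for tweet in tweets:
            # ensure_ascii=False keeps non-ASCII characters readable in the file.
            f.write(json.dumps(tweet, ensure_ascii=False) + u'\n')
            written += 1
    return written

def fake_tweets():
    """Hypothetical stand-in for the generator of status._json dicts."""
    yield {"id": 1, "text": u"#brexit caf\u00e9"}
    yield {"id": 2, "text": u"#EUref \u2713"}

out_path = os.path.join(tempfile.mkdtemp(), 'tweets.jsonl')
count = write_tweets(fake_tweets(), out_path)
```

Because the tweets are never collected into a list, memory use stays flat no matter how large max_tweets is.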

Answer 3

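'#brexit OR #EUref'

I think using this as the search query will return tweets containing that exact string. Try searching for '#brexit' and '#EUref' separately, and then concatenate the results.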
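
If you do run the two searches separately, tweets carrying both hashtags will show up in both result sets. A possible way to concatenate them without duplicates is sketched below. The helper merge_unique and the sample dicts are made up for illustration, and the sketch assumes each tweet is a dict with an "id" key (as in status._json).

```python
from itertools import chain

def merge_unique(*result_sets):
    """Concatenate several iterables of tweet dicts, dropping repeated ids."""
    seen = set()
    merged = []
    for tweet in chain(*result_sets):
        if tweet["id"] not in seen:
            seen.add(tweet["id"])
            merged.append(tweet)
    return merged

# Hypothetical results of two separate searches.
brexit_results = [{"id": 1, "text": "#brexit"},
                  {"id": 2, "text": "#brexit #EUref"}]
euref_results = [{"id": 2, "text": "#brexit #EUref"},
                 {"id": 3, "text": "#EUref"}]
merged = merge_unique(brexit_results, euref_results)
```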

Answer 4

Try adding

# -*- coding: utf-8 -*-

as the first line of your script.

Return tweets based on hashtag and save them to file.txt
