Question

I found this Python code that scrapes Twitter using custom search queries:

https://github.com/tomkdickinson/Twitter-Search-API-Python/blob/master/TwitterScraper.py

I want to store the results from this code in a CSV file.

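I tried adding a CSV writer around line 245, inside the for loop that prints out the tweets matching my search query, but the resulting CSV file comes out blank: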

def save_tweets(self, tweets):
    """
    Just prints out tweets
    :return: True always
    """
    for tweet in tweets:
        # Lets add a counter so we only collect a max number of tweets
        self.counter += 1
        if tweet['created_at'] is not None:
            t = datetime.datetime.fromtimestamp((tweet['created_at']/1000))
            fmt = "%Y-%m-%d %H:%M:%S"
            myCsvRow = log.info("%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text']))
            fd = open('document.csv','a')
            fd.write(myCsvRow)
            fd.close()

    return True

Also, around line 170 the code has a comment that says:

@abstractmethod
def save_tweets(self, tweets):
    """
    An abstract method that's called with a list of tweets.
    When implementing this class, you can do whatever you want with these tweets.
    """

How can I use this class to save the tweets?

Answer 1

Your problem appears to be with this line:

myCsvRow = log.info("%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text']))

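Looking at the code on the GitHub page you are using, I can see that log is a Python logger. The purpose of log.info is to write the string it is given somewhere (for example the console, a file, or any combination of these and other places). It does not return a value (the call evaluates to None), so myCsvRow never holds the text you meant to write.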
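To see this for yourself, here is a minimal sketch using only the standard logging module. The logger name "demo" is made up for illustration and is not taken from the scraper.

```python
import logging

# Send log records to the console, roughly like the scraper's logger would.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")  # hypothetical logger name

# log.info emits the message as a side effect; it has no return value.
result = log.info("this text goes to the log, not into the variable")

print(result is None)  # the variable holds None, not the formatted string
```

So anything you want to put in the file has to be built as a string (or a list of fields) first, independently of the logging call.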

What you more likely want is:

myCsvRow = "%i [%s] - %s" % (self.counter, t.strftime(fmt), tweet['text'])

A couple of notes, though:

(1) You are not putting commas between the entries, which is what a CSV file (CSV = comma-separated values) normally has, and

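(2) Writing out CSV rows by hand is actually a bit risky when one of your fields is a text field that may itself contain commas. If you naively write the text out as is, any comma inside a tweet will make whatever reads the CSV think the row has extra fields. Fortunately, Python ships with a csv library that helps you avoid these problems.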
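As a rough sketch of what that could look like, the function below writes one row per tweet with csv.writer. It assumes Python 3 and the same tweet dictionary keys as in the question ('created_at' in milliseconds, 'text'). The function name append_tweets_to_csv and its parameters are invented for this example, so treat it as a starting point, not as the scraper's own API.

```python
import csv
import datetime


def append_tweets_to_csv(tweets, path="document.csv", start_count=0):
    """Append one CSV row per dated tweet; return the updated counter."""
    fmt = "%Y-%m-%d %H:%M:%S"
    counter = start_count
    # newline="" lets the csv module manage line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as fd:
        writer = csv.writer(fd)
        for tweet in tweets:
            counter += 1
            if tweet["created_at"] is not None:
                # 'created_at' is assumed to be in milliseconds, as in the question.
                t = datetime.datetime.fromtimestamp(tweet["created_at"] / 1000)
                # The writer quotes fields containing commas, quotes or newlines.
                writer.writerow([counter, t.strftime(fmt), tweet["text"]])
    return counter
```

Inside your own save_tweets you could do the same thing directly: open the file once before the loop, create the writer, and call writer.writerow for each tweet while still calling log.info separately if you want the console output.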
