Question

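I'm using Tweepy in Python 2.7 to store the results of a search query in a CSV file. I'm trying to figure out how to print only the number of unique tweet.ids in my result set. I know that len(list) works, but obviously I haven't initialized a list here. I'm new to Python programming, so the solution may be obvious. Any help is appreciated.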

for tweet in tweepy.Cursor(api.search, 
                q="Wookie", 
                #since="2014-02-14", 
                #until="2014-02-15", 
                lang="en").items(5000000):
    #Write a row to the csv file
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8'), tweet.favorite_count, tweet.user.name, tweet.id])
    print "...%s tweets downloaded so far" % (len(tweet.id))
csvFile.close()

Answer 1

You can use a set to keep track of the unique IDs you have seen so far, and then print how many it contains:

ids = set()
for tweet in tweepy.Cursor(api.search, 
                q="Wookie", 
                #since="2014-02-14", 
                #until="2014-02-15", 
                lang="en").items(5000000):
    #Write a row to the csv file
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8'), tweet.favorite_count, tweet.user.name, tweet.id])
    ids.add(tweet.id) # add new id
    print "number of unique ids seen so far: {}".format(len(ids))
csvFile.close()

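A set is similar to a list, except that it is unordered and keeps only unique elements. Adding an ID that is already in the set leaves it unchanged, so len(ids) is always the number of distinct IDs seen so far.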
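
To see the set behaviour in isolation, without Tweepy or a CSV file, here is a minimal sketch. The helper name running_unique_counts and the sample IDs are made up for illustration and are not part of the original code; the IDs are plain integers standing in for tweet.id values.

```python
def running_unique_counts(ids):
    """Yield the number of distinct IDs seen after each item."""
    seen = set()
    for item_id in ids:
        seen.add(item_id)  # no effect if item_id is already in the set
        yield len(seen)


# Made-up IDs standing in for tweet.id; 101 and 102 each appear twice.
sample_ids = [101, 102, 101, 103, 102]
counts = list(running_unique_counts(sample_ids))
print(counts)
```

The count only goes up when a new ID arrives, which is why the repeated IDs do not change it. The same sketch runs under both Python 2.7 and Python 3.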
