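I would like to know how to format a CSV file the way the Twitter archive is formatted, so that R can read it without problems (I have run into a number of issues and have not found a solution). The Twitter archive covers a single user's timeline, whereas my CSV (on which I will run sentiment analysis in R) contains tweets returned by a search.
Sample of the Twitter archive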
"tweet_id","in_reply_to_status_id","in_reply_to_user_id","timestamp","source","text","retweeted_status_id","retweeted_status_user_id","retweeted_status_timestamp","expanded_urls"
"81423594213695488","","","2016-12-29 14:18:08 +0000","<a href=""http://twitter.com/download/android"" rel=""nofollow"">Twitter for Android</a>","RT @SwiftOnSecurity: We're going to tell kids that laptops used to store data on tiny mirrors spinning @ 7200rpm and they're going to think…","814187405175570432","2436389418","2016-12-28 19:12:58 +0000",""
"876926582348550143","","","2016-12-22 13:29:16 +0000","<a href=""http://twitter.com/download/android"" rel=""nofollow"">Twitter for Android</a>","RT @MKBHD: Shout-out to everyone going home and becoming family tech support for the holidays","811910809521680384","29873662","2016-12-22 12:26:36 +0000",""
What I have managed to produce so far
"text"
b'RT @notCORYGREGORY: when hillary uses a private email server asking how to print recipes vs when trump takes healthcare from 20+ million am\xe2\x80\xa6'
b'RT @Salon: Germany is giving up on President Trump'
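How I did it in Python:
import csv
import tweepy

# `api` is assumed to be an authenticated tweepy.API instance created earlier
with open('tweets.csv', 'a', newline='') as csvFile:
    csvWriter = csv.writer(csvFile, delimiter=',')
    for tweet in tweepy.Cursor(api.search,
                               q="trump",
                               rpp=100,
                               result_type="recent",
                               include_entities=True,
                               lang="en").items(5):
        print(tweet.text)
        # Under Python 3, writing the encoded bytes yields the b'...' rows shown above
        csvWriter.writerow([tweet.text.encode('utf-8')])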
I am open to solutions in R.

Answer 0 (score: 0)
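I don't fully understand your question, but you may want to look at the twitteR library in R, in particular the function twListToDF. Combined with write.csv, it lets you save the tweets you collected as a CSV file that R can also read.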
write.csv(twListToDF(your_tweets), file="your_tweets.csv")
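
If you would rather stay in Python, here is a minimal sketch of writing rows in the same fully quoted style as the archive sample. It assumes Python 3 and that the tweet text is already a str (not encoded to bytes). The function name write_archive_style_csv, the three-column subset in FIELDS, and the sample row are made up for illustration and are not part of tweepy or the Twitter archive.

```python
import csv

# Hypothetical subset of the archive's columns, chosen for illustration
FIELDS = ["tweet_id", "timestamp", "text"]


def write_archive_style_csv(path, rows):
    """Write dict rows as UTF-8 CSV with every field double-quoted."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, quoting=csv.QUOTE_ALL)
        writer.writeheader()
        # Embedded double quotes are doubled ("") automatically
        writer.writerows(rows)


# Made-up example row; real rows would be built from tweet objects
sample_rows = [
    {
        "tweet_id": "123",
        "timestamp": "2016-12-29 14:18:08 +0000",
        "text": 'He said "hi", then left…',
    },
]
```

After calling write_archive_style_csv("tweets.csv", sample_rows), something like read.csv("tweets.csv", encoding = "UTF-8", stringsAsFactors = FALSE) should be able to load the file in R. Reading the ID column as character is probably worth doing as well, since real tweet IDs are too large to be stored exactly as numbers.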