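Hi: I have a twitteR script that scrapes tweets just fine. However, when I convert the results to a data.frame and write them out with write.table(), some tweets get split awkwardly across rows. Will this cause problems when I try to analyze them?
I have attached a few images of the csv file to try to illustrate the problem.
I see odd symbols in a lot of the rows, which I think are related to character encoding, but the splits do not necessarily happen at those points, so I don't know what is going on.
Here is the code: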
options(httr_oauth_cache=T)
Sys.setenv(TZ='EST')
setup_twitter_oauth(consumer_key, consumer_secret, access_token,
access_token_secret)
#Get #onpoli tweet
onpoli<-searchTwitter('#onpoli+#pcpoldr+#pcpo', resultType='recent', n=1500)
#Turn to data.frames
onpolidf<-twListToDF(onpoli)
#Write out to .csv files
write.table(onpolidf, paste('Tweets/', format(Sys.time(), "%m-%d-%H-%M"),
'.csv', sep=''), append=T, sep=',', col.names=T)
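One thing that may be worth checking is whether the splits come from line breaks embedded in the tweet text. This is an assumption; nothing above confirms it. The sketch below uses a small made-up data frame, `tweets`, as a hypothetical stand-in for the result of twListToDF(). It collapses embedded line breaks and writes the file with write.csv(), which doubles embedded quotes instead of backslash-escaping them as write.table() does by default.

```r
# Hypothetical stand-in for the data frame returned by twListToDF():
# one tweet contains an embedded newline, the other a double quote and a comma.
tweets <- data.frame(
  text = c("First line\nsecond line #onpoli", "He said \"hi\", then left"),
  id = c("1", "2"),
  stringsAsFactors = FALSE
)

# Collapse any run of CR/LF characters inside the text into a single space,
# so that each tweet occupies exactly one physical line in the output file.
tweets$text <- gsub("[\r\n]+", " ", tweets$text)

# write.csv() quotes character fields and doubles embedded quotes;
# fileEncoding = "UTF-8" states the output encoding explicitly.
out_file <- tempfile(fileext = ".csv")
write.csv(tweets, out_file, row.names = FALSE, fileEncoding = "UTF-8")

# Read the file back to confirm that the rows survive the round trip.
check <- read.csv(out_file, stringsAsFactors = FALSE, encoding = "UTF-8",
                  colClasses = "character")
```

If the real file still splits after a step like this, the cause is probably something else, for example the way the spreadsheet program interprets the file's encoding or quoting.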
The output of sessionInfo() is as follows:
R version 3.4.1 (2017-06-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] twitteR_1.1.9
loaded via a namespace (and not attached):
[1] bit_1.1-12 httr_1.2.1 compiler_3.4.1 rjson_0.2.15 R6_2.2.2 DBI_0.7 tools_3.4.1 curl_2.7
[9] yaml_2.1.16 bit64_0.9-7 openssl_0.9.6