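I am doing some analysis on Twitter data. I have a vector containing some tweet IDs from a Twitter dataset, and I need to retrieve the tweets that match those IDs. This is what I am trying:
My vector of IDs: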
inf_tweets_ic
Vertex sequence:
[1] "513674833322725376" "513688141702922240" "513722572106899456" "513698876625154048" "513699086495514624"
[6] "513699262371094528" "513699249142251520" "513699475613687808" "513699853516279808" "513700009519644672"
[11] "513700240851886080" "513700414265765888" "513700383751823360" "513700800430739456" "513700785201229824"
The tweets dataset looks like this:
tweets[1:4,c(1:3,5:7)]
            tweet_id    user_id    screen_name                             hashtags retweet retweeted
1 513657169858674688   24291371    MobilePunch                                        false     false
2 513657169745440768   59461030    OBAMA_GAMES tcot;cspj;news;teaparty;tlot;ows;p2;   false     false
3 513657165731889152   82110213 onevoicechange                                        false     false
4 513657163479515136 2654587483    NycInfidel1                        ImpeachObama;   false     false
I then changed tweet_id from long to factor so that I could easily retrieve the tweets whose IDs are in inf_tweets_ic, only to get this:
tweets$tweet_id <- factor(tweets$tweet_id)
tweets[tweets$tweet_id %in% inf_tweets_ic,]
[1] tweet_id user_id screen_name text hashtags
[6] retweet retweeted location reply_to_screen_name lang
[11] timestamp fav_count rt_count
<0 rows> (or 0-length row.names)
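One thing that may be worth checking is whether both sides of %in% actually hold the same ID strings. Below is a minimal sketch with toy data, assuming the IDs can be kept as character strings on both sides. The names tweets_demo and wanted_ids are hypothetical stand-ins, not the real objects, and the "Vertex sequence:" header above suggests inf_tweets_ic may not be a plain character vector, in which case the ID strings would have to be extracted from it first.

```r
# Toy stand-ins for the real objects (hypothetical data, not the actual dataset)
tweets_demo <- data.frame(
  tweet_id    = c("513657169858674688", "513674833322725376", "513688141702922240"),
  screen_name = c("MobilePunch", "user_a", "user_b"),
  stringsAsFactors = FALSE
)
wanted_ids <- c("513674833322725376", "513688141702922240")

# Compare the IDs as character strings on both sides of %in%
hits <- tweets_demo[as.character(tweets_demo$tweet_id) %in% as.character(wanted_ids), ]
nrow(hits)  # 2
```

In this toy case the match succeeds because both sides are exact strings. Whether the same holds for the real data depends on how tweet_id was read in.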
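I also tried converting inf_tweets_ic to numeric so that I would not have to change the tweet_id column to a factor, but as.numeric() returns sorted integer codes instead of converting the ID values: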
as.numeric(inf_tweets_ic)
[1] 164 347 886 491 493 495 496 497 501 504 505 509 510 512 513 523 525 537 592 941 428 410 283 589 546 606 608 611
[29] 613 614 616 621 625 661 671 704 708 709 710 718 719 722 724 725 734 742 946 681 699
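The small integers above look like what as.numeric() returns for an index-backed object such as a factor. The sketch below reproduces that behaviour under the assumption that the IDs are stored as a factor; ids_f is a hypothetical stand-in, and the real inf_tweets_ic may be a different class.

```r
# Hypothetical stand-in: three of the IDs stored as a factor
ids_f <- factor(c("513722572106899456", "513674833322725376", "513688141702922240"))

as.numeric(ids_f)     # 3 1 2: positions in the sorted levels, not the IDs
as.character(ids_f)   # the original ID strings, unchanged

# Doubles represent integers exactly only up to 2^53, and these IDs are far larger
2^53 + 1 == 2^53      # TRUE
```

The last line is a reminder that even a correct numeric conversion cannot be relied on to keep 18-digit IDs distinct, so comparing them as strings is the more cautious option.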
What am I doing wrong?