Here is how I create an empty data frame, which I intend to fill one row at a time from a data source.
numTweets=31
finalDataFrame = as.data.frame( matrix(NA, numTweets-1, 23), stringsAsFactors=FALSE)
names(finalDataFrame) = c( "TweetID", "TweetTime", "Text", "Source",
"UserID", "Username", "Screenname", "FollowerCount", "FriendCount",
"Location", "Latitude", "Longitude", "ReplyTweetID", "ReplyUserID",
"ReplyScreenname", "RetweetID", "RetweetCreated", "RetweetUsername",
"RetweetScreename", "RetweetLocation", "RetweetFollowers", "RetweetFriends",
"RetweetSource" )
An example of a row I insert is shown below.
print( thisRow, row.names=FALSE )
TweetID TweetTime Text
877010425019158529 Tue Jun 20 03:49:14 +0000 2017 @OmniDestiny I would recommend trying to find the facebook group for evergreen because i think their school facebook page got shut down.
Source UserID Username Screenname FollowerCount FriendCount Location Latitude Longitude ReplyTweetID ReplyUserID ReplyScreenname RetweetID
<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a> 843603187298779137 Albert HellhoyZ 4 72 Bellevue, WA 0 0 876742560328417281 4726147296 OmniDestiny NA
RetweetCreated RetweetUsername RetweetScreename RetweetLocation RetweetFollowers RetweetFriends RetweetSource
NA NA NA NA NA NA NA
So this row looks perfectly fine, and the data frame I created to store it looks fine too. However, when I try to copy the row in with...
## Z minus 1 since we started our loop at 2
finalDataFrame[z-1, ] = thisRow
Many of the values become strange. For example, thisRow shows the ReplyTweetID value (an int64 value) in full as 876742560328417281, but when I look at it in finalDataFrame in R....
finalDataFrame[1, "ReplyTweetID" ]
[1] 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000046816189162956993
>
I'm not sure what could cause such a drastic change. Any ideas?
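Edit: I'm fairly sure it must be because the values are int64 and the matrix doesn't like that. But is there a way to prepare the matrix for this? I could call toString(IDVALUEHERE) when I first build "thisRow", but that doesn't seem like it should be necessary.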
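For reference, here is a minimal base-R sketch of the toString-style workaround mentioned in the edit. It assumes the IDs can be supplied as strings, and it uses a hypothetical two-column frame (typedFrame) instead of the full 23 columns. It also shows that a frame built from matrix(NA, ...) starts with all-logical columns, so nothing in it is prepared for 64-bit IDs. If the IDs are bit64 integer64 values (an assumption; the post only says int64), that class keeps its 64 bits inside a double vector, so losing the class attribute during assignment would make the same bits print as a tiny double, which would match the output above.

```r
# A frame built from matrix(NA, ...) has logical columns only.
finalDataFrame <- as.data.frame(matrix(NA, 3, 2), stringsAsFactors = FALSE)
names(finalDataFrame) <- c("ReplyTweetID", "FollowerCount")
columnClasses <- sapply(finalDataFrame, class)
print(columnClasses)

# Hypothetical alternative: pre-allocate typed columns and keep the
# 64-bit ID as a character string so no digits can be lost.
typedFrame <- data.frame(
  ReplyTweetID  = rep(NA_character_, 3),
  FollowerCount = rep(NA_integer_, 3),
  stringsAsFactors = FALSE
)

# Assign one row as a list so each column keeps its own type.
typedFrame[1, ] <- list("876742560328417281", 4L)
print(typedFrame[1, "ReplyTweetID"])
```

Keeping IDs as character avoids any dependence on how a 64-bit integer class survives row assignment, at the cost of converting back if arithmetic on the IDs is ever needed.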