Unable to insert multiple rows containing huge Twitter response strings into a MySQL DB?

Date: 2015-08-01 09:27:30

Tags: php mysql twitter mysqli twitter-streaming-api

My code adds the 200 latest tweets (JSON provided by the Twitter API) of each Twitter user to the database. For now the $users array holds only one user (e.g. @katyperry), but it will eventually hold more. For each user in the array, 200 tweets are pulled through the Twitter API. All of this data collection works fine.

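Now the problem: for each user I insert 200 tweets into a MySQL table (I can only use MySQL; there is no other option). I realize that each stringified TwitterResp JSON is huge (maybe that is the problem).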

Example TwitterResp:

{"created_at":"Thu Jul 23 18:25:30 +0000 2015","id":624284214704390145,"id_str":"624284214704390145","text":"when your fragrance is \ud83d\udd25#madpotion https://t.co/UfyPQIwIj4","source":"<a href=\"http://instagram.com\" rel=\"nofollow\">Instagram</a>","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":21447363,"id_str":"21447363","name":"KATY PERRY","screen_name":"katyperry","location":"","description":"CURRENTLY\u2728BEAMING\u2728ON THE PRISMATIC WORLD TOUR 2014/2015!","url":"http://t.co/fxFJjKX30d","entities":{"url":{"urls":[{"url":"http://t.co/fxFJjKX30d","expanded_url":"http://www.katyperry.com","display_url":"katyperry.com","indices":[0,22]}]},"description":{"urls":[]}},"protected":false,"followers_count":73404466,"friends_count":157,"listed_count":143175,"created_at":"Fri Feb 20 23:45:56 +0000 2009","favourites_count":1663,"utc_offset":-28800,"time_zone":"Alaska","geo_enabled":false,"verified":true,"statuses_count":6566,"lang":"en","contributors_enabled":false,"is_translator":false,"is_translation_enabled":true,"profile_background_color":"CECFBC","profile_background_image_url":"http://pbs.twimg.com/profile_background_images/378800000168797027/kSZ-ewZo.jpeg","profile_background_image_url_https":"https://pbs.twimg.com/profile_background_images/378800000168797027/kSZ-ewZo.jpeg","profile_background_tile":false,"profile_image_url":"http://pbs.twimg.com/profile_images/609748341119844352/7dUd606e_normal.png","profile_image_url_https":"https://pbs.twimg.com/profile_images/609748341119844352/7dUd606e_normal.png","profile_banner_url":"https://pbs.twimg.com/profile_banners/21447363/1428015534","profile_link_color":"D55732","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"78C0A8","profile_text_color":"5E412F","profile_use_background_image":true,"has_extended_profile":false,"default_profile":false,"default_profile_image":false,"following":true,"follow_request_sent":false,"notifications":false},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"retweet_count":5366,"favorite_count":10510,"entities":{"hashtags":[{"text":"madpotion","indices":[24,34]}],"symbols":[],"user_mentions":[],"urls":[{"url":"https://t.co/UfyPQIwIj4","expanded_url":"https://instagram.com/p/5fRc5mP-YB/","display_url":"instagram.com/p/5fRc5mP-YB/","indices":[35,58]}]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"lang":"en"}

So when I insert such a huge string on every pass of my insert loop, I end up with 154 rows in the table instead of 200. For some other users I see 186 instead of 200, and so on. No matter how many times I run the code for Katy Perry, I get exactly 154 rows, and the counts for other users are just as consistent. Why does this happen? Does inserting such large strings slow the loop down so that some rows are skipped?


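PS: I also tried adding the tweets in a single query, essentially appending many value tuples: VALUES (...), (...), ... That did not work either.

How can I solve this? Any suggestions?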

1 answer:

Answer 0 (score: 1)

First, modify the code to handle MySQL errors. The error message will almost certainly give you a hint about what is going wrong.

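// Reuse one connection so the error is read from the link that ran the query.
$conn = getConnection();
$res = mysqli_query($conn, $query);
if (false === $res) {
    // mysqli_error() requires the connection as its argument.
    echo "Insertion error: " . mysqli_error($conn);
}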
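As a variation on the check above, here is a minimal sketch that counts failed inserts instead of silently losing them. It assumes a hypothetical table `tweets` with columns `ScreenName` and `TwitterResp` (only `TwitterResp` is named in the question) and a hypothetical function name `insertTweets`. Binding the JSON as a parameter also avoids quoting problems when a tweet contains quotes or backslashes, which may or may not be a factor here.

```php
<?php
// Sketch only: the table `tweets` and the column `ScreenName` are assumptions.
function insertTweets(mysqli $conn, string $screenName, array $tweets): int
{
    // Make mysqli throw exceptions instead of returning false.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $stmt = $conn->prepare(
        'INSERT INTO tweets (ScreenName, TwitterResp) VALUES (?, ?)'
    );

    $inserted = 0;
    foreach ($tweets as $index => $tweet) {
        $json = json_encode($tweet);
        if ($json === false) {
            error_log("Tweet #$index could not be encoded: " . json_last_error_msg());
            continue;
        }
        try {
            // Both values are sent as strings; no manual escaping is needed.
            $stmt->bind_param('ss', $screenName, $json);
            $stmt->execute();
            $inserted++;
        } catch (mysqli_sql_exception $e) {
            // Log the failing row so that missing rows can be explained.
            error_log("Tweet #$index failed: " . $e->getMessage());
        }
    }
    $stmt->close();

    return $inserted;
}
```

Comparing the returned count with `count($tweets)` shows how many rows were rejected, and the log shows why.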

My guess is that you are exceeding the maximum length of the TEXT or BLOB type, whichever you are using for the TwitterResp column.

The TEXT and BLOB types may look as if they hold data of unlimited length, but they do not. TEXT / BLOB can hold at most 65,535 bytes (~64 KB), MEDIUMTEXT / MEDIUMBLOB up to ~16 MB, and LONGTEXT / LONGBLOB up to ~4 GB.
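To check whether a given response actually hits one of these limits, a small helper can compare its byte length against each type's maximum. This is only a sketch; the constant `TEXT_TYPE_LIMITS` and the function `smallestTextType` are hypothetical names introduced for illustration.

```php
<?php
// Maximum byte lengths of the MySQL TEXT family (2^8-1, 2^16-1, 2^24-1, 2^32-1).
const TEXT_TYPE_LIMITS = [
    'TINYTEXT'   => 255,
    'TEXT'       => 65535,
    'MEDIUMTEXT' => 16777215,
    'LONGTEXT'   => 4294967295,
];

// Returns the smallest TEXT type that can hold $value, or null if none can.
function smallestTextType(string $value): ?string
{
    // strlen() counts bytes, not characters, which is what MySQL limits.
    $bytes = strlen($value);
    foreach (TEXT_TYPE_LIMITS as $type => $limit) {
        if ($bytes <= $limit) {
            return $type;
        }
    }
    return null;
}
```

Running this over the 200 encoded tweets before inserting them would show whether any of them is larger than the column type in use.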

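Note that these are only the type limits. You also have to take into account the size of the connection buffer, the amount of available memory, and so on, which can also lead to truncated data. See the MySQL documentation for details.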

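In short, you can try changing the column to a type with a higher capacity, such as MEDIUMTEXT or LONGBLOB. If that is still not enough, I would suggest storing the data in files and saving the file paths in the database.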
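The file-based fallback could look roughly like the sketch below. The function name `saveTweetToFile` and the one-file-per-tweet layout are assumptions; the tweet's `id_str` is used as the file name, and the returned path is what would be stored in the database instead of the JSON itself.

```php
<?php
// Sketch only: writes one tweet's JSON to "<dir>/<tweetId>.json" and returns the path.
function saveTweetToFile(string $dir, string $tweetId, string $json): string
{
    // Accept only numeric ids so the id cannot escape the target directory.
    if (preg_match('/^\d+$/', $tweetId) !== 1) {
        throw new InvalidArgumentException("Unexpected tweet id: $tweetId");
    }
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }

    $path = rtrim($dir, '/') . '/' . $tweetId . '.json';
    if (file_put_contents($path, $json, LOCK_EX) === false) {
        throw new RuntimeException("Cannot write file: $path");
    }

    return $path;
}
```

With this layout the table only needs a short VARCHAR column for the path, so the row size no longer depends on the size of the Twitter response.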