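I am trying to find the top 10 most popular tweets in Hive based on retweet_count, i.e. the tweet with the highest retweet_count should come first, and so on.
Here are the details of the election table: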
id bigint from deserializer
created_at string from deserializer
source string from deserializer
favorited boolean from deserializer
retweeted_status struct<text:string,user:struct<screen_name:string,name:string>,retweet_count:int> from deserializer
entities struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>> from deserializer
text string from deserializer
user struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:int,time_zone:string,location:string> from deserializer
in_reply_to_screen_name string from deserializer
My query:
select text
from election
where retweeted_status.retweet_count IN
(select retweeted_status.retweet_count as zz
from election
order by zz desc
limit 10);
It returned the same tweet 10 times (TWEET-ABC, TWEET-ABC, TWEET-ABC, ... TWEET-ABC).
So I broke the nested query apart. When I run the inner query on its own
select retweeted_status.retweet_count as zz
from election
order by zz desc
limit 10
it returns 10 different values (1210,1209,1208,1207,1206,...... 1201).
After that I ran the outer query with those values:
select text
from election
where retweeted_status.retweet_count
IN (1210,1209,1208,1207,1206,....1201 );
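The result is the same 10 tweets (TWEET-ABC, TWEET-ABC, TWEET-ABC, ... TWEET-ABC).
What is wrong with the logic of my query?
Answer 0 (score: 0)
You should filter on id rather than on the count. The outer IN matches every row whose retweet_count is one of those ten values, so if 100 tweets share the same count, the LIMIT 10 in the subquery does not help: you will get 100 records.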
select text
from election
where id IN
(select id as zz
from election
order by retweeted_status.retweet_count desc
limit 10);
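I am still not sure why you get the wrong result, though.
Edit (following my comment):
If my comment is right, the subquery returns the same id ten times. In that case, change it to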
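One way to narrow this down is to look at how many rows, distinct ids and distinct texts sit behind each of the highest counts. This is only a diagnostic sketch against the election table from the question; the aliases cnt, num_rows, distinct_ids and distinct_texts are made up for illustration.

```sql
-- For each of the 10 highest retweet counts, show how many rows carry it
-- and whether those rows are really different tweets.
select retweeted_status.retweet_count as cnt,
       count(*)                       as num_rows,
       count(distinct id)             as distinct_ids,
       count(distinct text)           as distinct_texts
from election
group by retweeted_status.retweet_count
order by cnt desc
limit 10;
```

If distinct_ids is larger than distinct_texts, the same text was stored under several ids, which would explain seeing TWEET-ABC repeatedly.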
(select distinct id as zz
from election
order by retweeted_status.retweet_count desc
limit 10);
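If the ten rows turn out to be separate retweets of one original tweet (an assumption; nothing in the question confirms it), ranking by the original tweet instead of by row may be closer to what is wanted. A rough sketch using only columns from the table definition in the question; the aliases original_text and max_retweets are hypothetical names:

```sql
-- Collapse all retweets of the same original tweet into one row,
-- keep the highest retweet_count seen for it, then take the top 10.
select retweeted_status.text               as original_text,
       max(retweeted_status.retweet_count) as max_retweets
from election
where retweeted_status.text is not null    -- skip rows that are not retweets
group by retweeted_status.text
order by max_retweets desc
limit 10;
```

Grouping on retweeted_status.text treats two originals with identical text as one tweet, which is usually acceptable for a top-10 list but is a simplification.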