How do I limit a list in Python to show N records for each unique row?

Asked: 2018-05-30 15:56:53

Tags: python mysql sql python-3.x pymysql

I am trying to limit the MySQL output of my query so that only the first N records of each genre are shown. Here is my code:

def selectTopNactors(n):

    # Create a new connection
    con = connection()

    # Create a cursor on the connection
    cur = con.cursor()

    # Convert n to an integer (a bare int(n) discards its result)
    n = int(n)

    # Execute the query
    sql = """SELECT g.genre_name, a.actor_id,COUNT(mg.genre_id) as num_mov
    FROM actor as a, role as r,movie as m,genre as g, movie_has_genre as mg
    WHERE a.actor_id = r.actor_id AND m.movie_id = r.movie_id
          AND m.movie_id = mg.movie_id AND g.genre_id = mg.genre_id
          AND (g.genre_id, m.movie_id) IN (SELECT g.genre_id, m.movie_id
           FROM movie as m, genre as g, movie_has_genre as mg
           WHERE m.movie_id = mg.movie_id AND mg.genre_id = g.genre_id
           ORDER BY g.genre_id)
           GROUP BY g.genre_name, a.actor_id
           ORDER BY g.genre_name, COUNT(*) desc """

    cur.execute(sql)

    results = cur.fetchall()

    # Build a list of tuples, with a header row first
    listac = [("genreName", "actorId", "numberOfMovies")]
    for row in results:
        listac.append((row[0], row[1], row[2]))

    print(n)
    con.commit()
    return listac

The list it returns is huge (6000+ records), so I want to show only N records for each genre. The returned list is here

1 Answer:

Answer 0: (score: 1)

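In versions of MySQL before 8.0, we can emulate analytic functions by using user-defined variables in a carefully crafted query. Note that we are relying on behavior of user-defined variables that is not guaranteed (this is documented in the MySQL Reference Manual).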

SELECT @rn := IF(c.genre_name=@prev_genre,@rn+1,1) AS rn
     , @prev_genre := c.genre_name                 AS genre_name
     , c.actor_id                                  AS actor_id
     , c.num_mov                                   AS num_mov
  FROM ( SELECT @prev_genre := NULL, @rn := 0 ) i
 CROSS 
  JOIN ( SELECT g.genre_name
              , a.actor_id
              , COUNT(1) AS num_mov
           FROM actor a
           JOIN role r
             ON r.actor_id = a.actor_id
           JOIN movie m
             ON m.movie_id = r.movie_id
           JOIN movie_has_genre mg
             ON mg.movie_id = m.movie_id
           JOIN genre g
             ON g.genre_id = mg.genre_id
          GROUP
             BY g.genre_name
              , a.actor_id
          ORDER
             BY g.genre_name
              , COUNT(1) DESC
              , a.actor_id
       ) c
HAVING rn <= 4
 ORDER
    BY c.genre_name
     , c.num_mov DESC
     , c.actor_id

The literal 4 in the HAVING clause stands for the value N from the question.
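To make the row-numbering logic easier to follow, here is a rough Python sketch of what the two user-defined variables do as MySQL walks the ordered rows. The function name `top_n_per_genre` and the sample rows are made up for illustration; this is not part of the query itself, and it assumes the rows already arrive ordered by genre, then by movie count descending.

```python
def top_n_per_genre(rows, n):
    """Keep the first n rows of each genre from rows already sorted by genre."""
    prev_genre = None   # plays the role of @prev_genre
    rn = 0              # plays the role of @rn
    kept = []
    for genre_name, actor_id, num_mov in rows:
        # Same rule as IF(c.genre_name=@prev_genre, @rn+1, 1):
        # keep counting inside a genre, restart at 1 when the genre changes.
        rn = rn + 1 if genre_name == prev_genre else 1
        prev_genre = genre_name
        if rn <= n:     # same filter as HAVING rn <= 4, with n in place of 4
            kept.append((rn, genre_name, actor_id, num_mov))
    return kept


# Hypothetical sample rows, ordered by genre_name, num_mov DESC, actor_id
rows = [
    ("Action", 7, 12),
    ("Action", 3, 9),
    ("Action", 5, 9),
    ("Drama", 2, 20),
    ("Drama", 7, 4),
]

for row in top_n_per_genre(rows, 2):
    print(row)
```

The same function could also be applied to the rows returned by `cur.fetchall()` if you would rather trim the list on the Python side, at the cost of still transferring all 6000+ rows from the server.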

In MySQL 8.0, we can use the newly introduced analytic (window) functions to get an equivalent result:

SELECT t.rn
     , t.genre_name
     , t.actor_id
     , t.num_mov
  FROM ( SELECT ROW_NUMBER() OVER(PARTITION BY c.genre_name ORDER BY c.num_mov DESC, c.actor_id)
                AS rn
              , c.genre_name                         AS genre_name
              , c.actor_id                           AS actor_id
              , c.num_mov                            AS num_mov
           FROM ( SELECT g.genre_name
                       , a.actor_id
                       , COUNT(1) AS num_mov
                    FROM actor a
                    JOIN role r
                      ON r.actor_id = a.actor_id
                    JOIN movie m
                      ON m.movie_id = r.movie_id
                    JOIN movie_has_genre mg
                      ON mg.movie_id = m.movie_id
                    JOIN genre g
                      ON g.genre_id = mg.genre_id
                   GROUP
                      BY g.genre_name
                       , a.actor_id
                ) c
       ) t
 -- the alias of a window function cannot be used in HAVING, so filter in an outer query
 WHERE t.rn <= 4
 ORDER
    BY t.genre_name
     , t.num_mov DESC
     , t.actor_id
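To pass N in from the Python function instead of hard-coding the literal 4, one option is to bind it as a query parameter. The sketch below assumes a PyMySQL-style cursor, where `%s` is the placeholder and the arguments go in as a tuple. The names `TOP_N_SQL` and `select_top_n_actors` are mine, and the query filters on `rn` with WHERE in an outer query (MySQL 8.0 or later). I have not run it against the asker's schema, so treat it as a starting point.

```python
# Window-function query with a %s placeholder where the limit N goes.
TOP_N_SQL = """
SELECT t.genre_name, t.actor_id, t.num_mov
  FROM ( SELECT c.genre_name, c.actor_id, c.num_mov
              , ROW_NUMBER() OVER (PARTITION BY c.genre_name
                                   ORDER BY c.num_mov DESC, c.actor_id) AS rn
           FROM ( SELECT g.genre_name, a.actor_id, COUNT(1) AS num_mov
                    FROM actor a
                    JOIN role r ON r.actor_id = a.actor_id
                    JOIN movie m ON m.movie_id = r.movie_id
                    JOIN movie_has_genre mg ON mg.movie_id = m.movie_id
                    JOIN genre g ON g.genre_id = mg.genre_id
                   GROUP BY g.genre_name, a.actor_id
                ) c
       ) t
 WHERE t.rn <= %s
 ORDER BY t.genre_name, t.num_mov DESC, t.actor_id
"""


def select_top_n_actors(cur, n):
    """Run the top-N query on an open cursor and return rows with a header."""
    # int(n) rejects non-numeric input; the driver substitutes the value.
    cur.execute(TOP_N_SQL, (int(n),))
    header = ("genreName", "actorId", "numberOfMovies")
    return [header] + [tuple(row) for row in cur.fetchall()]
```

With the asker's own `connection()` helper this would be called as `select_top_n_actors(connection().cursor(), 4)`. Since the query only reads data, no `commit()` should be needed.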