I have some sample MySQL data, shown in the table below:
main_cat | sub_cat | number | org_id
Career | school | 5 | A
Career | college | 3 | A
Career | higher | 4 | A
Job | Blr | 6 | A
Job | Hyd | 11 | A
Job | Chennai | 12 | A
Career | school | 15 | B
Career | college | 30 | B
Career | higher | 5 | B
Job | Blr | 5 | B
Career | college | 8 | C
Job | Chennai | 4 | C
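For each organization I want to print the top 2 main_cat values. For each of those top 2 main_cat values, I want to print the top 2 sub_cat values for that organization. So the result should show, per organization, the top 2 main_cat and, under each of them, the top 2 sub_cat.
Please help me.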
Answer 0: (score: 0)
You can use pandas and load the query result into a DataFrame with read_sql:
import pandas as pd
df = pd.read_sql(query, connection)  # read_sql takes the SQL string first, then the connection
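# Sort by number (largest first), then keep the first 2 rows of each (org_id, main_cat) group
result = df.sort_values('number', ascending=False).groupby(['org_id', 'main_cat']).head(2)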
The variable connection is your connection to the database, and query is your SELECT string.
Answer 1: (score: 0)
For grouping, Python provides itertools.groupby(), which groups sorted input by a given key function.
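The "sorted input" requirement matters because groupby() only merges adjacent items with equal keys. A minimal sketch of that behaviour, using a few made-up (main_cat, org_id) rows that are not part of the question's data:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (main_cat, org_id)
rows = [("Career", "A"), ("Job", "B"), ("Career", "A"), ("Job", "A")]

# Unsorted input: org "A" is split into two groups because a "B" row sits in between.
unsorted_keys = [key for key, _ in groupby(rows, key=itemgetter(1))]

# Input sorted by the same key: every org appears exactly once.
sorted_rows = sorted(rows, key=itemgetter(1))
sorted_keys = [key for key, _ in groupby(sorted_rows, key=itemgetter(1))]

print(unsorted_keys)  # ['A', 'B', 'A']
print(sorted_keys)    # ['A', 'B']
```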
In this case the input needs to be sorted first by org_id, then by main_cat, and then by number in descending order. For example, if your data is a list like this:
data = [
['Career', 'school', 5, 'A'],
['Career', 'college', 3, 'A'],
['Career', 'higher', 4, 'A'],
['Job', 'Blr', 6, 'A'],
['Job', 'Hyd', 11, 'A'],
['Job', 'Chennai', 12, 'A'],
['Career', 'school', 15, 'B'],
['Career', 'college', 30, 'B'],
['Career', 'higher', 5, 'B'],
['Job', 'Blr', 5, 'B'],
['Career', 'college', 8, 'C'],
['Job', 'Chennai', 4, 'C']
]
then you would sort it like this:
data.sort(key = lambda x: (x[3], x[0], -x[2]))
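Alternatively, change your SQL statement to include ORDER BY org_id, main_cat, number DESC (the same order as the sort key above), so that the rows already come back from the database in the right order.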
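As a rough check of the database-side option: the clause that mirrors the sort key (x[3], x[0], -x[2]) is ORDER BY org_id, main_cat, number DESC. The sketch below uses the standard-library sqlite3 module with an in-memory database purely as a stand-in for MySQL, and the table name samples is made up for illustration.

```python
import sqlite3

data = [
    ('Career', 'school', 5, 'A'), ('Career', 'college', 3, 'A'),
    ('Career', 'higher', 4, 'A'), ('Job', 'Blr', 6, 'A'),
    ('Job', 'Hyd', 11, 'A'), ('Job', 'Chennai', 12, 'A'),
    ('Career', 'school', 15, 'B'), ('Career', 'college', 30, 'B'),
    ('Career', 'higher', 5, 'B'), ('Job', 'Blr', 5, 'B'),
    ('Career', 'college', 8, 'C'), ('Job', 'Chennai', 4, 'C'),
]

# In-memory SQLite database standing in for the real MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples (main_cat TEXT, sub_cat TEXT, number INTEGER, org_id TEXT)"
)
conn.executemany("INSERT INTO samples VALUES (?, ?, ?, ?)", data)

# Let the database do the sorting.
from_sql = conn.execute(
    "SELECT main_cat, sub_cat, number, org_id FROM samples "
    "ORDER BY org_id, main_cat, number DESC"
).fetchall()
conn.close()

# Same ordering done in Python with the sort key from the answer.
from_python = sorted(data, key=lambda x: (x[3], x[0], -x[2]))

print(from_sql == from_python)  # True
```

Either way, the rows reach groupby() already ordered by organization, then category, then number descending.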
Now you can group with groupby, and use islice to limit the number of results in each grouped category:
from itertools import groupby, islice
from operator import itemgetter
# already sorted data
data = [
['Career', 'school', 5, 'A'],
['Career', 'higher', 4, 'A'],
['Career', 'college', 3, 'A'],
['Job', 'Chennai', 12, 'A'],
['Job', 'Hyd', 11, 'A'],
['Job', 'Blr', 6, 'A'],
['Career', 'college', 30, 'B'],
['Career', 'school', 15, 'B'],
['Career', 'higher', 5, 'B'],
['Job', 'Blr', 5, 'B'],
['Career', 'college', 8, 'C'],
['Job', 'Chennai', 4, 'C']
]
data.sort(key = lambda x: (x[3], x[0], -x[2]))
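# Outer level: one group per organization
for org, by_org in groupby(data, key=itemgetter(3)):
    print("org:", org)
    # First 2 main_cat groups within this organization
    for cat, by_cat in islice(groupby(by_org, key=itemgetter(0)), 2):
        print("  cat:", cat)
        # First 2 sub_cat groups; rows are already sorted by number descending
        for subcat, by_subcat in islice(groupby(by_cat, key=itemgetter(1)), 2):
            print("    subcat:", subcat, " = ", list(by_subcat))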
Output:
org: A
  cat: Career
    subcat: school  =  [['Career', 'school', 5, 'A']]
    subcat: higher  =  [['Career', 'higher', 4, 'A']]
  cat: Job
    subcat: Chennai  =  [['Job', 'Chennai', 12, 'A']]
    subcat: Hyd  =  [['Job', 'Hyd', 11, 'A']]
org: B
  cat: Career
    subcat: college  =  [['Career', 'college', 30, 'B']]
    subcat: school  =  [['Career', 'school', 15, 'B']]
  cat: Job
    subcat: Blr  =  [['Job', 'Blr', 5, 'B']]
org: C
  cat: Career
    subcat: college  =  [['Career', 'college', 8, 'C']]
  cat: Job
    subcat: Chennai  =  [['Job', 'Chennai', 4, 'C']]
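Note that the loop above takes the first 2 main_cat groups in sort order (alphabetical), which is enough here because the sample only has two categories. If "top 2 main_cat" is meant as the two categories with the largest total number per organization, which is an assumption about the question and not something it states, a variation could look like the sketch below. It swaps groupby for plain dictionaries, and the function name top_categories is made up for illustration.

```python
from collections import defaultdict

data = [
    ['Career', 'school', 5, 'A'], ['Career', 'college', 3, 'A'],
    ['Career', 'higher', 4, 'A'], ['Job', 'Blr', 6, 'A'],
    ['Job', 'Hyd', 11, 'A'], ['Job', 'Chennai', 12, 'A'],
    ['Career', 'school', 15, 'B'], ['Career', 'college', 30, 'B'],
    ['Career', 'higher', 5, 'B'], ['Job', 'Blr', 5, 'B'],
    ['Career', 'college', 8, 'C'], ['Job', 'Chennai', 4, 'C'],
]

def top_categories(rows, n=2):
    """Return {org_id: [(main_cat, total, [(sub_cat, number), ...]), ...]}."""
    # Bucket the (sub_cat, number) pairs by org_id and main_cat.
    buckets = defaultdict(lambda: defaultdict(list))
    for main_cat, sub_cat, number, org_id in rows:
        buckets[org_id][main_cat].append((sub_cat, number))

    result = {}
    for org_id in sorted(buckets):
        # Rank main_cat by the sum of its numbers (assumed meaning of "top"), largest first.
        ranked = sorted(
            buckets[org_id].items(),
            key=lambda item: sum(num for _, num in item[1]),
            reverse=True,
        )[:n]
        result[org_id] = [
            (
                main_cat,
                sum(num for _, num in subs),
                # Keep the n sub_cat entries with the largest number.
                sorted(subs, key=lambda pair: pair[1], reverse=True)[:n],
            )
            for main_cat, subs in ranked
        ]
    return result

top = top_categories(data)
for org_id, cats in top.items():
    print(org_id, cats)
```

With the sample data this puts Job (total 29) ahead of Career (total 12) for org A, whereas the alphabetical grouping lists Career first.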