Get results grouped by the same field value in SQLAlchemy

Time: 2018-06-27 10:14:35

Tags: python mysql json sqlalchemy

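I have 3 MySQL tables: companies, activities, and association_company_activities. The association_company_activities table links companies and activities, so it has 3 fields: an auto-increment ID, company_id as a foreign key, and activity_id as a foreign key. I have this query: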

SELECT
    C.id,
    C.label,
    A.name
FROM
    companies C
JOIN activities A JOIN association_company_activities S ON
    C.identifier = S.company_id AND A.identifier = S.activity_id
ORDER BY
    C.label

Since I am using a Python script, the query above corresponds to the following (I also return the result as JSON):

def search(args, items):
    # Keep only the items whose fields match every query-string parameter
    for param, value in args.items():
        items = [v for v in items if param in v and v[param] == value]
    return items

A = aliased(model.Activity, name='A')
S = aliased(model.AssocCompaniesActivities, name='S')
C = aliased(model.Company, name='C')
activity_area = A.name.label("activities_area")

results = session.query(C.id, C.label, activity_area) \
                 .join(S) \
                 .join(A) \
                 .filter(C.identifier == S.company_id) \
                 .filter(A.identifier == S.activity_id) \
                 .order_by(C.label) \
                 .all()
session.close()
args = request.args.to_dict()
results = search(args, results)
return jsonify({"results": results})

This gives me:

{
  "results": [
    {
      "activities_area": "luxury", 
      "id": "company1", 
      "label": "first company"
    }, 
    {
      "activities_area": "banks", 
      "id": "company2", 
      "label": "second company"
    }, 
    {
      "activities_area": "paper", 
      "id": "company2", 
      "label": "second company"
    }
  ]
}

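I would like each company that has several activities to be returned only once, with its activities_area values collected into an array, like this: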

{
  "results": [
    {
      "activities_area": "luxury", 
      "id": "company1", 
      "label": "first company"
    }, 
    {
      "activities_area": [
           "paper",
           "banks"
      ],
      "id": "company2", 
      "label": "second company"
    }
  ]
}

Models:

class Companies(Base):
    __tablename__ = 'companies'

    identifier = Column(Integer, primary_key=True)
    id = Column(String)
    label = Column(String)

    def __init__(self, id, label):
        self.id = id
        self.label = label

class Activities(Base):
    __tablename__ = 'activities_area'

    identifier = Column(Integer, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name

class AssocCompaniesActivities(Base):
    __tablename__ = 'assoc_companies_activities'

    identifier = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey("companies.identifier"), nullable=True)
    activities_area_id = Column(Integer, ForeignKey("activities_area.identifier"), nullable=True)

    def __init__(self, company_id , activities_area_id):
        self.company_id = company_id
        self.activities_area_id = activities_area_id

How can I do this?

1 answer:

Answer 0 (score: 2)

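Since you are already using the ORM features, this is effectively similar to SQLAlchemy with multiple Many to Many Relationships, and there is a tutorial on this scenario. There is one subtle difference: you want to use a class for AssocCompaniesActivities.

The import preamble and other setup needed to make the original code work: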

from sqlalchemy import *
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(bind=engine)
Base = declarative_base()

Here are the updated class definitions:

from sqlalchemy.orm import relationship

class Companies(Base):
    __tablename__ = 'companies'

    identifier = Column(Integer, primary_key=True)
    id = Column(String)
    label = Column(String)

    activities_area = relationship("Activities", secondary='assoc_companies_activities', back_populates='companies')

    def __init__(self, id, label):
        self.id = id 
        self.label = label

class Activities(Base):
    __tablename__ = 'activities_area'

    identifier = Column(Integer, primary_key=True)
    name = Column(String)

    companies = relationship("Companies", secondary='assoc_companies_activities', back_populates='activities_area')

    def __init__(self, name):
        self.name = name

class AssocCompaniesActivities(Base):
    __tablename__ = 'assoc_companies_activities'

    identifier = Column(Integer, primary_key=True)
    company_id = Column(Integer, ForeignKey("companies.identifier"), nullable=True)
    activities_area_id = Column(Integer, ForeignKey("activities_area.identifier"), nullable=True)
    # Should declare primary key for company_id and activities_area_id,
    # or better yet, just create a simple un-mapped table like in the docs

    def __init__(self, company_id , activities_area_id):
        self.company_id = company_id
        self.activities_area_id = activities_area_id

Set up the database session:

session = Session()
Base.metadata.create_all(engine)

Finally, add the sample data:

company1 = Companies(id='company1', label='first company')
company2 = Companies(id='company2', label='second company')
banks_activity = Activities('banks')
luxury_activity = Activities('luxury')
paper_activity = Activities('paper')
session.add(company1)
session.add(company2)
session.add(banks_activity)
session.add(luxury_activity)
session.add(paper_activity)
session.commit()

The relationships (added after the commit, once the companies and activities have been assigned their IDs):

company1_luxury = AssocCompaniesActivities(company1.identifier, luxury_activity.identifier)
company2_banks = AssocCompaniesActivities(company2.identifier, banks_activity.identifier)
company2_paper = AssocCompaniesActivities(company2.identifier, paper_activity.identifier)
session.add(company1_luxury)
session.add(company2_banks)
session.add(company2_paper)
session.commit()

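With the models updated, we can now use one of the relationship loading techniques to run a joined query with eager loading. This is the part that differs from the other question, and there are not many examples that combine a many-to-many relationship with this particular technique to produce a joined query. We can use joinedload to get what you want: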

from sqlalchemy.orm import joinedload

companies = session.query(Companies).options(
    # Pass the mapped attribute; SQLAlchemy 2.0 no longer accepts string names here
    joinedload(Companies.activities_area).load_only(Activities.name)).all()

Which generates this query:

SELECT companies.identifier AS companies_identifier, companies.id AS companies_id, companies.label AS companies_label, activities_area_1.identifier AS activities_area_1_identifier, activities_area_1.name AS activities_area_1_name 
FROM companies LEFT OUTER JOIN (assoc_companies_activities AS assoc_companies_activities_1 JOIN activities_area AS activities_area_1 ON activities_area_1.identifier = assoc_companies_activities_1.activities_area_id) ON companies.identifier = assoc_companies_activities_1.company_id
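
To see what that join returns before the ORM touches it, here is a minimal sketch that runs a flattened equivalent by hand with the standard-library sqlite3 module. The schema and sample rows are copied from the models and data above; the two chained LEFT JOINs are my simplification of the nested join in the generated SQL, so treat it as an illustration rather than the exact statement SQLAlchemy emits.

```python
import sqlite3

# In-memory database with the same three tables as the models above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (identifier INTEGER PRIMARY KEY, id TEXT, label TEXT);
    CREATE TABLE activities_area (identifier INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE assoc_companies_activities (
        identifier INTEGER PRIMARY KEY,
        company_id INTEGER REFERENCES companies(identifier),
        activities_area_id INTEGER REFERENCES activities_area(identifier)
    );
    INSERT INTO companies (id, label) VALUES
        ('company1', 'first company'), ('company2', 'second company');
    INSERT INTO activities_area (name) VALUES ('banks'), ('luxury'), ('paper');
    -- company1 -> luxury, company2 -> banks, company2 -> paper
    INSERT INTO assoc_companies_activities (company_id, activities_area_id)
        VALUES (1, 2), (2, 1), (2, 3);
""")

# Flattened form of the eager-loading join: one row per (company, activity) pair.
rows = conn.execute("""
    SELECT c.id, c.label, a.name
    FROM companies AS c
    LEFT JOIN assoc_companies_activities AS s ON s.company_id = c.identifier
    LEFT JOIN activities_area AS a ON a.identifier = s.activities_area_id
    ORDER BY c.identifier, a.name
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

The raw result still contains company2 twice, once per activity. With joinedload, the ORM merges rows that share the same companies primary key into a single Companies object and fills its activities_area collection from them, which is why the query above yields two objects rather than three rows.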

Finally, convert the result into the desired data structure:

import json

print(json.dumps([{
    'id': c.id,
    'label': c.label,
    'activities_area': [a.name for a in c.activities_area]
} for c in companies], indent=4))

Output:

[
    {
        "id": "company1",
        "label": "first company",
        "activities_area": [
            "luxury"
        ]
    },
    {
        "id": "company2",
        "label": "second company",
        "activities_area": [
            "banks",
            "paper"
        ]
    }
]
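
If you would rather keep the original flat query and not change the models, the rows can also be grouped in plain Python after the fact. The following is only a sketch of that alternative, not part of the approach above: group_activities is a hypothetical helper name, and it assumes each row has already been turned into a dict shaped like the JSON in the question. It also collapses a single activity to a bare string, which matches the output the question asks for but differs from the always-a-list output shown above.

```python
from collections import OrderedDict


def group_activities(rows):
    """Merge rows that share the same company into one entry (hypothetical helper)."""
    grouped = OrderedDict()  # keeps companies in first-seen order
    for row in rows:
        key = (row["id"], row["label"])
        grouped.setdefault(key, []).append(row["activities_area"])
    return [
        {
            "id": company_id,
            "label": label,
            # A single activity stays a string, several become a list,
            # mirroring the desired output in the question.
            "activities_area": names[0] if len(names) == 1 else names,
        }
        for (company_id, label), names in grouped.items()
    ]


# Sample rows copied from the question's current JSON output.
flat_rows = [
    {"activities_area": "luxury", "id": "company1", "label": "first company"},
    {"activities_area": "banks", "id": "company2", "label": "second company"},
    {"activities_area": "paper", "id": "company2", "label": "second company"},
]
results = group_activities(flat_rows)
```

A field that is sometimes a string and sometimes a list tends to be awkward for whoever consumes the JSON, so always returning a list, as in the joinedload version, is probably the easier contract to work with.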