Writing a SQL query that references the same table twice

Date: 2013-12-04 13:04:14

Tags: python sqlite database-design orm sqlalchemy

Consider the following table definitions:

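from sqlalchemy import Column, ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import declarative_base, relationship  # SQLAlchemy 1.4+

Base = declarative_base()

class MCastSession(Base):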
  __tablename__ = 'mcast_session'
  id = Column(Integer, primary_key=True)
  ip = Column(Integer)
  port = Column(Integer)
  __table_args__ = ( UniqueConstraint('ip', 'port'), )

class Topic(Base):
  __tablename__ = 'topic'
  id = Column(Integer, primary_key=True)
  name = Column(String, unique=True)
  mcast_session_id = Column(Integer, ForeignKey('mcast_session.id'))
  mcast_session = relationship('MCastSession')

class Host(Base):
  __tablename__ = 'host'
  id = Column(Integer, primary_key=True)
  name = Column(String, unique=True)

class Subscriber(Base):
  __tablename__ = 'subscriber'
  id = Column(Integer, primary_key=True)
  topic_id = Column(Integer, ForeignKey('topic.id'))
  topic = relationship('Topic')
  host_id = Column(Integer, ForeignKey('host.id'))
  host = relationship('Host')
  __table_args__ = ( UniqueConstraint('topic_id', 'host_id'), )

Example data:
Topic Session
T1    IP1:port1
T2    IP1:port2
T3    IP1:port2
T4    IP2:port1

Topic Host
T1    H1
T2    H1
T4    H2

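I want to write a query that returns all hosts that subscribe to a multicast IP but do not handle every topic on that IP. In the example above, H1 has T1 and therefore subscribes to IP1, but it does not have T3, which is also on IP1. So the query should return H1. H2 handles all the topics (T4) of the IP it subscribes to (IP2), so H2 should not be in the result. How do I write this query?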

2 answers:

Answer 0 (score: 1)

The following query produces the desired result:

select distinct host1.name host_name
  from Subscriber Subscriber1
 inner join host host1
    on host1.id = Subscriber1.Host_Id
 inner join topic topic1
    on topic1.id = Subscriber1.Topic_Id
 inner join mcast_session mcast_session1
    on mcast_session1.id = topic1.mcast_session_id
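 -- topics this host subscribes to on the row's IP
 where (select count(*)
          from subscriber
         inner join topic
            on topic.id = subscriber.topic_id
         inner join mcast_session
            on mcast_session.id = topic.mcast_session_id
         where subscriber.host_id = host1.id
           and mcast_session.ip = mcast_session1.ip) !=
       -- all topics on the row's IP
       (select count(*)
          from topic
         inner join mcast_session
            on topic.mcast_session_id = mcast_session.id
         where mcast_session.ip = mcast_session1.ip)

To see the logic behind it, the following query may help:

select  host1.name host_name,
       topic1.name topic_name,
       mcast_session1.ip,
       mcast_session1.port,
       (select count(*)
          from subscriber
         inner join topic
            on topic.id = subscriber.topic_id
         inner join mcast_session
            on mcast_session.id = topic.mcast_session_id
         where subscriber.host_id = host1.id
           and mcast_session.ip = mcast_session1.ip) host_to_topic_registeration,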
       (select count(*)
          from topic
         inner join mcast_session
            on topic.mcast_session_id = mcast_session.id
         where mcast_session.ip = mcast_session1.ip
           ) ip_topic_count
  from Subscriber Subscriber1
 inner join host host1
    on host1.id = Subscriber1.Host_Id
 inner join topic topic1
    on topic1.id = Subscriber1.Topic_Id
 inner join mcast_session mcast_session1
    on mcast_session1.id = topic1.mcast_session_id

sqlfiddle sample
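
As a rough sketch of the correlated-subquery idea, the snippet below compares, for each subscription row, the number of topics that host subscribes to on the row's IP against the total number of topics on that IP. It runs on Python's built-in sqlite3 module. The text values for ip and port, the short table aliases, and the extra host H3 (subscribed to every IP1 topic) are assumptions added for illustration and are not part of the question.

```python
import sqlite3

# In-memory copy of the question's sample data.
# Assumption: ip/port are stored as text labels here for readability.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip TEXT, port TEXT,
                            UNIQUE (ip, port));
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                    mcast_session_id INTEGER REFERENCES mcast_session(id));
CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE subscriber (id INTEGER PRIMARY KEY,
                         topic_id INTEGER REFERENCES topic(id),
                         host_id INTEGER REFERENCES host(id),
                         UNIQUE (topic_id, host_id));
INSERT INTO mcast_session (id, ip, port) VALUES
    (1, 'IP1', 'port1'), (2, 'IP1', 'port2'), (3, 'IP2', 'port1');
INSERT INTO topic (id, name, mcast_session_id) VALUES
    (1, 'T1', 1), (2, 'T2', 2), (3, 'T3', 2), (4, 'T4', 3);
-- H3 is hypothetical: it subscribes to all three IP1 topics.
INSERT INTO host (id, name) VALUES (1, 'H1'), (2, 'H2'), (3, 'H3');
INSERT INTO subscriber (topic_id, host_id) VALUES
    (1, 1), (2, 1), (4, 2), (1, 3), (2, 3), (3, 3);
""")

QUERY = """
SELECT DISTINCT h.name
FROM   subscriber s
JOIN   host h          ON h.id = s.host_id
JOIN   topic t         ON t.id = s.topic_id
JOIN   mcast_session m ON m.id = t.mcast_session_id
WHERE  -- topics this host subscribes to on the row's IP
       (SELECT count(*)
        FROM   subscriber s2
        JOIN   topic t2         ON t2.id = s2.topic_id
        JOIN   mcast_session m2 ON m2.id = t2.mcast_session_id
        WHERE  s2.host_id = h.id AND m2.ip = m.ip)
    != -- all topics on the row's IP
       (SELECT count(*)
        FROM   topic t3
        JOIN   mcast_session m3 ON m3.id = t3.mcast_session_id
        WHERE  m3.ip = m.ip)
ORDER BY h.name
"""

partial_hosts = [name for (name,) in conn.execute(QUERY)]
print(partial_hosts)
```

With this data only H1 should come back: H2 covers the single topic on IP2, and the hypothetical H3 covers all three topics on IP1.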

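Answer 1 (score: 0)

Another way to look at the desired result is to apply the following logic:

  1. For each IP address, count the total number of Topics broadcast on it.
  2. For each Host, count the number of Topics it subscribes to, grouped by IP address.
  3. The desired Hosts are those whose number of Topics for an IP address is not equal to (in fact, is smaller than) the total number of Topics on that IP address.
  4. The SA (SQLAlchemy) code below should give you the desired Host instances: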

    from sqlalchemy import and_, func

    # subquery to get number of topics per IP
    subq_ip_topics = (session.query(
            MCastSession.ip.label("mcast_session_ip"),
            func.count(Topic.id).label("num_topics")
        )
        .join(Topic)
        .group_by(MCastSession.ip)
        ).subquery().alias("ip_topics")
    
    # subquery to get number of topics per host per ip
    subq_host_ip_topics = (session.query(
            Host.id.label("host_id"),
            MCastSession.ip.label("mcast_session_ip"),
            func.count(Topic.id).label("num_topics")
        )
        .join(Subscriber)
        .join(Topic)
        .join(MCastSession)
        .group_by(Host.id, MCastSession.ip)
        ).subquery().alias("host_ip_topics")
    
    # final query: get those Hosts where results on both sub-queries do not match
    query = (session.query(Host)
            .join(subq_host_ip_topics, Host.id == subq_host_ip_topics.c.host_id)
            .join(subq_ip_topics, and_(
                subq_host_ip_topics.c.mcast_session_ip == subq_ip_topics.c.mcast_session_ip,
                subq_host_ip_topics.c.num_topics != subq_ip_topics.c.num_topics
                ))
            )
    

    This generates the following SQL (for SQLite):

    SELECT  host.id AS host_id, host.name AS host_name
    
    FROM    host
    
    JOIN   (SELECT  host.id AS host_id,
                    mcast_session.ip AS mcast_session_ip,
                    count(topic.id) AS num_topics
            FROM    host
            JOIN    subscriber
                ON  host.id = subscriber.host_id
            JOIN    topic
                ON  topic.id = subscriber.topic_id
            JOIN    mcast_session
                ON  mcast_session.id = topic.mcast_session_id
            GROUP BY host.id, mcast_session.ip
           ) AS host_ip_topics
        ON  host.id = host_ip_topics.host_id
    
    JOIN   (SELECT  mcast_session.ip AS mcast_session_ip,
                    count(topic.id) AS num_topics
            FROM    mcast_session
            JOIN    topic
                ON  mcast_session.id = topic.mcast_session_id
            GROUP BY mcast_session.ip
           ) AS ip_topics
        ON  host_ip_topics.mcast_session_ip = ip_topics.mcast_session_ip
        AND host_ip_topics.num_topics != ip_topics.num_topics
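
To check the generated SQL against the question's sample data without any ORM setup, here is a small sketch using Python's built-in sqlite3 module. The text values for ip and port and the in-memory database are assumptions for illustration; the SELECT itself is the statement shown above.

```python
import sqlite3

# In-memory copy of the question's sample data.
# Assumption: ip/port are stored as text labels here for readability.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip TEXT, port TEXT,
                            UNIQUE (ip, port));
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                    mcast_session_id INTEGER REFERENCES mcast_session(id));
CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE subscriber (id INTEGER PRIMARY KEY,
                         topic_id INTEGER REFERENCES topic(id),
                         host_id INTEGER REFERENCES host(id),
                         UNIQUE (topic_id, host_id));
INSERT INTO mcast_session (id, ip, port) VALUES
    (1, 'IP1', 'port1'), (2, 'IP1', 'port2'), (3, 'IP2', 'port1');
INSERT INTO topic (id, name, mcast_session_id) VALUES
    (1, 'T1', 1), (2, 'T2', 2), (3, 'T3', 2), (4, 'T4', 3);
INSERT INTO host (id, name) VALUES (1, 'H1'), (2, 'H2');
INSERT INTO subscriber (topic_id, host_id) VALUES (1, 1), (2, 1), (4, 2);
""")

GENERATED_SQL = """
SELECT  host.id AS host_id, host.name AS host_name
FROM    host
JOIN   (SELECT  host.id AS host_id,
                mcast_session.ip AS mcast_session_ip,
                count(topic.id) AS num_topics
        FROM    host
        JOIN    subscriber    ON host.id = subscriber.host_id
        JOIN    topic         ON topic.id = subscriber.topic_id
        JOIN    mcast_session ON mcast_session.id = topic.mcast_session_id
        GROUP BY host.id, mcast_session.ip
       ) AS host_ip_topics
    ON  host.id = host_ip_topics.host_id
JOIN   (SELECT  mcast_session.ip AS mcast_session_ip,
                count(topic.id) AS num_topics
        FROM    mcast_session
        JOIN    topic ON mcast_session.id = topic.mcast_session_id
        GROUP BY mcast_session.ip
       ) AS ip_topics
    ON  host_ip_topics.mcast_session_ip = ip_topics.mcast_session_ip
    AND host_ip_topics.num_topics != ip_topics.num_topics
"""

# H1 has 2 of the 3 topics on IP1; H2 has the only topic on IP2.
result = conn.execute(GENERATED_SQL).fetchall()
print(result)
```

Note that in plain SQL a host that is incomplete on several IPs would appear once per such IP, so a `DISTINCT` may be worth adding when running the statement outside the ORM.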
    

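    Now, if you want to use the same table more than once in a query, you can use aliased. The code below returns a list of tuples (MCastSession, NNN), where NNN is the number of MCastSession objects that have the same IP: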

    from sqlalchemy.orm import aliased

    # second reference to the same table, under a different alias
    aliased_MCastSession = aliased(MCastSession, name="MCastSession2")
    qry = (session.query(
            MCastSession,
            func.count(aliased_MCastSession.id).label("number_with_same_ip")
        )
        .filter(MCastSession.ip == aliased_MCastSession.ip)
        .group_by(MCastSession)
        )
    

    But I did not need to do this for the proposed solution, because I used subqueries.
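
For comparison, referencing the same table twice in plain SQL comes down to giving each reference its own alias. The sketch below runs such a self-join through Python's built-in sqlite3 module; the aliases m1 and m2 and the text values for ip and port are assumptions for illustration, not output captured from SQLAlchemy.

```python
import sqlite3

# Assumption: ip/port are stored as text labels here for readability.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mcast_session (id INTEGER PRIMARY KEY, ip TEXT, port TEXT,
                            UNIQUE (ip, port));
INSERT INTO mcast_session (id, ip, port) VALUES
    (1, 'IP1', 'port1'), (2, 'IP1', 'port2'), (3, 'IP2', 'port1');
""")

# m1 is the session being reported; m2 ranges over sessions sharing its IP
# (including m1 itself, so every count is at least 1).
rows = conn.execute("""
    SELECT m1.id, m1.ip, m1.port, count(m2.id) AS number_with_same_ip
    FROM   mcast_session AS m1
    JOIN   mcast_session AS m2 ON m1.ip = m2.ip
    GROUP BY m1.id, m1.ip, m1.port
    ORDER BY m1.id
""").fetchall()

for row in rows:
    print(row)
```

Both IP1 sessions report a count of 2 and the single IP2 session reports 1.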