Question

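from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///nwtopology.db', echo=False)
Base = declarative_base()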

class SourcetoPort(Base):
    """Maps a source address to the switch port it was seen on."""
    __tablename__ = 'source_to_port'
    id = Column(Integer, primary_key=True)
    port_no        = Column(Integer)
    src_address    = Column(String)

    def __init__(self, src_address, port_no):
        """Store the source address and its incoming port."""
        self.src_address = src_address
        self.port_no     = port_no




Session = sessionmaker(bind=engine)
session = Session()
# The lines below run inside the controller's packet-in handler.
self.mac_to_port[packet.src] = packet_in.in_port
if self.matrix.get((packet.src, packet.dst)) is None:
    self.matrix[(packet.src, packet.dst)] = 0
    print("found a new flow")
    # create a database entry with address and port
    entry = SourcetoPort(src_address=str(packet.src), port_no=packet_in.in_port)
    # add the record to the session object
    session.add(entry)
    # commit the record to the database
    session.commit()
self.matrix[(packet.src, packet.dst)] += 1
print("incrementing flow count")
# if self.mac_to_port.get(packet.dst) != None:
if session.query(SourcetoPort).filter_by(src_address=str(packet.dst)).count():
    # do stuff if the flow information is already in the database.
    pass

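I am new to Python, SQLAlchemy and the like. The code above is the relevant part of a network controller, and it is called whenever a new packet comes in. My question is about this line: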

if session.query(SourcetoPort).filter_by(src_address=str(packet.dst)).count():

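Is this the correct and most efficient way to find out whether src_address already exists in the database? Can someone suggest a better method? Relying on a positive count does not seem very rigorous.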

Answer 1

A couple of suggestions:

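1) Make sure your table has an index on the src_address field. If you are using SQLAlchemy to create the schema, the index can be added with this simple change to the table definition. (For details, see the "Describing Databases with MetaData: Indexes" part of the SQLAlchemy manual.)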

class SourcetoPort(Base):
    """Maps a source address to the switch port it was seen on."""
    __tablename__ = 'source_to_port'
    id = Column(Integer, primary_key=True)
    port_no        = Column(Integer)
    src_address    = Column(String, index=True)  # index speeds up lookups by address
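
To check that index=True really produces an index, here is a minimal sketch. It assumes SQLAlchemy 1.4 or later and uses an in-memory SQLite database instead of nwtopology.db; both are assumptions made for illustration, not part of the question's setup. It creates the table and then asks the inspector which indexes exist.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class SourcetoPort(Base):
    __tablename__ = 'source_to_port'
    id = Column(Integer, primary_key=True)
    port_no = Column(Integer)
    src_address = Column(String, index=True)


# Assumption: an in-memory database, so the sketch leaves no file behind.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

# Ask the database which indexes were created for the table.
indexes = inspect(engine).get_indexes('source_to_port')
index_names = [ix['name'] for ix in indexes]
indexed_columns = [ix['column_names'] for ix in indexes]
print(index_names, indexed_columns)
```

Note that create_all() skips tables that already exist, so if source_to_port is already present in the database file, the index would have to be created separately.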

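2) To check whether any record with src_address = str(packet.dst) exists, there is another approach that uses EXISTS. It does not have to scan all the records with that src_address; it returns as soon as it finds the first record with that field value.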

if session.query(SourcetoPort).filter_by(src_address=str(packet.dst)).count():
    # do stuff if the flow information is already in the database.
    pass

Replace the count query with an exists query:

from sqlalchemy.sql.expression import exists
# scalar() is truthy only when at least one matching row exists
if session.query(exists().where(SourcetoPort.src_address == str(packet.dst))).scalar():
    # do stuff if the flow information is already in the database.
    pass
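
For reference, a self-contained sketch of the EXISTS check follows. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the helper name address_known and the sample addresses are hypothetical and only serve the illustration.

```python
from sqlalchemy import Column, Integer, String, create_engine, exists
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class SourcetoPort(Base):
    __tablename__ = 'source_to_port'
    id = Column(Integer, primary_key=True)
    port_no = Column(Integer)
    src_address = Column(String, index=True)


def address_known(session, address):
    """Return True if at least one row has this src_address (hypothetical helper)."""
    query = session.query(exists().where(SourcetoPort.src_address == address))
    # SQLite reports the EXISTS result as 0 or 1, so coerce it to a bool.
    return bool(query.scalar())


# Assumption: an in-memory database, so the sketch leaves no file behind.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(SourcetoPort(src_address='00:00:00:00:00:01', port_no=1))
session.commit()

known = address_known(session, '00:00:00:00:00:01')
unknown = address_known(session, '00:00:00:00:00:02')
print(known, unknown)
```

Because the result is coerced to a bool, the caller can write "if address_known(session, str(packet.dst)):" without comparing against None or relying on a count.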

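3) I am not sure what task you are solving with this script. I hope you are not starting a new Python script every time a new network packet arrives at the network interface. Starting a new Python interpreter process takes a noticeable amount of time, and thousands of packets may arrive every second. In the latter case, I would consider running such a network controller program as a daemon and caching all the information in memory. I would also implement the database operations to run asynchronously in a separate thread, or even a thread pool, so that they do not block the main thread that controls the network flow.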
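
One possible shape for the caching and background-write idea is sketched below, using only the standard library. The class PortCache and the persist callable are hypothetical names introduced for illustration; in a real controller, persist would perform the database insert.

```python
import queue
import threading


class PortCache(object):
    """In-memory address-to-port table with writes handed to a worker thread."""

    def __init__(self, persist):
        # persist: hypothetical callable(src_address, port_no) doing the DB write
        self._ports = {}
        self._pending = queue.Queue()
        self._persist = persist
        worker = threading.Thread(target=self._drain)
        worker.daemon = True
        worker.start()

    def learn(self, src_address, port_no):
        """Record a mapping; queue a database write only for new addresses."""
        if src_address not in self._ports:
            self._pending.put((src_address, port_no))
        self._ports[src_address] = port_no

    def lookup(self, address):
        """Return the known port or None, without touching the database."""
        return self._ports.get(address)

    def flush(self):
        """Block until every queued write has been handled."""
        self._pending.join()

    def _drain(self):
        # Runs in the worker thread, so slow writes never stall learn().
        while True:
            src_address, port_no = self._pending.get()
            try:
                self._persist(src_address, port_no)
            except Exception as exc:
                print("persist failed: %s" % exc)
            finally:
                self._pending.task_done()


written = []
cache = PortCache(persist=lambda addr, port: written.append((addr, port)))
cache.learn('00:00:00:00:00:01', 1)
cache.learn('00:00:00:00:00:01', 1)  # already cached, so no second write
cache.learn('00:00:00:00:00:02', 2)
cache.flush()
```

With SQLAlchemy, the persist callable would open its own session inside the worker thread, since a Session is not meant to be shared across threads. The packet handler then only calls learn() and lookup(), both of which work on the in-memory dictionary.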

Querying data in a SQLAlchemy database
