OOP Python: where should I instantiate the Cassandra and Elasticsearch clusters?

Time: 2017-02-13 15:29:20

Tags: python oop elasticsearch cassandra instantiation

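I have an object that interacts a lot with Elasticsearch and Cassandra, but I don't know where to instantiate my Cassandra and Elasticsearch sessions. Should I create them in my main code and pass the sessions to my methods as parameters: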

from cassandra.cluster import Cluster
from elasticsearch import Elasticsearch

cassandra_cluster = Cluster()
session = cassandra_cluster.connect()
es = Elasticsearch()

class Article:

    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler, fill in the fields, and return the object
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and return the object
        return obj

    def save_to_cassandra(self, session):
        # Save the object into cassandra using the session passed in
        session.execute(.....)

    def save_to_elasticsearch(self, index_name, es):
        # Save the object into elasticsearch using the client passed in
        es.index(index=index_name, ...)

    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra(session)
article.save_to_elasticsearch("cnn", es)

Or should I instantiate my Cassandra and Elasticsearch sessions inside the class, as class attributes:

class Article:

    cassandra_cluster = Cluster()
    session = cassandra_cluster.connect()
    es = Elasticsearch()
    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler, fill in the fields, and return the object
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and return the object
        return obj

    def save_to_cassandra(self):
        # Save the object into cassandra through the shared class-level session
        self.session.execute(.....)

    def save_to_elasticsearch(self):
        # Save the object into elasticsearch through the shared class-level client
        self.es.index(....)

    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra()
article.save_to_elasticsearch()

1 answer:

Answer 0 (score: 2)

Based on their documentation and some of the examples here: http://www.datastax.com/dev/blog/datastax-python-driver-multiprocessing-example-for-improved-bulk-data-throughput

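I would go with your second approach. They mention that the session is just a context manager for closing connections, and their query manager example exposes the cluster and session as class attributes.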
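As a rough illustration of the class-attribute idea, here is a minimal sketch that connects lazily on first use instead of at class-definition time. `FakeSession`, `get_session`, `connect`, and the `articles` table are hypothetical names introduced only so the example runs without a cluster; in real code `connect` could return `Cluster().connect()`.

```python
class FakeSession:
    """Stand-in for a Cassandra session (assumption): records statements."""

    def __init__(self):
        self.executed = []

    def execute(self, statement, parameters=None):
        self.executed.append((statement, parameters))


class Article:
    document_type = "cnn_article"
    _session = None  # shared by every Article instance

    def __init__(self, author="", url=""):
        self.author = author
        self.url = url

    @staticmethod
    def connect():
        # Hypothetical hook: real code could return Cluster().connect() here.
        return FakeSession()

    @classmethod
    def get_session(cls):
        # Connect on first use, then reuse the same session for all instances.
        if cls._session is None:
            cls._session = cls.connect()
        return cls._session

    def save_to_cassandra(self):
        # The table and column names are made up for this sketch.
        self.get_session().execute(
            "INSERT INTO articles (url, author) VALUES (%s, %s)",
            (self.url, self.author),
        )
```

Every `Article` then shares one session, and importing the module no longer opens a connection as a side effect.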

I think both approaches work, but if you ever want to use multiprocessing, it will probably be slightly easier with the latter one.
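To show why class attributes can make multiprocessing easier, here is a hedged sketch in which each worker process sets its own session once through a pool initializer. `FakeSession`, `init_worker`, `save_url`, `save_all`, and the `articles` table are assumptions made for illustration, not part of the driver; real code would create the session with `Cluster().connect()` inside `init_worker`.

```python
import multiprocessing
import os


class FakeSession:
    """Stand-in for a Cassandra session (assumption, so the sketch runs offline)."""

    def execute(self, statement, parameters=None):
        # Return the worker's pid so the caller can see which process ran it.
        return (os.getpid(), statement, parameters)


class Article:
    session = None  # set once per worker process by init_worker()

    def __init__(self, url):
        self.url = url

    def save_to_cassandra(self):
        # The table name is made up for this sketch.
        return self.session.execute(
            "INSERT INTO articles (url) VALUES (%s)", (self.url,)
        )


def init_worker():
    # Runs once in each worker; real code might call Cluster().connect() here.
    Article.session = FakeSession()


def save_url(url):
    # Executed inside a worker, using that worker's own session.
    return Article(url).save_to_cassandra()


def save_all(urls, processes=2):
    # Each worker process calls init_worker() before handling any URL.
    with multiprocessing.Pool(processes=processes, initializer=init_worker) as pool:
        return pool.map(save_url, urls)
```

Calling `save_all([...])` from under an `if __name__ == "__main__":` guard would fan the saves out over the workers. With the parameter-passing approach you would instead have to get a session into each worker yourself, since a live connection generally cannot be pickled and sent across processes.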