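I have an object that interacts heavily with Elasticsearch and Cassandra, but I don't know where to instantiate my Cassandra and Elasticsearch sessions. Should I create them in my top-level code and pass the sessions to my methods as arguments: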
from cassandra.cluster import Cluster
from elasticsearch import Elasticsearch

# Connections are created once at module level and handed to the methods.
cassandra_cluster = Cluster()
session = cassandra_cluster.connect()
es = Elasticsearch()

class Article:
    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler and fill the fields and return the object
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and return the object
        return obj

    def save_to_cassandra(self, session):
        # Save an object into cassandra
        session.execute(.....)

    def save_to_elasticsearch(self, index_name, es):
        # Save an object into elasticsearch
        es.index(index=index_name, ...)
    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra(session)
article.save_to_elasticsearch("cnn", es)
Or should I instantiate my Cassandra and Elasticsearch sessions inside the class, as class attributes:
from cassandra.cluster import Cluster
from elasticsearch import Elasticsearch

class Article:
    # Class attributes: created once and shared by every Article instance.
    cassandra_cluster = Cluster()
    session = cassandra_cluster.connect()
    es = Elasticsearch()
    document_type = "cnn_article"

    def __init__(self):
        self.author = ""
        self.url = ""
        ...

    @classmethod
    def from_crawl(cls, url):
        obj = cls()
        # Launch a crawler and fill the fields and return the object
        return obj

    @classmethod
    def from_elasticsearch(cls, elastic_search_document):
        obj = cls()
        # Read the response from elasticsearch and return the object
        return obj

    def save_to_cassandra(self):
        # Save an object into cassandra
        self.session.execute(.....)

    def save_to_elasticsearch(self):
        # Save an object into elasticsearch
        self.es.index(....)
    ...

article = Article.from_crawl("http://cnn.com/article/blabla")
article.save_to_cassandra()
article.save_to_elasticsearch()
Answer 0 (score: 2)
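Based on their documentation and some of the examples here: http://www.datastax.com/dev/blog/datastax-python-driver-multiprocessing-example-for-improved-bulk-data-throughput
I would go with your second approach. They mention that the session is just a context manager that closes the connections, and their QueryManager example exposes it as a class attribute.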
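To make the class-attribute idea concrete, here is a minimal sketch, not taken from the DataStax post. `FakeSession`, `get_session`, the `articles` table and its `url` column are all hypothetical names used so the example runs without a cluster; in real code the marked line would call `Cluster().connect()` instead. The sketch also connects on first use rather than in the class body, which is my own assumption about what you might want, since anything in a class body runs as soon as the module is imported.

```python
class FakeSession:
    """Hypothetical stand-in for a Cassandra session; it only records calls."""

    def __init__(self):
        self.statements = []

    def execute(self, statement, parameters=None):
        self.statements.append((statement, parameters))


class Article:
    _session = None  # class attribute: one session shared by all instances

    def __init__(self, url=""):
        self.url = url

    @classmethod
    def get_session(cls):
        # Connect on first use instead of at import time.
        if cls._session is None:
            cls._session = FakeSession()  # assumption: real code would use Cluster().connect()
        return cls._session

    def save_to_cassandra(self):
        # Table and column names are illustrative only.
        self.get_session().execute(
            "INSERT INTO articles (url) VALUES (%s)", (self.url,)
        )
```

Every `Article` ends up talking to the same session object, so callers never have to pass one around.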
I think both approaches work, but if you ever want to run this with multiprocessing, it will probably be slightly easier with the latter.
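As a rough illustration of why the class attribute helps with multiprocessing, the sketch below is loosely modeled on the pattern in the linked post: a pool initializer runs once in each worker process and stores that worker's own session on the class. The names `ArticleWriter`, `connect`, `FakeSession`, `save_all` and the `articles` table are hypothetical, and `FakeSession` stands in for a real session so the code runs without a cluster.

```python
from multiprocessing import Pool


class FakeSession:
    """Hypothetical stand-in for a Cassandra session."""

    def execute(self, statement, parameters=None):
        return (statement, parameters)


def connect():
    # Hypothetical factory; real code would return Cluster().connect().
    return FakeSession()


class ArticleWriter:
    session = None  # class attribute, set separately inside each worker

    @classmethod
    def setup(cls, session_factory):
        # Pool initializer: called once per worker process, so each
        # process opens and owns its own session.
        cls.session = session_factory()

    @classmethod
    def save(cls, url):
        # Table and column names are illustrative only.
        return cls.session.execute(
            "INSERT INTO articles (url) VALUES (%s)", (url,)
        )


def save_all(urls, processes=2):
    # Each worker runs ArticleWriter.setup(connect) before handling any URL.
    with Pool(processes=processes,
              initializer=ArticleWriter.setup,
              initargs=(connect,)) as pool:
        return pool.map(ArticleWriter.save, urls)
```

Call `save_all([...])` from under an `if __name__ == "__main__":` guard so it also works on platforms that start workers by spawning a fresh interpreter. With the first approach you would instead have to get a session into every worker yourself, since a live session cannot simply be passed to another process as an argument.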