Question

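I am trying to reuse a Cassandra cluster session across subsequent AWS Lambda function invocations. I have implemented this successfully in Java, but in Python reusing the session makes the Lambda invocation time out (the first invocation, which actually performs the initialization, works fine).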

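From the CloudWatch logs I can see that I get "Heartbeat failed for connection". It looks to me as if the session cannot communicate while idle and ends up in an inconsistent state from which it cannot recover the connection. In fact, setting idle_heartbeat_interval to a value longer or shorter than the function timeout has no effect on the result.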

Here is the structure of my Lambda function (some code omitted for brevity):

import logging
from cassandra_client import CassandraClient

logger = logging.getLogger()
logger.setLevel(logging.INFO)

#   State of the initialization phase
flag = False

#   Cassandra instance
cassandra = None

def handle_request(event, context):

    global flag, logger, cassandra

    logger.info('Function started. Flag: %s' % (str(flag), ))

    if not flag:
        logger.info('Initialization...')
        try:
            cassandra = CassandraClient()

            #   ...

            flag = True

        except Exception as e:
            logger.error('Cannot perform initialization: %s', e)
            exit(-1)

    #   Process the request ...
    return 'OK'
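
The flag-plus-global pattern above can be collapsed into a single lazily initialized module-level object. The following is a minimal sketch of that idea, not the asker's actual code: FakeClient is a hypothetical stand-in for CassandraClient so the snippet runs without a cluster, and get_client is an illustrative helper name.

```python
import logging
import threading

logger = logging.getLogger(__name__)

_client = None  # survives between invocations while the container stays warm
_client_lock = threading.Lock()


class FakeClient(object):
    """Hypothetical stand-in for CassandraClient (assumption, not a driver class)."""
    instances = 0

    def __init__(self):
        FakeClient.instances += 1


def get_client(factory=FakeClient):
    """Create the client on first use and return the cached one afterwards."""
    global _client
    with _client_lock:
        if _client is None:
            logger.info('Initialization...')
            _client = factory()
        return _client


def handle_request(event, context):
    client = get_client()
    # Process the request with `client` ...
    return 'OK'
```

This only removes the separate flag. It does not by itself address the heartbeat failures described above.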

For completeness, this is how I establish the connection to the cluster:

def _connect(self, seed_nodes=default_seed_nodes, port=default_port):
    self.cluster = Cluster(seed_nodes, port=port)
    self.metadata = self.cluster.metadata
    self.session = self.cluster.connect()
    # ...

Is there some driver configuration detail, or some Python Lambda behavior I am not aware of, that prevents the session from being reused?

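I do think AWS Lambda is a great tool, but having so little control over the execution can make some aspects confusing. Any suggestion is much appreciated, thanks.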

Answer 1

I think I can say that this issue is caused by Lambda behaving differently with the Python execution environment than with the Java one.

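I had time to set up a simple Lambda function, implemented in both Java and Python. The function simply spawns a thread that prints the current time in a while loop. The question was: does the thread in the Java implementation keep printing even after the Lambda function has returned, and does the Python thread stop instead? The answer is yes in both cases: the Java thread keeps printing until the configured timeout, while the Python thread stops as soon as the Lambda function returns.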

The CloudWatch log for the Java version confirms this:

09:55:21 START RequestId: b70e732b-e476-11e6-b2bb-e11a0dd9b311 Version: $LATEST
09:55:21 Function started: 1485510921351
09:55:21 Pre function call: 1485510921351
09:55:21 Background function: 1485510921352
09:55:21 Background function: 1485510921452
09:55:21 Background function: 1485510921552
09:55:21 Background function: 1485510921652
09:55:21 Background function: 1485510921752
09:55:21 Post function call: 1485510921852
09:55:21 Background function: 1485510921853
09:55:21 END RequestId: b70e732b-e476-11e6-b2bb-e11a0dd9b311
09:55:21 REPORT RequestId: b70e732b-e476-11e6-b2bb-e11a0dd9b311 Duration: 523.74 ms Billed Duration: 600 ms Memory Size: 256 MB Max Memory Used: 31 MB
09:55:21 Background function: 1485510921953
09:55:22 Background function: 1485510922053
...

And for the Python version:

09:01:04 START RequestId: 21ccc71e-e46f-11e6-926b-6b46f85c9c69 Version: $LATEST
09:01:04 Function started: 2017-01-27 09:01:04.189819
09:01:04 Pre function call: 2017-01-27 09:01:04.189849
09:01:04 background_call function: 2017-01-27 09:01:04.194368
09:01:04 background_call function: 2017-01-27 09:01:04.294617
09:01:04 background_call function: 2017-01-27 09:01:04.394843
09:01:04 background_call function: 2017-01-27 09:01:04.495100
09:01:04 background_call function: 2017-01-27 09:01:04.595349
09:01:04 Post function call: 2017-01-27 09:01:04.690483
09:01:04 END RequestId: 21ccc71e-e46f-11e6-926b-6b46f85c9c69
09:01:04 REPORT RequestId: 21ccc71e-e46f-11e6-926b-6b46f85c9c69 Duration: 500.99 ms Billed Duration: 600 ms Memory Size: 128 MB Max Memory Used: 8 MB

Here is the code for the two functions:

Python

import thread
import datetime
import time


def background_call():
    while True:
        print 'background_call function: %s' % (datetime.datetime.now(), )
        time.sleep(0.1)

def lambda_handler(event, context):
    print 'Function started: %s' % (datetime.datetime.now(), )

    print 'Pre function call: %s' % (datetime.datetime.now(), )
    thread.start_new_thread(background_call, ())
    time.sleep(0.5)
    print 'Post function call: %s' % (datetime.datetime.now(), )

    return 'Needs more cowbell!'
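
The listing above is Python 2 (print statements and the thread module). For readers on Python 3, a rough equivalent might look like the sketch below. It is not part of the original experiment: it assumes the standard threading module in place of thread, and it adds two illustrative names, TICKS (timestamps recorded by the background thread) and STOP (an event that lets a local run end the loop), so the behavior can be checked outside Lambda.

```python
import datetime
import threading
import time

TICKS = []                # illustrative: timestamps recorded by the background thread
STOP = threading.Event()  # illustrative: lets a local run stop the loop cleanly


def background_call():
    # Loops until STOP is set; the handler itself never sets it
    while not STOP.is_set():
        now = datetime.datetime.now()
        TICKS.append(now)
        print('background_call function: %s' % (now,))
        STOP.wait(0.1)  # sleeps 0.1 s, or returns early once STOP is set


def lambda_handler(event, context):
    print('Function started: %s' % (datetime.datetime.now(),))

    print('Pre function call: %s' % (datetime.datetime.now(),))
    # daemon=True so the thread does not block interpreter exit
    worker = threading.Thread(target=background_call,
                              name='background_call', daemon=True)
    worker.start()
    time.sleep(0.5)
    print('Post function call: %s' % (datetime.datetime.now(),))

    return 'Needs more cowbell!'
```

Run locally, the thread keeps ticking after the handler returns until STOP is set. The logs above suggest that on Lambda the Python thread produces no further output once the handler has returned.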

Java

import com.amazonaws.services.lambda.runtime.*;


public class BackgroundTest implements RequestHandler<RequestClass, ResponseClass> {

    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );
    }

    public ResponseClass handleRequest(RequestClass requestClass, Context context) {
        System.out.println("Function started: "+System.currentTimeMillis());
        System.out.println("Pre function call: "+System.currentTimeMillis());
        Runnable r = new Runnable() {
            public void run() {
                while(true){
                    try {
                        System.out.println("Background function: "+System.currentTimeMillis());
                        Thread.sleep(100);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        };
        Thread t = new Thread(r);
        t.start();
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("Post function call: "+System.currentTimeMillis());
        return new ResponseClass("Needs more cowbell!");
    }
}

Answer 2

A similar issue is described in the cassandra-driver FAQ, where WSGI applications do not work with a global connection pool:

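Depending on your application process model, it may be forking after driver Session is created. Most IO reactors do not handle this, and problems will manifest as timeouts. [Here][1]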

This at least put me on the right track to check the available connection classes: it turns out that cassandra.io.twistedreactor.TwistedConnection works pretty well on AWS Lambda.
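
The FAQ's point about forking can be illustrated without the driver at all. The sketch below is a generic illustration, not something cassandra-driver ships and not the fix used in this answer: it remembers which process created the cached session and builds a new one when the current process ID differs. make_session is a hypothetical factory; in a real application it would wrap something like Cluster(...).connect().

```python
import os

_session = None
_session_pid = None  # PID of the process that created _session


def get_session(make_session):
    """Return a session that was created by the current process.

    make_session is a hypothetical zero-argument factory (assumption).
    """
    global _session, _session_pid
    pid = os.getpid()
    if _session is None or _session_pid != pid:
        # First use in this process, or the process was forked
        # after the cached session had been created
        _session = make_session()
        _session_pid = pid
    return _session
```

Whether a fork is what actually happens on Lambda is not established in this thread; the FAQ entry was only the hint that led to trying a different connection class.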

All in all, the code looks like this:

from cassandra.cluster import Cluster
from cassandra.io.twistedreactor import TwistedConnection
import time


SESSION = Cluster([...], connection_class=TwistedConnection).connect()


def run(event, context):
    t0 = time.time()
    x = list(SESSION.execute('SELECT * FROM keyspace.table'))  # Ensure query actually evaluated
    print('took', time.time() - t0)

You need to have twisted installed in your venv.

I ran this overnight on a one-minute crontab and saw only a few connection errors (at most 2 in an hour), so I am quite happy with the overall solution.

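Also, I have not tested the eventlet- and gevent-based connections, because I can't have them monkey-patch my application, and I did not want to compile libev for use on Lambda. Someone else might want to try.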

[1]: http://datastax.github.io/python-driver/faq.html#why-do-connections-or-io-operations-timeout-in-my-wsgi-application

Cassandra database session reuse in AWS Lambda (Python)
