TFramedTransport error with PHPCassa + Cassandra

Time: 2011-03-31 07:24:44

Tags: cassandra thrift phpcassa

We are deleting a large number of records from Cassandra and get the error below. We get the same error when inserting a large number of records:

Error performing remove on 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268
    Stack trace:
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4)
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4)
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
    3 [internal function]: TFramedTransport->read(8192)
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(691): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false)
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(664): CassandraClient->recv_remove()
    6 [internal function]: CassandraClient->remove('CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    7 /home/zonefiles/php/connection.php(230): call_user_func_array(Array, Array)
    8 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    9 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...')
    10 {main}
    Error connecting to 10.130.279.40:9160: exception 'TTransportException' with message 'TSocket: timed out reading 4 bytes from 10.130.279.40:9160' in /home/zonefiles/php/thrift/transport/TSocket.php:268
    Stack trace:
    0 /home/zonefiles/php/thrift/transport/TTransport.php(87): TSocket->read(4)
    1 /home/zonefiles/php/thrift/transport/TFramedTransport.php(135): TTransport->readAll(4)
    2 /home/zonefiles/php/thrift/transport/TFramedTransport.php(102): TFramedTransport->readFrame()
    3 [internal function]: TFramedTransport->read(8192)
    4 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(1015): thrift_protocol_read_binary(Object(TBinaryProtocolAccelerated), 'cassandra_Cassa...', false)
    5 /home/zonefiles/php/thrift/packages/cassandra/Cassandra.php(992): CassandraClient->recv_describe_version()
    6 /home/zonefiles/php/connection.php(63): CassandraClient->describe_version()
    7 /home/zonefiles/php/connection.php(163): ConnectionWrapper->__construct('CDTMain1', '10.130.279.40:9...', NULL, true, 5000, 5000)
    8 /home/zonefiles/php/connection.php(254): ConnectionPool->make_conn()
    9 /home/zonefiles/php/connection.php(241): ConnectionPool->handle_conn_failure(Object(ConnectionWrapper), 'remove', Object(TTransportException), 1)
    10 /home/zonefiles/php/columnfamily.php(582): ConnectionPool->call('remove', 'CUSTOMERSERVICE...', Object(cassandra_ColumnPath), 1301555573936295, 1)
    11 /home/zonefiles/php/delete.php(34): ColumnFamily->remove('CUSTOMERSERVICE...')
    12 {main}

Here is the PHP we use that produces the error:

<?php
set_time_limit(2000);
require 'connection.php';
require 'columnfamily.php';
$servers[0]['host'] = 'private ip';
$servers[0]['port'] = '9160';
$conn = new Connection('Server11', $servers);
$urlFamily = new ColumnFamily($conn, 'Domain'); // ColumnFamily

$start = microtime(true);

$limit = 100000000;

// Request the rows in the key range (third argument: row limit).
$rows = $urlFamily->get_range('', 'zzzzzzzzzzzzzzz', $limit);

$num = 0;
$delCount = 0;

foreach ($rows as $key => $columns) {
    // Delete every row whose key contains ' .net'.
    if (strpos($key, ' .net') !== false) {
        //echo 'deleting ' . $key . "\n";
        $urlFamily->remove($key);
        $delCount++;
    }
    if ($num++ > $limit) break;
    // Print progress every 100,000 rows.
    if ($num % 100000 == 0) echo $num . "\n";
}

$end = microtime(true);

echo $num . " total\n";
echo $delCount . ' deleted in ' . ($end - $start) . " seconds\n";
echo $delCount / ($end - $start) . " deleted per second\n";

?>

We are running PHP 5.3.5 on Fedora 14 (Laughlin) with Thrift 0.5.0.

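One theory is that this is caused by Cassandra not being able to process the commands fast enough. Do you agree or disagree? Have you seen this before?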

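If you suggest deleting in some other way (for example, truncate), how can we still keep this problem from happening when we do other things with Cassandra?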

2 answers:

Answer 0: (score: 2)

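Are those only log messages, or is the exception actually being raised? phpcassa calls error_log() every time it catches an exception of this type, before retrying with a different connection. Basically, this means you should keep an eye on the logged stack traces, but you don't have to be too worried about them.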
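The log-then-retry behaviour described here can be pictured with a minimal sketch. This is not phpcassa's actual code: `callWithRetry` and the list of callables standing in for pooled connections are hypothetical names introduced only for illustration.

```php
<?php
// Hypothetical sketch of a "log, then retry on another connection" loop.
// Not phpcassa's implementation; every name here is illustrative.

/**
 * Try each candidate connection in turn until one succeeds.
 *
 * @param array $connections callables, each standing in for one connection
 * @return mixed result of the first successful call
 * @throws Exception the last failure if every connection fails
 */
function callWithRetry(array $connections)
{
    $lastError = null;
    foreach ($connections as $index => $attempt) {
        try {
            return call_user_func($attempt);
        } catch (Exception $e) {
            // The failure is logged, not raised, so the caller may never see it.
            error_log("Error on connection $index: " . $e->getMessage());
            $lastError = $e;
        }
    }
    if ($lastError === null) {
        throw new InvalidArgumentException('No connections supplied');
    }
    // Only when every connection has failed does the exception propagate.
    throw $lastError;
}

// Usage: the first "connection" times out, the second succeeds.
$result = callWithRetry(array(
    function () { throw new RuntimeException('timed out reading 4 bytes'); },
    function () { return 'removed'; },
));
echo $result, "\n";
```

With a loop of this shape, a stack trace in the error log does not by itself mean the operation failed: the script only sees an exception once every attempt has been exhausted.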

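Those are client-side socket timeouts, which means the call took longer than the default timeout of 5 seconds. Why they are happening in the first place depends a lot on how Cassandra is behaving. Monitoring Cassandra is probably the best place to start.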
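To make "client-side socket timeout" concrete, here is a small sketch using plain PHP streams rather than Thrift. It assumes a Unix-like system where `stream_socket_pair()` is available; `readWithTimeout` is a hypothetical helper, and the real TSocket code may differ in detail. The idea is the same, though: the client sets a read timeout on the socket and gives up if the server has sent nothing by then.

```php
<?php
// Minimal sketch of a client-side read timeout using plain PHP streams.
// Assumption: a Unix-like system where stream_socket_pair() is available.

/**
 * Read up to $len bytes from $socket, failing if nothing arrives
 * within $timeoutMs milliseconds.
 */
function readWithTimeout($socket, $len, $timeoutMs)
{
    // stream_set_timeout() takes whole seconds plus microseconds.
    $seconds = (int) floor($timeoutMs / 1000);
    $microseconds = ($timeoutMs % 1000) * 1000;
    stream_set_timeout($socket, $seconds, $microseconds);

    $data = fread($socket, $len);
    $meta = stream_get_meta_data($socket);
    if ($meta['timed_out']) {
        throw new RuntimeException("timed out reading $len bytes");
    }
    return $data;
}

// A connected socket pair: $server stands in for a busy Cassandra node.
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
list($client, $server) = $pair;

// The "server" never answers, so the read gives up after 200 ms.
$timedOut = false;
try {
    readWithTimeout($client, 4, 200);
} catch (RuntimeException $e) {
    $timedOut = true;
    echo $e->getMessage(), "\n";
}

// Once the "server" replies in time, the same read succeeds.
fwrite($server, "pong");
$reply = readWithTimeout($client, 4, 200);
echo $reply, "\n";

fclose($client);
fclose($server);
```

Note that the timeout says nothing about why the server was slow; it only bounds how long the client is willing to wait, which is why monitoring the server side is the natural next step.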

Answer 1: (score: 0)

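According to my programmer, we actually solved this by raising the timeouts to very high values. We were trying to import a 5 GB file, so I guess the database needed more than 5 seconds per read.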

Here are the specific timeouts that were set:

$send_timeout = 60000;
$recv_timeout = 60000;