Question

Getting an error about the 1 GB buffer limit while running the consume function in Python.

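I want to read WAL from a Postgres server using a consume function with the wal2json plugin, but I get an out-of-memory error once the buffer size reaches 1 GB. I tried using the send_feedback function to reset the starting point of the WAL read, but I still get this error.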

The main function has the following code segment:

    import psycopg2
    from psycopg2.extras import LogicalReplicationConnection

    # dbname, host and user are defined elsewhere
    connect_string = "dbname='" + dbname + "' host='" + host + "' user='" + user + "'"
    my_connection = psycopg2.connect(connect_string, connection_factory=LogicalReplicationConnection)
    cur = my_connection.cursor()
    # Recreate the slot, then start streaming decoded changes from it
    cur.drop_replication_slot('postgres2')
    cur.create_replication_slot('postgres2', output_plugin='wal2json')
    cur.start_replication(slot_name='postgres2', options={'pretty-print': 1, 'include-xids': 1, 'include-timestamp': 1}, decode=True)
    # Blocks and calls consume() for every message received
    cur.consume_stream(consume)

The consume function is as follows:

    import json

    def consume(msg):
        trx_dict = json.loads(msg.payload)
        # ... do something with trx_dict ...
        print("WAL end position: " + str(msg.wal_end))
        # Report the position reached back to the server
        msg.cursor.send_feedback(write_lsn=msg.wal_end, apply_lsn=msg.wal_end, force=True)

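I tried resetting the starting point of the WAL read using the flush_lsn, write_lsn and apply_lsn parameters of the send_feedback function (reference: http://initd.org/psycopg/docs/extras.html), but I still hit the 1 GB buffer limit. Is there any way to reset the buffer to the point up to which I have already read the WAL, or to increase the buffer size beyond 1 GB?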

Answer 1

The format of that log message (DETAIL: ... CONTEXT:) indicates that it comes from PostgreSQL, not Python.

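PostgreSQL has an internal allocation limit of 1 GB, which is visible, for example, as the 1 GB size limit on binary blobs and text fields (documented here: https://www.postgresql.org/docs/12/limits.html). I suspect this is the limit you are hitting (or something related to it).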

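It would happen in any environment, not just Python. I think you need to find a way to process less data at a time on the PostgreSQL side. I don't know how, though.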
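One possible way to make PostgreSQL build less data per message, offered only as a sketch and not something tested against the asker's setup: wal2json's default output (format version 1) builds one JSON document per transaction, so a very large transaction can push the server-side string buffer toward the 1 GB cap. Assuming the installed wal2json is recent enough to accept `format-version` 2 (one JSON object per change) and the `add-tables` filter, the options could be built as below. The helper name `build_wal2json_options` and the table name in the usage note are hypothetical.

```python
import json


def build_wal2json_options(tables=None):
    """Build start_replication options intended to keep each message small.

    Assumption: the server's wal2json supports "format-version" 2 and
    "add-tables"; check the plugin version before relying on this.
    """
    options = {
        # One JSON object per change instead of one per transaction.
        "format-version": 2,
    }
    if tables:
        # Decode only the listed tables (comma-separated, schema-qualified).
        options["add-tables"] = ",".join(tables)
    return options


def consume(msg):
    """Handle a single format-version 2 message, then acknowledge it."""
    change = json.loads(msg.payload)
    # In format version 2, "action" is B (begin), C (commit), I, U, D, ...
    if change.get("action") in ("I", "U", "D"):
        pass  # ... process the single-row change here ...
    # Tell the server how far we have processed, as in the psycopg2 docs.
    msg.cursor.send_feedback(flush_lsn=msg.data_start)
```

With the cursor from the question, this would be used as `cur.start_replication(slot_name='postgres2', options=build_wal2json_options(['public.orders']), decode=True)` followed by `cur.consume_stream(consume)`. Whether this avoids the error depends on the workload; it only reduces how much the plugin accumulates per message.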

Error output reported in the question:

    DETAIL:  Cannot enlarge string buffer containing 1073741429 bytes by 412 more bytes.
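The numbers in the error detail line up with PostgreSQL's allocation cap. As a rough check, assuming the cap is `MaxAllocSize` = 0x3fffffff (1 GB minus one byte) and that the string buffer refuses any growth that would reach it, the arithmetic looks like this. The function name `would_exceed_limit` is made up for illustration and is only a simplified model of the server-side check.

```python
# Assumed value of PostgreSQL's MaxAllocSize: 1 GB minus one byte.
MAX_ALLOC_SIZE = 0x3FFFFFFF

current_len = 1_073_741_429  # bytes already in the buffer (from the error)
needed = 412                 # additional bytes requested (from the error)


def would_exceed_limit(current_len, needed, limit=MAX_ALLOC_SIZE):
    """Simplified model: growth is refused once len + needed reaches the cap."""
    return current_len + needed >= limit


total = current_len + needed
print(total)                                    # 1073741841
print(total - MAX_ALLOC_SIZE)                   # 18 bytes over the cap
print(would_exceed_limit(current_len, needed))  # True
```

So the buffer was only a few hundred bytes short of 1 GB when the next piece of output arrived, which is consistent with a single decoded message growing past the limit on the server.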
