Question

Hello, I have this code that generates a txt file containing a compressed string, which will then be inserted into a Postgres database:

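import zlib

def test_insert():
    # Python 2: compress the sample, then hex-encode it with a bytea-style \x prefix
    str_test = '4 1 2\n 2 4 5\n'.encode('utf8')
    cmpstr = zlib.compress(str_test)
    str_test_to_write = '\\x' + cmpstr.encode('hex_codec')

    # One "id|value" line per row, matching the '|' delimiter used by \copy
    with open('outfile.txt','w') as output_file:
        output_file.write(str(1) + '|'+ str_test_to_write + '\n')
        output_file.write(str(2) + '|'+ str_test_to_write + '\n')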

Then I use the copy command to load the data into my table:

time cat outfile.txt |psql teste3 -c "\copy zstr(id,zstr) from stdout with delimiter '|'"

This is my table:

drop table if exists zstr; 
    create table zstr(
    id int, 
    zstr bytea, 
    primary key(id));

Then I try to select my string, but I get this error:

>>> import psycopg2
>>> import zlib
>>> con = psycopg2.connect(host = 'X', database = 'Y', user = 'Z')
>>> con.autocommit = True
>>> cur = con.cursor()
>>> cur.execute('select * from zstr where id = 1')
>>> row = cur.fetchone()
>>> row
(1, <read-only buffer for 0x7fe19b75f270, size 41, offset 0 at 0x7fe196976f30>)
>>> a = str(row[1])
>>> q = zlib.decompress(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check

So how can I get my string back?

The output I want:

'4 1 2\n 2 4 5\n'

Answer 1

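There is almost no reason to do this. PostgreSQL will compress text with LZ compression on its own if the value is larger than TOAST_TUPLE_THRESHOLD. From the docs on TOAST:

The TOAST management code is triggered only when a row value to be stored in a table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code will compress and/or move field values out-of-line until the row value is shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB) or no more gains can be had. During an UPDATE operation, values of unchanged fields are normally preserved as-is; so an UPDATE of a row with out-of-line values incurs no TOAST costs if none of the out-of-line values change.

It does this transparently for the user. Just store the text itself.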
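
As a rough illustration of storing the text itself, here is a minimal sketch that writes the same rows as plain text in COPY text format. It assumes the zstr column is declared as text rather than bytea, and that '|' stays the delimiter. The helper names escape_copy_text and write_rows are made up for this example, and the sketch is written for Python 3, unlike the Python 2 code in the question.

```python
# Sketch only: assumes the target column is text (not bytea) and '|' is the delimiter.

def escape_copy_text(value, delimiter='|'):
    """Escape a string for one field of PostgreSQL's COPY text format."""
    # Escape backslashes first so the escapes added below are not doubled.
    value = value.replace('\\', '\\\\')
    value = value.replace('\n', '\\n')
    value = value.replace('\r', '\\r')
    value = value.replace('\t', '\\t')
    # A literal delimiter inside the value must be preceded by a backslash.
    value = value.replace(delimiter, '\\' + delimiter)
    return value


def write_rows(path, rows, delimiter='|'):
    """Write (id, text) pairs to path, one COPY line per row."""
    # newline='\n' keeps line endings unchanged on every platform.
    with open(path, 'w', newline='\n') as output_file:
        for row_id, text in rows:
            line = str(row_id) + delimiter + escape_copy_text(text, delimiter)
            output_file.write(line + '\n')
```

Calling write_rows('outfile.txt', [(1, '4 1 2\n 2 4 5\n'), (2, '4 1 2\n 2 4 5\n')]) should produce a file that the same \copy command can load. A plain select then returns the original string with no zlib step on either side, and PostgreSQL decides by itself whether a value is large enough to compress.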
