Error when running a script to move data between Cassandra tables

Asked: 2018-04-24 00:19:27

Tags: python arrays cassandra blob cqlsh

I am running a script to copy the contents of one Cassandra table to another. This is the script:

#!/usr/bin/env python2.7
import pika
import json, os
import magic
import time
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider
from cassandra import query
from sets import Set

keyspace_from = 'source'
keyspace_to   = 'destination'
table         = 'results'
selection     = "name = 'Joan'"
cluster_ips   = ['10.0.44.21', '10.0.44.22', '10.0.44.23']
username      = 'cassandra'
password      = 'password'

ap = PlainTextAuthProvider(username=username, password=password)
cluster = Cluster(cluster_ips, auth_provider=ap)

sess_get = cluster.connect(keyspace_from)
sess_insert = cluster.connect(keyspace_to)
sess_get.row_factory = query.dict_factory 

rows = sess_get.execute("SELECT * FROM %s WHERE %s ALLOW FILTERING" % (table, selection))
i = 0
for r in rows:
        i += 1
        keys = []
        vals = []
        for k in r:
                keys.append("%s" % str(k))
                vals.append("%%(%s)s" % str(k))

        insert_stmt = "INSERT INTO %s (%s) VALUES (%s)" % (table, ",".join(keys), ",".join(vals))
        sess_insert.execute(insert_stmt, r)
        print("Copied %d" % (i))
print("===== DONE =====")
print("Copied %d entries" % i)

But when I run the script, I get the following error:

Traceback (most recent call last):
  File "copy_data.py", line 37, in <module>
    sess_insert.execute(insert_stmt, r)
  File "cassandra/cluster.py", line 1998, in cassandra.cluster.Session.execute (cassandra/cluster.c:34869)
  File "cassandra/cluster.py", line 3781, in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:73073)
cassandra.protocol.ProtocolException: <Error from server: code=000a [Protocol error] message="Cannot decode string as UTF8: '8b080000096e8800ffec9a6b4fe3381780ff4a1569a4f7d502729ab6a4fdd62d65663414662917ad36abca38a7c53b8913d9ced2ee68f7b7ef71127acd3081a150baadf8407c3de73917db897ffb6a1d514dadd6578b0a1a4c145703ee5b2dcb761a8e5bb7f62c103e98822ab10ff789bb4fec8a5d6f11fc73b156692a75517dad459ad6df7bd639a824d058cf12f6258af6c7f1fe2801a5f71decdd4f6ef42406ace5621861f39934432e951e280061b5ec833ab11b84388776a35a879f4873cf8aa904a13359090e15cb888152f705f333078a2a750063589831efb0ae496522fc2070aaa5e6a531c79af7a0fb13a521bce021b4d5310fc0fc87bda91c2521ceacb0314ec2a88651242727d8474d04bb9591e08a6a1e096c6b642153d19624a9fc8ffc1fdb488881a66623e64127271215b63273ea043bd97b96be9540fd4c35acd12809d686b1799c57f4db52cf7446fd068c0641b1d2a64b5b6bc96f120dea7a495f344a00033aadb75a4e752f2d8ca9bec5113a2dcf3b8a58d6a142855fe983d65c8c94e7fd11dd0acf3b8970f2b9d20b0863cfd3110abdef4f040d39731dd76d549bc43e4056d66cfc81bc9fe1ac73d9ebfe634fc73cebb44ffae6f9bba32d5acc8cfc3a665ae5bc2e0bd50eaacd5ab371586de681b356731d8454f021aafdbc769b0dbb3e0392973060471a21da4c77f478c576348e03ced2e4314006904911467e8236bd45f304a6888c49fe4bb5525122194cdb2ff209b962afc06759cb52684e7c791251ff288df9052e375441ae5fb18b18670be8281332e7457d5f66ca9271a389ad3260796d3edcb3661395a6de57c827270be4cab2461ffe6c04f41309ed9cd5722e49044b9d719e256264ee21332ca7f5394d14a047b9b89e627ec8600f9b29923ec73d10d2592727e347b8f43789bd3eeff68ef78bf2dee15ec18dd9a78f6ba118adac6966669b546b85eb983d5dc754da1b4bbb5246b2c245e59df2448f2b85c5151c4c4e5aefd452c24df86b24db055d4be1e9a16074043f4763b31b8f12bd4289d1383d32b4acf3cbd3a39393d9826637f6ac808a518203a070d8810f39c8ccb830d6f3c87e6c355bc63df3f63b2efce8ae7007b2010629a45b364b5c8aa078cf11f01b4953a59e14e29bb33398d7b01495537d0132c4f4a4e1732ee7129bfba376f196341373c022ff415f89a763bf74ae2b54708bd13cc25f9e8ee60756dcc6d0a536abb9052bee6984e96ed2bfe5a19a5f75dbec3d0849836f04e1d0755f749d7d5c40fef0b60663f943e65e4b740b581c3297149e7c3e8114807a6e50a65
a526ddd6e87686c87c0b0c0edce21003c736607d879c7fb724fadd0f1e6606f99e39dea4b11d2f88ac3ddd9b00f2c7fbfb9724a5fda493b7e8ea3202d0ef35faae7c8c057fc2fac72ed667573d261a1da25911d4b802b2e7542831e84469712c46e726243ec3cc86671aa870df77b0c33788d7add696c0ebd2202bb44b74b74db97e8d87f37d1f5013b67fafadd3183f4247bcc038da7d4873e97c17ddb57d0f76199d799df87bbfc9e76ed04915ac9eb2ba7a6c6c69ca46712ef16b0dd02b6c90b58b9c8aab34df1a74746566905df6aea28ad606dcb15acb95baea0b3ed3158dbf618ac6d7d0cbe5d177dcaf67cfaa6fe69db73b2399f38d6be3577b63db69db71bdb6731884fb0e2f03e282ec11f50c6e65cbe4a88ddc4a65f60b2fad90a21a4af2db0124b3f7ceafe3a48bf280f7aedce878fa75dcfeb47437d4725785e8f3319297cf4bcebf48bb1aa9c5e785e272791e6a6ee1548851b77cf7b7fde1d7c145c731a98a059646cde90289d7aebcb539e612bc9f99704e4e48a060914d036d7a86598dde66301352a541fa26c6a66b7fe10449e7fc8f70c70767c71dd3eef3ed1009e77c415bd09a0079a9a5b8d265ffd6974b256ae133eb7751e95d19661ef827da3debe3dc296cff2f6ed095fe2e7d7ebadb9a4f0fbbf000000ffff010000ffffdb0eac9137320000272c2731272c7b274e6f7453656e64277d2c274e6f7453656e64272c5b274e6f74496d706c656d656e746564275d2c7b274e6f7453656e64277d2c274e6f74496d706c656d656e746564272c2734616535653230383363303661656439336663613730383639343533343562353035616335376632383666613562316661356661626232623536333264663766272c32363466656365312d373662612d313165372d396163652d3030306332393336656662342c4e554c4c29'; java.nio.charset.MalformedInputException: Input length = 1">

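The table in question contains the usual text and timeuuid data types, but one column is of type blob. That column holds a large amount of gzip-compressed data. I think this may be the cause of the java.nio.charset.MalformedInputException error.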

How can I resolve or work around this error?
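
One thing that may be worth trying is a prepared statement. With a plain string statement, the Python driver formats the parameters into the CQL text on the client, and under Python 2 a blob value comes back as an ordinary str, so it is probably being quoted as a text literal. The tail of the hex dump in the error decodes to ordinary INSERT text such as {'NotSend'} and NULL), which fits that reading. A prepared statement instead sends each value in binary form according to the column type. The sketch below has not been run against a real cluster; build_insert and copy_rows are hypothetical helper names, and it assumes the row factory is query.dict_factory as in the script.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one '?' bind marker per column."""
    markers = ", ".join("?" for _ in columns)
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, ", ".join(columns), markers)


def copy_rows(sess_get, sess_insert, table, selection):
    """Copy matching rows from one session's keyspace to the other's.

    Both sessions are expected to be cassandra.cluster.Session objects,
    and sess_get.row_factory is assumed to be query.dict_factory.
    """
    rows = sess_get.execute(
        "SELECT * FROM %s WHERE %s ALLOW FILTERING" % (table, selection))
    prepared = None
    columns = None
    copied = 0
    for row in rows:
        if prepared is None:
            # Prepare once: every row of a SELECT * has the same columns.
            columns = list(row.keys())
            prepared = sess_insert.prepare(build_insert(table, columns))
        # Bind values positionally, in the same order as the column list.
        sess_insert.execute(prepared, [row[c] for c in columns])
        copied += 1
    return copied
```

In the script this would replace the for loop, for example: print("Copied %d entries" % copy_rows(sess_get, sess_insert, table, selection)).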

Edit 1:

I tried to "force" the blob to UTF-8 by changing the following part of the code:

    for k in r:
            if k == 'results':
                    keys.append('%s' % str(k))
                    vals.append('%%(%s)s' % bytearray(k,'utf8'))
            else:
                    keys.append('%s' % str(k))
                    vals.append('%%(%s)s' % str(k))

But I still get the same error.
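
A possible reason the change has no effect: bytearray(k, 'utf8') wraps the column name k, not the column value, so under Python 2 the placeholder still comes out as %(results)s and the statement is unchanged. If the named-placeholder approach is kept, the value in the row dict is what would need wrapping. As far as I understand the driver's encoder, a bytearray parameter is rendered as a 0x... blob literal rather than a quoted string, but treat that as an assumption to verify. wrap_blobs below is a hypothetical helper, and 'results' is taken to be the blob column as in the snippet above.

```python
def wrap_blobs(row, blob_columns):
    """Return a copy of row with the named columns' values wrapped in bytearray.

    None values are left alone so that NULL columns stay NULL.
    """
    fixed = dict(row)
    for col in blob_columns:
        if fixed.get(col) is not None:
            fixed[col] = bytearray(fixed[col])
    return fixed
```

The insert call would then become sess_insert.execute(insert_stmt, wrap_blobs(r, ['results'])), with the original key and placeholder loop left as it was.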

0 answers:

No answers yet.