Question

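Sometimes, when I write multiple versions to the same row key, with multiple column families, across multiple batched mutations (each version is batched together with several other writes), I end up with more versions than I wrote.

Is this expected behavior caused by data compaction? Will the extra versions be removed in the future?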

Answer 1

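The problem here is that you are putting the two columns in two separate entries in the batch, which means that they are not applied atomically, even though they share the same row key.

Batch entries can succeed or fail individually, and the client then retries only the failed entries. For example, if one entry succeeds and another times out but later silently succeeds, retrying the "failed" entry can cause you to see partial write results.

So, in Python, you should do something like the following (adapted from cloud.google.com/bigtable/docs/samples-python-hello):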
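The retry behavior described above can be illustrated with a small in-memory model. This is only a sketch: `ToyTable` and `write_with_retry` are hypothetical names, not part of the Bigtable client, and the model assumes the server assigns the timestamp each time an entry is applied, so that an entry which was applied but reported as failed produces an extra version when it is resent.

```python
from collections import defaultdict
from itertools import count


class ToyTable:
    """In-memory stand-in for a table; NOT the Bigtable client."""

    def __init__(self, drop_first_ack_for=()):
        # (row_key, column) -> list of (timestamp, value) versions
        self.cells = defaultdict(list)
        # Assumption: timestamps are assigned when the entry is applied.
        self._clock = count(1)
        # Entries whose first acknowledgement is lost (simulated timeout).
        self._drop = set(drop_first_ack_for)

    def mutate_rows(self, entries):
        """Apply every entry independently and report per-entry success."""
        acks = []
        for row_key, column, value in entries:
            self.cells[(row_key, column)].append((next(self._clock), value))
            if (row_key, column) in self._drop:
                self._drop.discard((row_key, column))
                acks.append(False)  # applied, but the client sees a failure
            else:
                acks.append(True)
        return acks


def write_with_retry(table, entries):
    """Resend only the entries whose previous attempt looked failed."""
    pending = list(entries)
    while pending:
        acks = table.mutate_rows(pending)
        pending = [entry for entry, ok in zip(pending, acks) if not ok]


# Two separate entries for the same row key, one per column.
table = ToyTable(drop_first_ack_for={(b"greeting0", b"greeting2")})
write_with_retry(table, [
    (b"greeting0", b"greeting1", b"Hello World!"),
    (b"greeting0", b"greeting2", b"Hello World!"),
])
versions = {key: len(cells) for key, cells in table.cells.items()}
print(versions)
```

In this model the two logical writes leave three stored versions: one for the first column and two for the second, because only the second entry was retried.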

print('Writing some greetings to the table.')
greetings = ['Hello World!', 'Hello Cloud Bigtable!', 'Hello Python!']
rows = []
column1 = 'greeting1'.encode()
column2 = 'greeting2'.encode()
for i, value in enumerate(greetings):
    # Note: This example uses sequential numeric IDs for simplicity,
    # but this can result in poor performance in a production
    # application.  Since rows are stored in sorted order by key,
    # sequential keys can result in poor distribution of operations
    # across nodes.
    #
    # For more information about how to design a Bigtable schema for
    # the best performance, see the documentation:
    #
    #     https://cloud.google.com/bigtable/docs/schema-design
    row_key = 'greeting{}'.format(i).encode()
    row = table.row(row_key)

    # Multiple calls to 'set_cell()' are allowed on the same batch
    # entry. Each entry will be applied atomically, but a separate
    # 'row' in the same batch will be applied separately even if it
    # shares its row key with another entry.
    row.set_cell(column_family_id,
                 column1,
                 value,
                 timestamp=datetime.datetime.utcnow())
    row.set_cell(column_family_id,
                 column2,
                 value,
                 timestamp=datetime.datetime.utcnow())
    rows.append(row)
table.mutate_rows(rows)
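If the cells to write arrive as a flat list rather than per row, one way to guarantee a single batch entry per row key is to group them first. The helper below is a minimal sketch; `group_by_row_key` and the tuple layout are assumptions made for illustration, and the commented lines show how the result might feed the `table.row()` and `set_cell()` calls used above.

```python
def group_by_row_key(cells):
    """Group (row_key, family, column, value) tuples by row key.

    Returns a dict mapping each row key to the list of its
    (family, column, value) mutations, in first-seen order.
    """
    grouped = {}
    for row_key, family, column, value in cells:
        grouped.setdefault(row_key, []).append((family, column, value))
    return grouped


cells = [
    (b"greeting0", "cf1", b"greeting1", b"Hello World!"),
    (b"greeting1", "cf1", b"greeting1", b"Hello Cloud Bigtable!"),
    (b"greeting0", "cf1", b"greeting2", b"Hello World!"),
]
grouped = group_by_row_key(cells)

# With the client from the example above, each key then becomes one entry:
# for row_key, mutations in grouped.items():
#     row = table.row(row_key)
#     for family, column, value in mutations:
#         row.set_cell(family, column, value)
#     rows.append(row)
```

Each row key then appears exactly once in the batch, so all of its cells succeed or fail together.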

BigTable: 2 writes to the same key, but 3 versions