Question

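For my open source project (bquery) I have run into a problem with a piece of Cython code that works perfectly well on Python 2.7 but raises an error on Python 3.x. For the full code, see: https://github.com/visualfabriq/bquery/pull/66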

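To give an idea: the code counts the distinct (unique) values for each element of a group-by. I do a hash check on the combination of two values to make sure it is unique (otherwise I would need one hash table per group, which might be more efficient in many cases, but not here, because with the underlying technique I do not want to run over the values more than once). To make a value unique, I create a concatenated string (with a separator in the middle) and then check it against the hash table. So far so good! On Python 2 this gives perfect results and is quite fast. On Python 3, however, I get an error.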
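In pure Python, the idea described above amounts to roughly the following. This is only an illustrative sketch: the function name and its arguments are hypothetical and not part of bquery, and a Python set stands in for the khash string table.

```python
def count_distinct_per_group(group_indices, values, n_groups):
    """Count the distinct values in each group using one shared set."""
    seen = set()             # a single table shared by all groups
    counts = [0] * n_groups  # plays the role of out_buffer
    for current_index, v in zip(group_indices, values):
        # Composite key: group index, separator, value.
        key = str(current_index) + "|" + str(v)
        if key not in seen:
            # First save the new element, then bump the group's count.
            seen.add(key)
            counts[current_index] += 1
    return counts


print(count_distinct_per_group([0, 0, 1, 1, 1], [1.5, 1.5, 2.0, 3.0, 2.0], 2))
# prints [1, 2]
```

Because the group index is part of the key, the same value in two different groups is counted once per group, and the data is only traversed once.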

This is the Cython code:

cdef:
    kh_str_t *table
    char *element_1
    char *element_2
    char *element_3
    int ret, size_1, size_2, size_3

v = in_buffer[i]
# index
size_1 = len(bytes(current_index)) + 1
element_1 = <char *>malloc(size_1)
strcpy(element_1, bytes(current_index))
# value
size_2 = len(str(v)) + 1
element_2 = <char *>malloc(size_2)
strcpy(element_2, bytes(v))
# combination
size_3 = size_1 + size_2 + 2
element_3 = <char *>malloc(size_3)
strcpy(element_3, element_1 + '|' + element_2)
# hash check
k = kh_get_str(table, element_3)
if k == table.n_buckets:
    # first save the new element
    k = kh_put_str(table, element_3, &ret)
    # then up the amount of values found
    out_buffer[current_index] += 1

And this is the error:

======================================================================
ERROR: test_groupby_08: Groupby's type 'count_distinct'
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/carst/venv3/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/carst/PycharmProjects/bquery/bquery/tests/test_ctable.py", line 516, in test_groupby_08
    result_bcolz = fact_bcolz.groupby(groupby_cols, agg_list)
  File "/home/carst/PycharmProjects/bquery/bquery/ctable.py", line 226, in groupby
    bool_arr=bool_arr)
  File "/home/carst/PycharmProjects/bquery/bquery/ctable.py", line 161, in aggregate_groups
    raise e
  File "/home/carst/PycharmProjects/bquery/bquery/ctable.py", line 155, in aggregate_groups
    agg_op)
  File "bquery/ctable_ext.pyx", line 452, in bquery.ctable_ext.__pyx_fuse_2_0aggregate (bquery/ctable_ext.c:27585)
    cpdef aggregate(carray ca_input, carray ca_factor,
  File "bquery/ctable_ext.pyx", line 653, in bquery.ctable_ext.aggregate (bquery/ctable_ext.c:27107)
    strcpy(element_2, bytes(v))
TypeError: 'float' object is not iterable

I must be overlooking something very obvious, but I can't see what I'm missing. Any guidance or help would be greatly appreciated!

BR

Carst

Answer 1

In Python 2.x, bytes is an alias for str:

>>> bytes(42.0)
'42.0'

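In Python 3.x, however, bytes has a new constructor. Apart from an int, a str (with an encoding), or a buffer object, it treats its argument as an iterable of integers. A float is none of these, hence the error you see.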

>>> help(bytes)
class bytes(object)
 |  bytes(iterable_of_ints) -> bytes
 |  bytes(string, encoding[, errors]) -> bytes
 |  bytes(bytes_or_buffer) -> immutable copy of bytes_or_buffer
 |  bytes(int) -> bytes object of size given by the parameter initialized with null bytes
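The constructor forms listed above can be tried out directly. The following is a small sketch for Python 3 only; the helper name try_bytes is made up for this illustration, and the exact wording of the TypeError differs between Python 3 versions, so it is printed rather than assumed.

```python
def try_bytes(value):
    """Call bytes(value) and report either the result or the TypeError."""
    try:
        return repr(bytes(value))
    except TypeError as exc:
        return "TypeError: {}".format(exc)


print(try_bytes(3))         # an int gives that many null bytes
print(try_bytes([52, 50]))  # an iterable of ints gives the bytes for "42"
print(try_bytes(b"42.0"))   # a bytes object is copied
print(try_bytes(42.0))      # a float matches none of the forms
```

Note that bytes(3) does not produce the text "3" either, so the bytes(current_index) calls in the question would also behave differently on Python 3, without raising an error.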

The workaround is to use:

str(v).encode()

Yes, it isn't pretty and it requires two copies of the data, but it works on both Python 2 and 3.
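Applied to the composite key from the question, the workaround might look something like this. It is a sketch under assumptions: make_key is a hypothetical helper, not bquery code, and it assumes the index and value have ASCII-only string representations (as numbers do).

```python
def make_key(current_index, v, sep=b"|"):
    """Build the key '<index>|<value>' as bytes on Python 2 and Python 3."""
    # str() gives the text form; encode() turns it into bytes on
    # Python 3 and leaves an ASCII str unchanged on Python 2.
    index_bytes = str(current_index).encode("ascii")
    value_bytes = str(v).encode("ascii")
    return index_bytes + sep + value_bytes


key = make_key(3, 42.0)
print(key)           # b'3|42.0' on Python 3
print(len(key) + 1)  # buffer size needed for strcpy, including the NUL byte
```

Building the whole key as one bytes object also avoids concatenating char pointers with a Python string, and a single len(key) + 1 gives the size to allocate before copying.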
