I have been experimenting with pyarrow and ran into a problem converting a nested dictionary to a table. When I run this code:
import pyarrow as pa
a = {'a':{'b':[1,2,3], 'c':[3,2,1], 'd':[2,3,1]}}
schema = pa.schema([pa.field('a', pa.struct([pa.field('b', pa.int32()), pa.field('c', pa.int32()), pa.field('d', pa.int32())]))])
pa_a = pa.Table.from_pydict(a, schema)
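I get back: pyarrow.lib.ArrowTypeError: Could not convert b with type str: was expecting tuple of (key, value) pair
This seems odd. If the schema were invalid, shouldn't the schema itself complain? Or am I missing something here? Is there a way to convert a nested dictionary?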
Answer 0: (score: 0)
If all of the column values live in the arrays b, c, and d, you can simply do the following:
pa_a = pa.Table.from_pydict(
    a['a'],
    pa.schema([
        pa.field('b', pa.int32()),
        pa.field('c', pa.int32()),
        pa.field('d', pa.int32()),
    ]),
)
# Convert to a pandas DataFrame for a tabular printout (requires pandas).
print(pa_a.to_pandas())
#    b  c  d
# 0  1  3  2
# 1  2  2  3
# 2  3  1  1