Question

I have a pandas DataFrame df:

import pandas as pd

df = pd.DataFrame({'col1': ["a", "x", "g", "y", "q", "n"],
                   'col2': ["b", "f", "s", "p", "t", "c"],
                   'col3': [1, 10, 1, 1, 10, 2]})

  col1 col2  col3
0    a    b     1
1    x    f    10
2    g    s     1
3    y    p     1
4    q    t    10
5    n    c     2

I grouped it by col3:

grp = df.groupby(["col3"])
groups = grp.groups

But the result is an object of type pandas.io.formats.printing.PrettyDict. Is there a way to convert it to a normal dict?

Answer 1

The source code of the PrettyDict class is as follows:

class PrettyDict(Dict[_KT, _VT]):
    """Dict extension to support abbreviated __repr__"""

    def __repr__(self) -> str:
        return pprint_thing(self)

As the source shows, PrettyDict subclasses Dict and only overrides __repr__, so groups is in fact an ordinary dictionary.

grp = df.groupby(["col3"])
groups = grp.groups
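A minimal sketch of how one might check this and get an object whose type is exactly dict. The variable names is_dict, plain, and plain_lists are illustrative and not from the original answer. It groups by the string "col3" rather than a one-element list, since the key format for a one-element list has differed between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'col1': ["a", "x", "g", "y", "q", "n"],
                   'col2': ["b", "f", "s", "p", "t", "c"],
                   'col3': [1, 10, 1, 1, 10, 2]})

groups = df.groupby("col3").groups

# PrettyDict is a dict subclass, so this check passes
is_dict = isinstance(groups, dict)

# dict(...) makes a shallow copy whose type is exactly dict;
# the values are still pandas Index objects
plain = dict(groups)

# Optionally turn each Index into a plain Python list as well
plain_lists = {key: idx.tolist() for key, idx in groups.items()}
```

Whether the copy is needed depends on the use case: anything that accepts a dict should already accept groups as it is.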

Answer 2

In your code, groups is a dictionary whose keys are the unique values of col3 and whose values are the indexes of the rows in each group:

grp= df.groupby(by = "col3").groups
grp

{1: Int64Index([0, 2, 3], dtype='int64'),
 2: Int64Index([5], dtype='int64'),
 10: Int64Index([1, 4], dtype='int64')}

You can extract the 'col1' and 'col2' values corresponding to these indexes like this:

grp_idx= df.groupby(by = "col3").groups
res = {key:df.loc[val,['col1','col2']].values for key,val in grp_idx.items()}
res

{1: array([['a', 'b'],
        ['g', 's'],
        ['y', 'p']], dtype=object),
 2: array([['n', 'c']], dtype=object),
 10: array([['x', 'f'],
        ['q', 't']], dtype=object)}

Depending on your exact requirements, you can convert the values of res further into whatever form you need.
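As one possible further conversion, the sketch below turns each group's values into nested Python lists instead of NumPy arrays. The name res_lists is illustrative, and a list of lists is only one of several reasonable target shapes.

```python
import pandas as pd

df = pd.DataFrame({'col1': ["a", "x", "g", "y", "q", "n"],
                   'col2': ["b", "f", "s", "p", "t", "c"],
                   'col3': [1, 10, 1, 1, 10, 2]})

grp_idx = df.groupby(by="col3").groups

# For each group, select col1/col2 for its rows and convert
# the resulting 2-D array into a list of [col1, col2] pairs
res_lists = {key: df.loc[val, ['col1', 'col2']].values.tolist()
             for key, val in grp_idx.items()}
```

The result is a dictionary of plain Python lists, which is convenient when it has to be serialized, for example to JSON.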

Convert a pandas PrettyDict to a dictionary
