Question

I am using the pandas df.groupby() function to group my dataframe on one column and iterate over it like this:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                       'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                       'C' : np.random.randn(8),
                       'D' : np.random.randn(8)})

""" above df looks like this
     A      B         C         D
0  foo    one -0.575010 -0.271119
1  bar    one -0.130209 -0.106217
2  foo    two  0.093987 -1.351369
3  bar  three -0.403304  0.983619
4  foo    two  0.668989  0.249099
5  bar    two  1.153876  1.407159
6  foo    one  1.453793 -0.347721
7  foo  three  0.493562 -0.051688
"""

    grouped = df.groupby('A')

    for name, group in grouped:
        print(group)
        print(group['B'])

Here print(group) returns the following:

     A      B         C         D
1  bar    one -0.130209 -0.106217
3  bar  three -0.403304  0.983619
5  bar    two  1.153876  1.407159

and group['B'] returns

1      one
3    three
5      two

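I want to get just the values one, three and two from column B, without the index labels 1, 3 and 5.

iloc does not work here because the index is not contiguous, and I do not know which index labels will turn up while iterating over the grouped dataframe.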

Answer 1

Use Series.to_string with index=False:

    for _, g in df.groupby('A'):
        print(g['B'].to_string(index=False))

This prints the Series without its index. Alternatively, if you need a list of the values, use g['B'].tolist().
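As a minimal sketch of the tolist() route, the snippet below collects the B values of each group into a plain Python list. The frame is a cut-down copy of the one in the question (columns A and B only), and the name values_by_group is just an illustrative choice.

```python
import pandas as pd

# Cut-down copy of the question's frame: only the columns needed here.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
})

# Map each group name to a plain list of its B values; the index is dropped.
values_by_group = {name: g['B'].tolist() for name, g in df.groupby('A')}

print(values_by_group['bar'])  # ['one', 'three', 'two']
print(values_by_group['foo'])  # ['one', 'two', 'two', 'one', 'three']
```

A list is convenient when the values are passed on to ordinary Python code; to_string(index=False) is only useful for display.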

Answer 2

Try group['B'].values.

This returns the elements of the Series as a NumPy array.
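A short sketch of this approach follows, again on a cut-down copy of the question's frame. It uses to_numpy(), which gives the same array as .values for this column, and it also shows that iloc is position-based, so it can pick a single value from a group even when the index labels are not contiguous. The dictionary names are illustrative only.

```python
import pandas as pd

# Cut-down copy of the question's frame: only the columns needed here.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
})

arrays = {}  # group name -> NumPy array of B values
firsts = {}  # group name -> first B value in that group

for name, group in df.groupby('A'):
    # to_numpy() returns the underlying values without the index.
    arrays[name] = group['B'].to_numpy()
    # iloc selects by position, not by label, so 0 always means "first row".
    firsts[name] = group['B'].iloc[0]

print(arrays['bar'])  # ['one' 'three' 'two']
print(firsts['bar'])  # one
```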

How to get cell values from a grouped dataframe without the index

2 answers: