Pandas: how to print group-by values

Time: 2018-10-09 07:41:16

Tags: python pandas

I have the following dataset from Table_Record:

Seg_ID  Lock_ID  Code
111     100      1
222     121      2
333     341      2
444     100      1
555     100      1
666     341      2
777     554      4
888     332      5

I am using a SQL query to find the Seg_IDs that share a repeated Lock_ID:

Select Code,Lock_ID,Seg_ID from Table_Record group by Code, Lock_ID;

Seg_ID  Lock_ID  Code
111     100      1
444     100      1
555     100      1
222     121      2
333     341      2
666     341      2
777     554      4
888     332      5

How can I achieve the same thing using Pandas?

The expected output from Pandas is, for example:

Seg_ID (111,444,555) has Lock_ID (1).
Seg_ID (222,333,666) has Lock_ID (2).

2 answers:

Answer 0 (score: 2)

First get all the codes by keeping only the rows flagged by duplicated, then filter the original DataFrame using boolean indexing with isin:

codes = df.loc[df.duplicated(['Lock_ID']), 'Code'].unique()

df1 = df[df['Code'].isin(codes)]
print (df1)
   Seg_ID  Lock_ID  Code
0     111      100     1
1     222      121     2
2     333      341     2
3     444      100     1
4     555      100     1
5     666      341     2
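For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is built by hand from the table in the question, and the name dup_mask is only illustrative:

```python
import pandas as pd

# Sample data copied from the question's Table_Record.
df = pd.DataFrame({
    "Seg_ID":  [111, 222, 333, 444, 555, 666, 777, 888],
    "Lock_ID": [100, 121, 341, 100, 100, 341, 554, 332],
    "Code":    [1, 2, 2, 1, 1, 2, 4, 5],
})

# duplicated() flags the second and later occurrences of each Lock_ID.
dup_mask = df.duplicated(["Lock_ID"])

# Codes that occur on at least one of those repeated rows.
codes = df.loc[dup_mask, "Code"].unique()

# Keep every row whose Code is among those codes.
df1 = df[df["Code"].isin(codes)]
print(df1)
```

Note that filtering by Code also keeps Seg_ID 222, whose Lock_ID (121) is not itself repeated. If only the rows with a repeated Lock_ID are wanted, df[df.duplicated(['Lock_ID'], keep=False)] may be closer to that.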

Then use groupby with f-strings:

for k, v in df1.groupby('Code')['Seg_ID']:
    print (f'Seg_ID {tuple(v)} has Code ({k})')

Seg_ID (111, 444, 555) has Code (1)
Seg_ID (222, 333, 666) has Code (2)

If you want DataFrame-like output, use tuple in apply:

df2 = df1.groupby(['Code'])['Seg_ID'].apply(tuple).reset_index()
print (df2)
   Code           Seg_ID
0     1  (111, 444, 555)
1     2  (222, 333, 666)
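If a plain lookup table is more convenient than a DataFrame, the same grouped result can be turned into a dict. This is just a sketch: df1 is rebuilt by hand from the filtered output above, and seg_by_code is a made-up name:

```python
import pandas as pd

# Filtered rows from the previous step, rebuilt by hand for this sketch.
df1 = pd.DataFrame({
    "Seg_ID":  [111, 222, 333, 444, 555, 666],
    "Lock_ID": [100, 121, 341, 100, 100, 341],
    "Code":    [1, 2, 2, 1, 1, 2],
})

# Collect the Seg_ID values of each Code into a tuple, then map Code -> tuple.
seg_by_code = df1.groupby("Code")["Seg_ID"].apply(tuple).to_dict()

for code, seg_ids in seg_by_code.items():
    print(f"Seg_ID {seg_ids} has Code ({code})")
```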

Answer 1 (score: 0)

Just use groupby. From your code, I understand that you want:

grouped = df.groupby(['Code', 'Lock_ID'])
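
A rough sketch of how that grouped object could be iterated to print the Seg_IDs per (Code, Lock_ID) pair. It assumes the column is named Lock_ID as in the question, and the message format is only an example:

```python
import pandas as pd

# Sample data copied from the question's Table_Record.
df = pd.DataFrame({
    "Seg_ID":  [111, 222, 333, 444, 555, 666, 777, 888],
    "Lock_ID": [100, 121, 341, 100, 100, 341, 554, 332],
    "Code":    [1, 2, 2, 1, 1, 2, 4, 5],
})

grouped = df.groupby(["Code", "Lock_ID"])

# Each iteration yields the (Code, Lock_ID) key and the matching sub-DataFrame.
lines = []
for (code, lock_id), group in grouped:
    seg_ids = tuple(group["Seg_ID"].tolist())
    lines.append(f"Code {int(code)}, Lock_ID {int(lock_id)}: Seg_ID {seg_ids}")

print("\n".join(lines))
```

Unlike the first answer, this does not drop the groups that contain a single row, so a filter on group size would still be needed to show only repeated Lock_IDs.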