Merging column values within the same DataFrame in pandas

Time: 2020-05-24 17:37:06

Tags: python pandas dataframe

Hello, I have a dataframe such as:

  >>> tab
   COL1                     COL2 COL3 COL4    COL5
0    G1        S_-__1Canis_lupus    A    B    SEQ1
1    G1          S_+__2Elpah_bis    C    D  SEQ4.1
2    G1       S_-__3Felis_cattus  NaN  NaN  SEQA.B
3    G1       S_-__4Felis_cattus  NaN  NaN  SEQA.B
4    G1  S-BICs_-__5Felis_cattus    E    F  SEQA.A
5    G1       S_+__6Felis_cattus  NaN  NaN  SEQA.A
6    G1       S_-__7Felis_cattus  NaN  NaN  SEQA.A
7    G1  S-BICs_-__8Felis_cattus    L    P  SEQA.B
8    G1       S_-__9Felis_cattus    K    L  SEQA.A
9    G2      S_+__10Felis_cattus    M    N  SEQA.A
10   G2       S_-__11Lupus_lupus  NaN  NaN    SEQ3

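The idea is, within each COL1 group, to look at the rows whose COL2 contains the pattern -BICs.

Then fill every NaN in COL3 and COL4 with the values from the row that contains the pattern and has the same COL5 value.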

Example:

Row 4, S-BICs_-__5Felis_cattus, contains the -BICs pattern, and its COL5 = SEQA.A.

S_+__6Felis_cattus and S_-__7Felis_cattus are in G1, have NaN in COL3 and COL4, and have the same COL5 value. So I fill their COL3 and COL4 with the same values as S-BICs_-__5Felis_cattus, E and F:

   >>> tab
   COL1                     COL2 COL3 COL4    COL5
0    G1        S_-__1Canis_lupus    A    B    SEQ1
1    G1          S_+__2Elpah_bis    C    D  SEQ4.1
2    G1       S_-__3Felis_cattus  NaN  NaN  SEQA.B
3    G1       S_-__4Felis_cattus  NaN  NaN  SEQA.B
4    G1  S-BICs_-__5Felis_cattus    E    F  SEQA.A
5    G1       S_+__6Felis_cattus    E    F  SEQA.A
6    G1       S_-__7Felis_cattus    E    F  SEQA.A
7    G1  S-BICs_-__8Felis_cattus    L    P  SEQA.B
8    G1       S_-__9Felis_cattus    K    L  SEQA.A
9    G2      S_+__10Felis_cattus    M    N  SEQA.A
10   G2       S_-__11Lupus_lupus  NaN  NaN    SEQ3 
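For anyone who wants to try this locally, here is a minimal sketch that rebuilds the input frame shown above and marks the two kinds of rows the question talks about. The names `has_pattern` and `needs_fill` are only illustrative, and `regex=False` is an assumption that -BICs should be matched as a literal substring.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
tab = pd.DataFrame({
    "COL1": ["G1"] * 9 + ["G2"] * 2,
    "COL2": [
        "S_-__1Canis_lupus", "S_+__2Elpah_bis", "S_-__3Felis_cattus",
        "S_-__4Felis_cattus", "S-BICs_-__5Felis_cattus", "S_+__6Felis_cattus",
        "S_-__7Felis_cattus", "S-BICs_-__8Felis_cattus", "S_-__9Felis_cattus",
        "S_+__10Felis_cattus", "S_-__11Lupus_lupus",
    ],
    "COL3": ["A", "C", np.nan, np.nan, "E", np.nan, np.nan, "L", "K", "M", np.nan],
    "COL4": ["B", "D", np.nan, np.nan, "F", np.nan, np.nan, "P", "L", "N", np.nan],
    "COL5": ["SEQ1", "SEQ4.1", "SEQA.B", "SEQA.B", "SEQA.A", "SEQA.A",
             "SEQA.A", "SEQA.B", "SEQA.A", "SEQA.A", "SEQ3"],
})

# Rows that carry the pattern (the rows whose values should be copied).
has_pattern = tab["COL2"].str.contains("-BICs", regex=False)

# Rows where both COL3 and COL4 are missing (the rows that may be filled).
needs_fill = tab[["COL3", "COL4"]].isna().all(axis=1)

print(tab[has_pattern])
print(tab[needs_fill])
```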

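2 answers:

Answer 0 (score: 2)

You can do this by using where with str.contains on COL2 to replace every row that does not contain the pattern with NaN. Then use groupby.transform on COL1 and COL5 with first to pick the first non-NaN value in each group (if there is one). Finally, fillna the original data with the result: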

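# Keep COL3/COL4 only on rows whose COL2 contains '-BICs' (other rows become NaN),
# take the first non-NaN value per (COL1, COL5) group, and use it to fill the gaps.
tab[['COL3','COL4']] = (tab[['COL3','COL4']]
                           .fillna(tab[['COL3','COL4']]
                                      .where(tab['COL2'].str.contains('-BICs'))
                                      .groupby([tab['COL1'], tab['COL5']])
                                      .transform('first'))
                       )
print(tab)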
   COL1                     COL2 COL3 COL4    COL5
0    G1        S_-__1Canis_lupus    A    B    SEQ1
1    G1          S_+__2Elpah_bis    C    D  SEQ4.1
2    G1       S_-__3Felis_cattus    L    P  SEQA.B
3    G1       S_-__4Felis_cattus    L    P  SEQA.B
4    G1  S-BICs_-__5Felis_cattus    E    F  SEQA.A
5    G1       S_+__6Felis_cattus    E    F  SEQA.A
6    G1       S_-__7Felis_cattus    E    F  SEQA.A
7    G1  S-BICs_-__8Felis_cattus    L    P  SEQA.B
8    G1       S_-__9Felis_cattus    K    L  SEQA.A
9    G2      S_+__10Felis_cattus    M    N  SEQA.A
10   G2       S_-__11Lupus_lupus  NaN  NaN    SEQ3
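To see what each step of the chained expression contributes, it can help to split it into named intermediates. This is only a sketch of the same idea, not a different method: the variable names (`cols`, `has_pattern`, `donors`, `per_group`, `filled`) are made up for illustration, and `regex=False` is an assumption that the pattern is a literal substring.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
tab = pd.DataFrame({
    "COL1": ["G1"] * 9 + ["G2"] * 2,
    "COL2": [
        "S_-__1Canis_lupus", "S_+__2Elpah_bis", "S_-__3Felis_cattus",
        "S_-__4Felis_cattus", "S-BICs_-__5Felis_cattus", "S_+__6Felis_cattus",
        "S_-__7Felis_cattus", "S-BICs_-__8Felis_cattus", "S_-__9Felis_cattus",
        "S_+__10Felis_cattus", "S_-__11Lupus_lupus",
    ],
    "COL3": ["A", "C", np.nan, np.nan, "E", np.nan, np.nan, "L", "K", "M", np.nan],
    "COL4": ["B", "D", np.nan, np.nan, "F", np.nan, np.nan, "P", "L", "N", np.nan],
    "COL5": ["SEQ1", "SEQ4.1", "SEQA.B", "SEQA.B", "SEQA.A", "SEQA.A",
             "SEQA.A", "SEQA.B", "SEQA.A", "SEQA.A", "SEQ3"],
})
cols = ["COL3", "COL4"]

# Step 1: boolean mask of the rows whose COL2 contains the pattern.
has_pattern = tab["COL2"].str.contains("-BICs", regex=False)

# Step 2: keep COL3/COL4 only on those rows; every other row becomes NaN.
donors = tab[cols].where(has_pattern)

# Step 3: broadcast the first non-NaN donor value to every row of the
# same (COL1, COL5) group; groups without a donor stay NaN.
per_group = donors.groupby([tab["COL1"], tab["COL5"]]).transform("first")

# Step 4: fill only the missing cells of the original columns.
filled = tab[cols].fillna(per_group)

print(tab.assign(**{c: filled[c] for c in cols}))
```

Because `fillna` only touches cells that are already NaN, rows such as S_-__9Felis_cattus keep their own K and L even though they share a group with a -BICs row.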

Answer 1 (score: 0)

If I understand correctly, how about this?

from discord.ext import commands

# Listener method; it belongs inside a commands.Cog subclass.
@commands.Cog.listener()
async def on_message_delete(self, c):
    # Ignore messages without a guild (e.g. DMs) and guilds other than these two.
    if c.guild and c.guild.name in ("Sniper's lounge", "New GAR"):
        # Label the log line by whether the author is a bot or a user.
        kind = "Bot" if c.author.bot else "User"
        print(f"{kind}: {c.author} deleted --- {c.clean_content} --- in #{c.channel.name}")

I haven't tried it, but the idea is to do something along these lines.