Aligning a DataFrame in pandas so that the data matches across rows

Asked: 2016-06-23 21:46:39

Tags: python pandas alignment

I want to align my data.

This is what I have now:

Yan     TNSeq   Kato    Eco-GeneOrth    Essential

accA    accA    accA        accA        accA    
accB    accB    accB        accB        accB    
accC    accC    accC        accC        accC    
accD    accD    accD        accD        accD    
aceF    acpP    acpP        alaS        aceF    
acpP    acpS    acpS        argA        acpP    
acpS    adk     adk         argB        acpS    

What I want is:

Yan     TNSeq   Kato    Eco-GeneOrth    Essential

accA    accA    accA        accA        accA    
accB    accB    accB        accB        accB    
accC    accC    accC        accC        accC    
accD    accD    accD        accD        accD    
aceF    NaN     NaN         NaN         aceF    
acpP    NaN     NaN         acpP        acpP    
NaN     acpS    NaN         NaN         acpS    

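I tried reindex and sorting, but had no luck.

I'm at a loss.

Basically, I want to align or sort the first four columns against the Essential column so that the data in each row matches.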

2 answers:

Answer 0 (score: 1)

UPDATE:

In [120]: df[df.apply(lambda x: x['Essential'] == x, axis=1)]
Out[120]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS

Try this (it compares each row against the value in its first column):

In [86]: df[df.apply(lambda x: x.iloc[0] == x, axis=1)]
Out[86]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS

Data:

In [87]: df
Out[87]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF  acpP  acpP         alaS      aceF
5  acpP  acpS  acpS         argA      acpP
6  acpS   adk   adk         argB      acpS
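
For anyone who wants to reproduce the above, here is a minimal sketch that rebuilds the sample frame from the question and applies the row-wise comparison. The variable names `mask` and `result` are only illustrative; the data is copied from the table above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Yan":          ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
    "TNSeq":        ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Kato":         ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Eco-GeneOrth": ["accA", "accB", "accC", "accD", "alaS", "argA", "argB"],
    "Essential":    ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
})

# Row-wise boolean mask: True where a cell equals that row's 'Essential' value.
mask = df.apply(lambda row: row == row["Essential"], axis=1)

# Indexing with a boolean DataFrame keeps the True cells and
# replaces every other cell with NaN.
result = df[mask]
print(result)
```

Note that this compares values position by position: a gene that appears in another column but on a different row is still replaced with NaN.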

Answer 1 (score: 0)

IIUC, use eq and pass your column with axis=0 to create a boolean mask of the whole df against that column:

In [49]:
df[df.eq(df['Essential'],axis=0)]

Out[49]:
    Yan TNSeq  Kato Eco-GeneOrth Essential
0  accA  accA  accA         accA      accA
1  accB  accB  accB         accB      accB
2  accC  accC  accC         accC      accC
3  accD  accD  accD         accD      accD
4  aceF   NaN   NaN          NaN      aceF
5  acpP   NaN   NaN          NaN      acpP
6  acpS   NaN   NaN          NaN      acpS
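
A self-contained sketch of the same idea follows, together with one possible alternative. The eq mask only matches values that sit on the same row as the Essential entry. If the intent is instead to line genes up by name regardless of their original row (an assumption about the question, not something confirmed in it), isin can be used per column. The names `positional` and `by_value` are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Yan":          ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
    "TNSeq":        ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Kato":         ["accA", "accB", "accC", "accD", "acpP", "acpS", "adk"],
    "Eco-GeneOrth": ["accA", "accB", "accC", "accD", "alaS", "argA", "argB"],
    "Essential":    ["accA", "accB", "accC", "accD", "aceF", "acpP", "acpS"],
})

# Vectorised mask: compare every column with 'Essential', aligned on the row index.
mask = df.eq(df["Essential"], axis=0)
positional = df.where(mask)  # same result as df[mask]

# Assumed alternative: for each column, keep the 'Essential' gene on its row
# only if that gene occurs anywhere in the column; otherwise leave NaN.
essential = df["Essential"]
by_value = pd.DataFrame(
    {col: essential.where(essential.isin(df[col])) for col in df.columns}
)

print(positional)
print(by_value)
```

With the sample data, the second frame keeps acpP and acpS in the TNSeq and Kato columns, because those genes occur there on a different row, whereas the positional mask drops them.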