Pandas: duplicating rows and slicing strings

Time: 2019-10-24 08:11:03

Tags: python string pandas duplicates

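I am trying to create duplicate rows in a DataFrame based on a condition: each row whose student column lists several names should be repeated once per name.

For example, I have this DataFrame.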

team    student
 a      Ursula
 b      Hayfa, Martin
 c      Kato
 d      Tanek, Ava, Pyto
 e      Aiko
 f      Hunter
 g      Josiah, Derek, Uma, Nell

Desired output:

  team  student                   name      remark
   a    Ursula                    Ursula    
   b    Hayfa, Martin             Hayfa     with Martin
   b    Hayfa, Martin             Martin    with Hayfa
   c    Kato                      Kato            
   d    Tanek, Ava, Pyto          Tanek     with Ava, Pyto
   d    Tanek, Ava, Pyto          Ava       with Tanek, Pyto
   d    Tanek, Ava, Pyto          Pyto      with Tanek, Ava
   e    Aiko                      Aiko            
   f    Hunter                    Hunter    
   g    Josiah, Derek, Uma, Nell  Josiah    with Derek, Uma, Nell
   g    Josiah, Derek, Uma, Nell  Derek     with Josiah, Uma, Nell
   g    Josiah, Derek, Uma, Nell  Uma       with Josiah, Derek, Nell
   g    Josiah, Derek, Uma, Nell  Nell      with Josiah, Derek, Uma

1 answer:

Answer 0 (score: 0)

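For pandas 0.25+, you can combine Series.str.split with DataFrame.explode to get one row per student, then build the remark column with a list comprehension that filters the current name out of each team's list: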

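# Split each comma-separated string into a list of names.
s = df['student'].str.split(', ')
# Keep the full list in 'remark', then explode 'name' into one row per student.
df = df.assign(name=s, remark=s).explode('name').reset_index(drop=True)
# For teams with more than one member, list everyone except the current student.
df['remark'] = ['with ' + ', '.join(x for x in b if x != a)
                if len(b) > 1
                else ''
                for a, b in zip(df['name'], df['remark'])]
print(df)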
   team                   student    name                    remark
0     a                    Ursula  Ursula                          
1     b             Hayfa, Martin   Hayfa               with Martin
2     b             Hayfa, Martin  Martin                with Hayfa
3     c                      Kato    Kato                          
4     d          Tanek, Ava, Pyto   Tanek            with Ava, Pyto
5     d          Tanek, Ava, Pyto     Ava          with Tanek, Pyto
6     d          Tanek, Ava, Pyto    Pyto           with Tanek, Ava
7     e                      Aiko    Aiko                          
8     f                    Hunter  Hunter                          
9     g  Josiah, Derek, Uma, Nell  Josiah     with Derek, Uma, Nell
10    g  Josiah, Derek, Uma, Nell   Derek    with Josiah, Uma, Nell
11    g  Josiah, Derek, Uma, Nell     Uma  with Josiah, Derek, Nell
12    g  Josiah, Derek, Uma, Nell    Nell   with Josiah, Derek, Uma
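
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the sample data from the question and wraps the steps in a helper. The function name `expand_students` and the variable names are made up for illustration and are not part of the original answer. It assumes pandas 0.25 or newer (for `DataFrame.explode`) and that names are separated by exactly a comma followed by one space.

```python
import pandas as pd


def expand_students(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per student, with 'name' and 'remark' columns added.

    Hypothetical helper: assumes a 'student' column of ', '-separated names.
    """
    # Split each string into a list of names.
    names = df["student"].str.split(", ")
    # Store the list twice: 'name' gets exploded, 'remark' keeps the full list.
    out = (
        df.assign(name=names, remark=names)
        .explode("name")
        .reset_index(drop=True)
    )
    # Replace each list in 'remark' with "with <everyone else>", or '' for solo teams.
    out["remark"] = [
        "with " + ", ".join(x for x in group if x != name) if len(group) > 1 else ""
        for name, group in zip(out["name"], out["remark"])
    ]
    return out


# Sample data copied from the question.
df = pd.DataFrame(
    {
        "team": list("abcdefg"),
        "student": [
            "Ursula",
            "Hayfa, Martin",
            "Kato",
            "Tanek, Ava, Pyto",
            "Aiko",
            "Hunter",
            "Josiah, Derek, Uma, Nell",
        ],
    }
)

result = expand_students(df)
print(result)
```

One thing to be aware of: the filter `x != name` drops every entry equal to the current name, so if a team ever listed the same name twice, both copies would disappear from that student's remark.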