pandas str.split() function

Time: 2018-06-22 08:19:35

Tags: python-3.x pandas

Say I have a DataFrame of names, where some names are two words and some are one word:

 Team A
 1   Zeus Odin John Wick Jason Bourne Loki
 2   

I would like to get this result:
Team A Hero 1    Team A Hero 2    Team A Hero 3   Team A Hero 4   Team A Hero 5
    Zeus             Odin           John Wick     Jason Bourne    Loki

How can I use the pandas str.split() function with a regular expression to do this?

1 Answer:

Answer 0: (score: 1)

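One approach is to temporarily replace each hero name that contains a space with the same name without the space, split on whitespace with str.split(), and then reverse the replacement: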

import re
import pandas as pd

# map each hero name that contains a space to the same name without the space
dict_hero = {hero: hero.replace(' ', '') for hero in HeroList if ' ' in hero}
# create the inverse of the previous dictionary, several ways but I choose this one
dict_hero_rev = {hero.replace(' ', ''): hero for hero in HeroList if ' ' in hero}
# pattern matching any of the dict_hero keys (escaped in case a name contains regex metacharacters)
pat = re.compile('|'.join(re.escape(hero) for hero in dict_hero))
# replacement function: look up the space-free version of the matched name
repl = lambda x: dict_hero[x.group()]
# work on the column Team A
(df['Team A'].str.replace(pat, repl, regex=True)  # names with a space -> without; regex=True is required in pandas >= 2.0
             .str.split(' ', expand=True)  # split on whitespace and expand to columns
             .replace(dict_hero_rev)  # restore the space in the names that lost it
             .rename(columns={nb: 'Team A Hero {}'.format(nb+1) for nb in range(5)}))

With a data frame such as

df = pd.DataFrame({'Team A':['Zeus Odin John Wick Jason Bourne Loki',
                             'Hulk Thor Green Lantern Batman Captain America']})

                                           Team A
0           Zeus Odin John Wick Jason Bourne Loki
1  Hulk Thor Green Lantern Batman Captain America

and the list of heroes

HeroList = ['Green Lantern', 'Thor', 'Hulk', 'Odin', 'Batman', 
              'Jason Bourne', 'Loki', 'John Wick', 'Zeus', 'Captain America']

the method above gives you

  Team A Hero 1 Team A Hero 2  Team A Hero 3 Team A Hero 4    Team A Hero 5
0          Zeus          Odin      John Wick  Jason Bourne             Loki
1          Hulk          Thor  Green Lantern        Batman  Captain America
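
If the full list of hero names is known, the names can also be matched directly with a regular expression instead of splitting on whitespace. The following is only a sketch of that alternative, not part of the method above: it assumes the same df and HeroList as in the example, and the names names, pattern, matches and result are chosen here for illustration. It uses Series.str.findall with an alternation of the escaped names, longest first, so that a longer name is tried before any shorter name it might contain.

```python
import re
import pandas as pd

df = pd.DataFrame({'Team A': ['Zeus Odin John Wick Jason Bourne Loki',
                              'Hulk Thor Green Lantern Batman Captain America']})
HeroList = ['Green Lantern', 'Thor', 'Hulk', 'Odin', 'Batman',
            'Jason Bourne', 'Loki', 'John Wick', 'Zeus', 'Captain America']

# longest names first, so a longer name is tried before a shorter one it contains
names = sorted(HeroList, key=len, reverse=True)
pattern = '|'.join(re.escape(name) for name in names)

# findall returns, for each row, the list of hero names in order of appearance
matches = df['Team A'].str.findall(pattern)

# one column per matched name; rows with fewer matches would be padded with missing values
result = pd.DataFrame(matches.tolist(), index=df.index)
result.columns = ['Team A Hero {}'.format(i + 1) for i in result.columns]
print(result)
```

This version needs no temporary replacement step, but any text in the column that is not in HeroList is silently skipped, so it only fits when the list is complete.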