Split row values into new columns

Date: 2017-03-18 21:38:55

Tags: python pandas dataframe jupyter-notebook

I have this pandas DataFrame output, with the column names in the top row:

Date, Team1, Team2, Map, Event
17/3/17, Misfits 16, Cloud9 4, overpass, Pro League
17/3/17, TyLoo 16, Born Of Fire 4, cache, Pro League
17/3/17, Liquid 8, Renegades 16, cbble, Proleague
17/3/17, Earnix 16, Blight 7, overpass, Proleague
17/3/17, Selfless 12, Rush 16, inferno, Proleague

My goal is to get this output:

Date, Team1, Team2, Team1 Score, Team2 Score, Map, Event
17/3/17, Misfits, Cloud9, 16, 4, overpass, Pro League
17/3/17, TyLoo, Born Of Fire, 16, 4, cache, Pro League
17/3/17, Liquid, Renegades, 8, 16, cbble, Proleague
17/3/17, Earnix, Blight, 16, 7, overpass, Proleague
17/3/17, Selfless, Rush, 12, 16, inferno, Proleague

How can I move the score values out of the "Team1" and "Team2" columns into brand-new columns named "Team1 Score" and "Team2 Score"?

3 answers:

Answer 0 (score: 2)

You can use concat with rsplit:

#get only Team columns
df1 = df.filter(like='Team')
cols = df1.columns.tolist()
df1 = pd.concat([df1[x].str.rsplit(n=1, expand=True) for x in df1], axis=1, keys=df1.columns)
df1 = df1.sort_index(level=1, axis=1).rename(columns={0: '', 1: ' Score'})
cols2 = df1.columns.map(''.join).tolist()
df1.columns = cols2
print (df1)
      Team1         Team2 Team1 Score Team2 Score
0   Misfits        Cloud9          16           4
1     TyLoo  Born Of Fire          16           4
2    Liquid     Renegades           8          16
3    Earnix        Blight          16           7
4  Selfless          Rush          12          16

#add to original df
df2 = pd.concat([df.drop(cols, axis=1), df1], axis=1)
#change order of columns
df2 = df2.reindex(columns=df.columns[:1].tolist() + cols2 + df.columns[-2:].tolist())
print (df2)
      Date     Team1         Team2 Team1 Score Team2 Score       Map  \
0  17/3/17   Misfits        Cloud9          16           4  overpass   
1  17/3/17     TyLoo  Born Of Fire          16           4     cache   
2  17/3/17    Liquid     Renegades           8          16     cbble   
3  17/3/17    Earnix        Blight          16           7  overpass   
4  17/3/17  Selfless          Rush          12          16   inferno   

        Event  
0  Pro League  
1  Pro League  
2   Proleague  
3   Proleague  
4   Proleague
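
The key step above is `str.rsplit(n=1, expand=True)`. As a minimal sketch of what it does on a single column (the Series `teams` and the column names `Team` and `Score` are made up for illustration, using values from the question):

```python
import pandas as pd

# Hypothetical one-column example; the values mirror the question's data.
teams = pd.Series(["Misfits 16", "Born Of Fire 4", "Rush 16"])

# n=1 splits only once, starting from the right, so multi-word
# team names such as "Born Of Fire" stay in one piece.
parts = teams.str.rsplit(n=1, expand=True)
parts.columns = ["Team", "Score"]

# The split pieces are strings; convert the scores if numbers are needed.
parts["Score"] = parts["Score"].astype(int)
print(parts)
```

Splitting from the right matters here: a plain `str.split(n=1)` would cut "Born Of Fire 4" after "Born".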

Another solution:

orig_cols = df.columns.tolist()
df1 = df.filter(like='Team')
new_cols = []
for col in df1:
    df[[col, col + ' Score']] = df1[col].str.rsplit(n=1, expand=True)
    new_cols.append(col + ' Score')
print (df)
      Date     Team1         Team2       Map       Event Team1 Score  \
0  17/3/17   Misfits        Cloud9  overpass  Pro League          16   
1  17/3/17     TyLoo  Born Of Fire     cache  Pro League          16   
2  17/3/17    Liquid     Renegades     cbble   Proleague           8   
3  17/3/17    Earnix        Blight  overpass   Proleague          16   
4  17/3/17  Selfless          Rush   inferno   Proleague          12   

  Team2 Score  
0           4  
1           4  
2          16  
3           7  
4          16 
#get position of last column in df1
splitted = df.columns.get_loc(df1.columns[-1]) + 1
#change order
df2 = df.reindex(columns=orig_cols[:splitted] + new_cols + orig_cols[splitted:])
print (df2)
      Date     Team1         Team2 Team1 Score Team2 Score       Map  \
0  17/3/17   Misfits        Cloud9          16           4  overpass   
1  17/3/17     TyLoo  Born Of Fire          16           4     cache   
2  17/3/17    Liquid     Renegades           8          16     cbble   
3  17/3/17    Earnix        Blight          16           7  overpass   
4  17/3/17  Selfless          Rush          12          16   inferno   

        Event  
0  Pro League  
1  Pro League  
2   Proleague  
3   Proleague  
4   Proleague  
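
If the separate reordering step feels heavy, one possible variation is to place each score column at its final position with `DataFrame.insert`. This is only a sketch on a two-row copy of the question's data. The names `team_cols` and `pos` are illustrative, and the conversion to `int` is an added assumption that every score is a whole number.

```python
import pandas as pd

# Small hypothetical frame with the same layout as the question's data.
df = pd.DataFrame({
    "Date": ["17/3/17", "17/3/17"],
    "Team1": ["Misfits 16", "TyLoo 16"],
    "Team2": ["Cloud9 4", "Born Of Fire 4"],
    "Map": ["overpass", "cache"],
    "Event": ["Pro League", "Pro League"],
})

# Collect the team columns before any score column exists.
team_cols = df.filter(like="Team").columns.tolist()
# Position just after the last team column.
pos = df.columns.get_loc(team_cols[-1]) + 1

for offset, col in enumerate(team_cols):
    parts = df[col].str.rsplit(n=1, expand=True)
    df[col] = parts[0]  # keep only the team name
    # Insert the score column in place, so no reindexing is needed later.
    df.insert(pos + offset, col + " Score", parts[1].astype(int))

print(df)
```

Note that `insert` modifies `df` in place, so work on a copy if the original frame is still needed.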

Answer 1 (score: 2)

s = df.set_index(['Date', 'Map', 'Event']).stack()
d = s.str.extract(
    r'(.*)\s+(\S+)', expand=True
).rename(columns={0: '', 1: ' Score'}).unstack()
d.columns = d.columns.map('{0[1]}{0[0]}'.format)
d.reset_index()
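
To see what the regular expression captures, here is a small sketch on a standalone Series (the name `teams` is illustrative and the values come from the question). The greedy `(.*)` takes everything up to the last run of whitespace, and `(\S+)` takes the final token.

```python
import pandas as pd

# Hypothetical sample values taken from the question's data.
teams = pd.Series(["Liquid 8", "Born Of Fire 4"])

# Group 0: team name (may contain spaces); group 1: trailing score token.
parts = teams.str.extract(r"(.*)\s+(\S+)", expand=True)
print(parts)
```

The two unnamed groups become columns `0` and `1`, which is why the answer renames them to `''` and `' Score'` before unstacking.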


Answer 2 (score: 0)

import pandas as pd
from io import StringIO

data = pd.read_csv(StringIO("""
Date, Team1, Team2, Map, Event
17/3/17, Misfits 16, Cloud9 4, overpass, Pro League
17/3/17, TyLoo 16, Born Of Fire 4, cache, Pro League
17/3/17, Liquid 8, Renegades 16, cbble, Proleague
17/3/17, Earnix 16, Blight 7, overpass, Proleague
17/3/17, Selfless 12, Rush 16, inferno, Proleague
"""
), skipinitialspace=True)

newcols = [pd.DataFrame(data["Team" + i].str.split().apply(
    lambda x: pd.Series([" ".join(x[:-1]), x[-1]])).values,
                        columns=['Team' + i, 'Team' + i + ' Score'])
           for i in ['1', '2']]
pd.concat([data[['Date', 'Map', 'Event']]] + newcols, axis=1)