Using a function in Pandas to split text - why not to use apply

Time: 2016-05-02 15:26:13

Tags: python pandas

Can this be rewritten as a function?

df2['AB18t'] = df2['AB18'].apply(lambda x: x.split(":")[0])
df2['AB18n'] = df2['AB18'].apply(lambda x: x.split(":")[1]).astype(int)
df2['AB18n'] = np.where(df2['AB18t'] == "Ab", df2['AB18n'] ,-df2['AB18n'])
df2['AB18t'] = np.where(df2['AB18t'] == "Ab", 1 ,0)

My attempt:

def getTextNum(x):
    df2['AB18t'] = df2['AB18'].apply(lambda x: x.split(":")[0])
    df2['AB18n'] = df2['AB18'].apply(lambda x: x.split(":")[1]).astype(int)
    df2['AB18n'] = np.where(df2['AB18t'] == "Ab", df2['AB18n'] ,-df2['AB18n'])
    df2['AB18t'] = np.where(df2['AB18t'] == "Ab", 1 ,0)


df2['AB18'].apply(getTextNum)

EDIT: contents of form 1:

0     Blw:001
1      Ab:008
2      Ab:007
3      Ab:006
4      Ab:005
5      Ab:004
6      Ab:003
7      Ab:002
8      Ab:001
9     Blw:001
10     Ab:001
11    Blw:002
12    Blw:001
13     Ab:001
14    Blw:002
Name: AB18, dtype: object

Form 2:

0     B:Ab:048
1     B:Ab:047
2     B:Ab:046
3     B:Ab:045
4     B:Ab:044
5     B:Ab:043
6     B:Ab:042
7     B:Ab:041
8     B:Ab:040
9     B:Ab:039
10    B:Ab:038
11    B:Ab:037
12    B:Ab:036
13    B:Ab:035
14    B:Ab:034
Name: SLT, dtype: object

2 answers:

Answer 0: (score: 1)

"#N/A"

Answer 1: (score: 1)

For me, str.split and indexing with str work:

print(df2)
    AB18    b  c     d
0   Ab:1  1.0  7  M024
1   Ab:0  2.0  9  M024
2  125:1  5.0  0  M024
3  127:0  7.0  4  M025
4  129:1  NaN  2  M024

def getTextNum(df2):
    ser = df2['AB18'].str.split(":")
    df2['AB18t'] = ser.str[0]
    df2['AB18n'] = ser.str[1].astype(int)
    df2['AB18n'] = np.where(df2['AB18t'] == "Ab", df2['AB18n'] ,-df2['AB18n'])
    df2['AB18t'] = np.where(df2['AB18t'] == "Ab", 1 ,0)
    return df2

print(getTextNum(df2))

    AB18    b  c     d  AB18t  AB18n
0   Ab:1  1.0  7  M024      1      1
1   Ab:0  2.0  9  M024      1      0
2  125:1  5.0  0  M024      0     -1
3  127:0  7.0  4  M025      0      0
4  129:1  NaN  2  M024      0     -1

See the pandas documentation on vectorized string methods.
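
As a rough illustration of the difference, the sketch below contrasts `Series.apply`, which calls the function once per element, with the `.str` accessor, which works on the whole column. The sample Series `s` and the list `seen` are made up for this example and are not part of the question's data.

```python
import pandas as pd

# Hypothetical sample data in the same "text:number" shape as the question.
s = pd.Series(["Ab:008", "Blw:001", "Ab:002"], name="AB18")

# Series.apply calls the function once per element, so x is a single string,
# not the whole column and not the DataFrame.
seen = []
s.apply(lambda x: seen.append(type(x).__name__))
print(seen)

# The .str accessor operates on the whole column at once.
parts = s.str.split(":")          # Series of lists
text = parts.str[0]               # first item of every list
num = parts.str[1].astype(int)    # second item, converted to integers
print(text.tolist())
print(num.tolist())
```

This may explain why the attempt in the question does not behave as hoped: its function receives one string at a time, ignores it, and rewrites the `df2` columns on every call.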

EDIT: You can also write getTextNum so that it takes a column (a Series), e.g. df2['AB18'], as input and returns a new DataFrame:

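def getTextNum(col):
    # split each string on ":" into a list
    ser   = col.str.split(":")
    text  = np.where(ser.str[0] == "Ab", 1 ,0)
    num   = np.where(ser.str[0] == "Ab", ser.str[1].astype(int) ,-ser.str[1].astype(int))
    # keep the input's index so the result aligns when assigned back to df2
    return pd.DataFrame({'Text':text,'Num':num}, columns= ['Text','Num'], index=col.index)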

print(getTextNum(df2['AB18']))
   Text  Num
0     1    1
1     1    0
2     0   -1
3     0    0
4     0   -1

df2[['AB18t', 'AB18n']] = getTextNum(df2['AB18'])
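
One thing that may matter when assigning the result back: pandas aligns on index labels, so the returned DataFrame should carry the same index as the input column. A minimal sketch, assuming a frame whose index is not 0, 1, 2, ...; the name `split_text_num` and the sample frame `df` are hypothetical and not from the original answer.

```python
import numpy as np
import pandas as pd

def split_text_num(col):
    """Hypothetical variant of getTextNum that keeps the input index."""
    parts = col.str.split(":")
    is_ab = parts.str[0] == "Ab"
    num = parts.str[1].astype(int)
    return pd.DataFrame(
        {"Text": np.where(is_ab, 1, 0), "Num": np.where(is_ab, num, -num)},
        index=col.index,  # keep the caller's row labels
    )

# Hypothetical frame with a non-default index.
df = pd.DataFrame({"AB18": ["Ab:1", "125:1", "Ab:0"]}, index=[10, 20, 30])
out = split_text_num(df["AB18"])

# Column-by-column assignment aligns on the shared index labels.
df["AB18t"] = out["Text"]
df["AB18n"] = out["Num"]
print(df)
```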

EDIT1:

A more general solution: index the split lists from the end. The last element is indexed by [-1] and the second-to-last by [-2].

print(df2)
       AB18       SLT
0   Blw:001  B:Ab:048
1    Ab:008  B:Ab:047
2    Ab:007  B:Ab:046
3    Ab:006  B:Ab:045
4    Ab:005  B:Ab:044
5    Ab:004  B:Ab:043
6    Ab:003  B:Ab:042
7    Ab:002  B:Ab:041
8    Ab:001  B:Ab:040
9   Blw:001  B:Ab:039
10   Ab:001  B:Ab:038
11  Blw:002  B:Ab:037
12  Blw:001  B:Ab:036
13   Ab:001  B:Ab:035
14  Blw:002  B:Ab:034

def getTextNum(df, col):
    ser   = df[col].str.split(":")
    # [-2] is the label before the number, [-1] is the number itself
    text  = np.where(ser.str[-2] == "Ab", 1, 0)
    num   = np.where(ser.str[-2] == "Ab", ser.str[-1].astype(int), -ser.str[-1].astype(int))
    # adds the new columns to df in place
    df[col + 't'] = text
    df[col + 'n'] = num
    return df

# parameters - the DataFrame, name of the column in the DataFrame
getTextNum(df2, 'SLT')
print(df2)
       AB18       SLT  SLTt  SLTn
0   Blw:001  B:Ab:048     1    48
1    Ab:008  B:Ab:047     1    47
2    Ab:007  B:Ab:046     1    46
3    Ab:006  B:Ab:045     1    45
4    Ab:005  B:Ab:044     1    44
5    Ab:004  B:Ab:043     1    43
6    Ab:003  B:Ab:042     1    42
7    Ab:002  B:Ab:041     1    41
8    Ab:001  B:Ab:040     1    40
9   Blw:001  B:Ab:039     1    39
10   Ab:001  B:Ab:038     1    38
11  Blw:002  B:Ab:037     1    37
12  Blw:001  B:Ab:036     1    36
13   Ab:001  B:Ab:035     1    35
14  Blw:002  B:Ab:034     1    34
# starting again from the original two-column df2
getTextNum(df2, 'AB18')
print(df2)
       AB18       SLT  AB18t  AB18n
0   Blw:001  B:Ab:048      0     -1
1    Ab:008  B:Ab:047      1      8
2    Ab:007  B:Ab:046      1      7
3    Ab:006  B:Ab:045      1      6
4    Ab:005  B:Ab:044      1      5
5    Ab:004  B:Ab:043      1      4
6    Ab:003  B:Ab:042      1      3
7    Ab:002  B:Ab:041      1      2
8    Ab:001  B:Ab:040      1      1
9   Blw:001  B:Ab:039      0     -1
10   Ab:001  B:Ab:038      1      1
11  Blw:002  B:Ab:037      0     -2
12  Blw:001  B:Ab:036      0     -1
13   Ab:001  B:Ab:035      1      1
14  Blw:002  B:Ab:034      0     -2
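
The idea of counting from the end can be checked on a small standalone example. This is only a sketch: the Series `s` below is made-up data mixing the two formats from the question, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical rows mixing the two-part and three-part formats.
s = pd.Series(["Blw:001", "Ab:008", "B:Ab:048"])

parts = s.str.split(":")
label = parts.str[-2]                # second-to-last item of each list
value = parts.str[-1].astype(int)    # last item, converted to an integer

# Keep the number positive for "Ab" rows and negate it otherwise.
signed = np.where(label == "Ab", value, -value)
print(label.tolist())
print(signed.tolist())
```

Because the indices are negative, the same code handles strings with two or three colon-separated parts without knowing how many leading parts there are.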