How to convert string columns to binary columns in pandas

Time: 2015-10-15 08:16:01

Tags: python numpy pandas

Code:

import pandas as pd

# DataFrame.append was removed in pandas 2.0, so build the frame in one step
df = pd.DataFrame(
    [['a', 'b'], ['d', 'c'], ['c', 'd'], ['b', 'a']],
    columns=['home_team', 'away_team'],
)
print(df)

Original DataFrame:

  home_team away_team
0         a         b
1         d         c
2         c         d
3         b         a

I want to convert it to:

   bit0  bit1  bit2  bit3
0     0     0     0     1
1     1     1     1     0
2     1     0     1     1
3     0     1     0     0

"""
a:00
b:01
c:10
d:11
"""
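
For reference, the mapping above is each letter's zero-based position written as a two-bit binary number. A minimal sketch of that encoding, assuming only the four letters a to d appear (the names `letters`, `codes`, and `encode_row` are illustrative and not part of the original post):

```python
# Hypothetical helper names; assumes teams are limited to the letters a-d.
letters = ['a', 'b', 'c', 'd']

# Two-bit binary string for each letter's position: a -> '00', ..., d -> '11'
codes = {ltr: format(i, '02b') for i, ltr in enumerate(letters)}

def encode_row(home, away):
    """Concatenate both two-bit codes and split them into integer bits."""
    return [int(bit) for bit in codes[home] + codes[away]]

print(codes)
print(encode_row('a', 'b'))
```

Here the home team fills bit0 and bit1, and the away team fills bit2 and bit3, which matches the first row of the desired table.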

1 answer:

Answer 0 (score: 0)

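import string
alpha = string.ascii_lowercase

# Map each lowercase letter to its position in the alphabet (a -> 0, b -> 1, ...)
dic_alpha = {ltr: alpha.index(ltr) for ltr in alpha}

# Five bits are enough to encode all 26 letters
to_bin = lambda i: '{0:05b}'.format(i)
# dict.iteritems() exists only in Python 2; use items() in Python 3
dic_alpha_bin = {key: list(to_bin(val)) for key, val in dic_alpha.items()}

lst_c1 = ['bit0','bit1','bit2','bit3','bit4']
lst_c2 = ['bit5','bit6','bit7','bit8','bit9']

# Expand each team letter into five one-character bit columns
df[lst_c1] = df['home_team'].apply(lambda x: pd.Series(dic_alpha_bin[x]))
df[lst_c2] = df['away_team'].apply(lambda x: pd.Series(dic_alpha_bin[x]))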
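
Note that the code above yields five bits per team column (ten columns in total), stored as the strings '0' and '1', whereas the question asks for two bits per team. One possible variant that reproduces the requested four-column integer output is sketched below. It assumes only the letters a to d occur; `n_bits`, `encode`, and `result` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'home_team': ['a', 'd', 'c', 'b'],
                   'away_team': ['b', 'c', 'd', 'a']})

n_bits = 2  # assumption: four distinct teams (a-d), so two bits suffice

def encode(col):
    """Return an (n_rows, n_bits) integer array, most significant bit first."""
    # Letter position relative to 'a': a -> 0, b -> 1, c -> 2, d -> 3
    idx = col.map(lambda s: ord(s) - ord('a')).to_numpy()
    # Shift amounts from the highest bit down to bit zero, e.g. [1, 0]
    shifts = np.arange(n_bits - 1, -1, -1)
    # Broadcasting shifts every index by every amount, then masks the low bit
    return (idx[:, None] >> shifts) & 1

bits = np.hstack([encode(df['home_team']), encode(df['away_team'])])
result = pd.DataFrame(bits,
                      columns=['bit%d' % i for i in range(bits.shape[1])],
                      index=df.index)
print(result)
```

Raising `n_bits` to 5 would cover all 26 lowercase letters, at the cost of ten output columns, as in the answer above.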