Pandas DataFrame: how to replace a single column with multiple columns

Time: 2020-07-25 14:49:53

Tags: python pandas tensorflow

For example, I have a DataFrame like this:

col1 col2 col3
0    2    1

I want to replace its values using this mapping:

{0: [a,b], 1: [c,d], 2: [e, f]}

so that I end up with a DataFrame like this:

col1 col1b col2 col2b col3 col3b
a    b     e    f     c    d

I want to feed this data into TensorFlow after the conversion, so if TensorFlow can accept data in the following form, this output would also be fine:

col1  col2  col3
[a,b] [e,f] [c,d]

Here is my current code:

import pandas as pd

# champ_dict is assumed to be defined elsewhere; it maps each value in c1..c5 to its replacement
field_names = ["elo", "map", "c1", "c2", "c3", "c4", "c5", "e1", "e2", "e3", "e4", "e5", "result"]
df_train = pd.read_csv('input/match_results.csv', names=field_names, skiprows=1, usecols=range(2, 13))

for count in range(1, 6):
    col = f'c{count}'
    df_train[col] = df_train[col].map(champ_dict)

1 answer:

Answer 0: (score: 1)

IIUC, you can use .stack, .map, .explode and .cumcount to reshape your DataFrame and its index.

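import pandas as pd
from string import ascii_lowercase

# Sample frame from the question
df = pd.DataFrame({'col1': [0], 'col2': [2], 'col3': [1]})

col_dict = dict(enumerate(ascii_lowercase))  # 0 -> 'a', 1 -> 'b', ...
map_dict = {0: ['a','b'], 1: ['c','d'], 2: ['e', 'f']}

# Long format: one row per (row, column, list element)
s = df.stack().map(map_dict).explode().reset_index()
# Append a letter suffix per list element: col1 -> col1a, col1b
s['level_1'] = s['level_1'] + s.groupby(['level_1','level_0']).cumcount().map(col_dict)

# Pivot back to wide format and drop the leftover column level
df_new = s.set_index(['level_0','level_1']).unstack(1).droplevel(0, axis=1).reset_index(drop=True)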

print(df_new)

level_1  col1a col1b col2a col2b col3a col3b
0           a     b     e     f     c     d
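
For comparison, here is a rough sketch of a different route to the same wide layout: it expands one column at a time instead of using stack/explode. This is not the method from the answer above. The helper name expand_columns, its suffixes parameter, and the two-row sample frame are made up for illustration, and it assumes every value appears in map_dict and that all mapped lists have the same length.

```python
import pandas as pd

# Hypothetical sample data: two rows instead of the single row in the question
df = pd.DataFrame({'col1': [0, 1], 'col2': [2, 0], 'col3': [1, 2]})
map_dict = {0: ['a', 'b'], 1: ['c', 'd'], 2: ['e', 'f']}


def expand_columns(frame, mapping, suffixes=('a', 'b')):
    """Replace each column with one new column per element of its mapped list."""
    parts = []
    for col in frame.columns:
        # Map each value to its list, then spread the list elements over new columns
        expanded = pd.DataFrame(frame[col].map(mapping).tolist(), index=frame.index)
        expanded.columns = [f'{col}{suffix}' for suffix in suffixes[:expanded.shape[1]]]
        parts.append(expanded)
    # Put the per-column pieces side by side in the original column order
    return pd.concat(parts, axis=1)


df_wide = expand_columns(df, map_dict)
print(df_wide)
```

On the TensorFlow part of the question: a column that holds Python lists is stored with object dtype, so the wide layout (one scalar per cell) is usually the easier one to convert, for example via df_wide.to_numpy() once the mapped values are numeric.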