Replace a character in a column's values with the string from another column in Pandas

Time: 2017-12-07 13:39:16

Tags: python pandas dataframe

name value_1
dd3   what, _ is
dd4   what, _ is

How can I replace the '_' in the value_1 column with the whole string from the name column?

Desired output for the value_1 column:

value_1
what, dd3 is
what, dd4 is

I tried this:

df['value_1'] = df['value_1'].apply(lambda x:x.replace("_", df['name']))

I got this error: expected a string or other character buffer object

2 answers:

Answer 0 (score: 3)

Use apply with axis=1 to process the DataFrame row by row:

df['value_1'] = df.apply(lambda x:x['value_1'].replace("_", x['name']), axis=1)
print (df)
  name       value_1
0  dd3  what, dd3 is
1  dd4  what, dd4 is
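
As a self-contained sketch (the DataFrame construction is assumed from the sample in the question): the original attempt fails because df['name'] is an entire Series, while Python's str.replace expects a single string as the replacement. Working row by row pairs each value with the name from the same row. The list comprehension over zip is only a possible alternative and is not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "name": ["dd3", "dd4"],
    "value_1": ["what, _ is", "what, _ is"],
})

# Row-wise apply: x is one row, so x["name"] is a plain string
row_wise = df.apply(lambda x: x["value_1"].replace("_", x["name"]), axis=1)

# Possible alternative: pair the two columns with zip and use plain str.replace
zipped = [v.replace("_", n) for v, n in zip(df["value_1"], df["name"])]

df["value_1"] = row_wise
print(df)
```

Both variants call Python's str.replace once per row, so they should produce the same values for this data.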

Answer 1 (score: 2)

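UPDATE: similar to @jezrael's solution, but it should be a bit faster on larger datasets (the replacement is vectorized within each name group):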

In [221]: df['value_1'] = (df.groupby('name')['value_1']
                             .transform(lambda x: x.str.replace('_', x.name)))

In [222]: df
Out[222]:
  name       value_1
0  dd3  what, dd3 is
1  dd4  what, dd4 is
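
A minimal runnable sketch of the same idea, assuming sample data with a repeated name (the third row is hypothetical and not from the question). Inside transform, x.name holds the group key, so each group needs only one str.replace call. Passing regex=False is an addition here, so that '_' is treated literally regardless of the pandas version's default.

```python
import pandas as pd

# Assumed sample data; the third row is hypothetical, added to show a repeated name
df = pd.DataFrame({
    "name": ["dd3", "dd4", "dd3"],
    "value_1": ["what, _ is", "what, _ is", "_ again"],
})

# For each group, x is the Series of value_1 entries and x.name is the group key
df["value_1"] = (
    df.groupby("name")["value_1"]
      .transform(lambda x: x.str.replace("_", x.name, regex=False))
)
print(df)
```

Because the lambda runs once per distinct name, any speed-up over row-wise apply presumably depends on how often names repeat.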

Old answer:

You can create a helper DF:

In [181]: x = df.value_1.str.split('_', expand=True)

In [192]: x
Out[192]:
        0    1
0  what,    is
1  what,    is

Then insert a new column into it:

In [182]: x.insert(1, 'name', df['name'])

Which yields:

In [194]: x
Out[194]:
        0 name    1
0  what,   dd3   is
1  what,   dd4   is

And replace the original column:

In [183]: df['value_1'] = x.sum(1)

In [184]: df
Out[184]:
  name       value_1
0  dd3  what, dd3 is
1  dd4  what, dd4 is
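
The split-and-rejoin steps can also be sketched as one short script. This variant is an assumption-laden rewrite rather than the exact code above: it splits on the first '_' only (n=1) and joins the pieces with explicit string concatenation instead of x.sum(1), and it assumes every value contains exactly one '_'.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "name": ["dd3", "dd4"],
    "value_1": ["what, _ is", "what, _ is"],
})

# Split each value into the text before and after the first underscore
parts = df["value_1"].str.split("_", n=1, expand=True)

# Rejoin with the name in between; a value without '_' would yield a missing result
df["value_1"] = parts[0] + df["name"] + parts[1]
print(df)
```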