Question

From this pandas DataFrame:

df = pd.DataFrame({'a': ['foo_abc', 'bar_def', 'ghi'], 'b': ['foo', 'bar', 'yah']})

         a    b
0  foo_abc  foo
1  bar_def  bar
2      ghi  yah

I want to use a regular expression to remove the string in column b from the string in column a.

How can I do this with pandas?

Answer 1

Use replace together with strip in a list comprehension:

df['c'] = [a.replace(b, '').strip('_') for a, b in zip(df['a'], df['b'])]
print (df)
         a    b    c
0  foo_abc  foo  abc
1  bar_def  bar  def
2      ghi  yah  ghi

A solution using re.sub:

import re

df['c'] = [re.sub('^({}_)'.format(b), '', a) for a, b in zip(df['a'], df['b'])]
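
If the values in column b can contain characters that are special in a regular expression (such as '.' or '+'), the pattern would probably need escaping. Below is a minimal sketch of that idea, assuming the same sample data as in the question; remove_prefix is a hypothetical helper name, not part of pandas. It also shows that the two approaches can give different results when the b string occurs more than once in a.

```python
import re
import pandas as pd

df = pd.DataFrame({'a': ['foo_abc', 'bar_def', 'ghi'], 'b': ['foo', 'bar', 'yah']})

def remove_prefix(a, b):
    # Hypothetical helper: drop b plus one underscore, only at the start of a.
    # re.escape makes characters such as '.' or '+' in b match literally.
    return re.sub('^{}_'.format(re.escape(b)), '', a)

df['c'] = [remove_prefix(a, b) for a, b in zip(df['a'], df['b'])]
print(df)

# The two approaches differ when b occurs more than once in a:
print('foo_abc_foo'.replace('foo', '').strip('_'))  # abc
print(remove_prefix('foo_abc_foo', 'foo'))          # abc_foo
```

The anchored pattern only removes a leading match, while str.replace removes every occurrence, so which one fits depends on the data.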
