How to replace values in pandas using a regular expression

Time: 2017-08-30 18:49:44

Tags: python regex pandas

I have a DataFrame and want to replace some entries based on a regular expression match. Here is a toy example:

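import numpy as np  # needed for np.random.randn below
import pandas as pd
# 5x4 frame of random floats with columns A, B, C, D
dfl = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))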

          A         B         C         D
0  0.995647 -0.507860  0.246656  0.400589
1 -0.149536 -0.485617 -0.132031  0.214816
2 -0.730974 -0.932630  0.625197  1.887758
3  2.812800  0.329197  0.233513  0.140899
4 -1.897268  0.072307  0.790148  0.096455

Now let's convert all entries to strings.

dfl = dfl.astype(str)

Now I want to replace every number that contains 40 with, say, the word boat.

I tried:

dfl = dfl.replace(r'.*40.*', "boat") 

But this does not modify dfl at all.

What am I doing wrong?

1 answer:

Answer 0: (score: 2)

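Pass regex=True. Without it, replace treats the first argument as a literal value and only replaces cells that are exactly equal to the string '.*40.*', so nothing matches.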

dfl = dfl.replace('.*40.*', 'boat', regex=True)
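
To make the difference visible without random data, here is a minimal sketch using a small hand-written frame. The frame `df` and its values are made up for illustration (taken from the shape of the question's output), not the asker's actual data.

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
df = pd.DataFrame({
    "A": ["0.400589", "0.214816"],
    "B": ["1.887758", "0.140899"],
})

# Default (regex=False): looks for cells equal to the literal string ".*40.*"
literal = df.replace(r".*40.*", "boat")

# regex=True: the first argument is interpreted as a regular expression
as_regex = df.replace(r".*40.*", "boat", regex=True)

print(literal.equals(df))  # the literal call leaves the frame unchanged
print(as_regex)
```

Only the cells whose string contains "40" change in `as_regex`; the literal call returns a frame identical to the input, which is the behaviour described in the question.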

Details

In [278]: dfl
Out[278]:
                 A                 B                C                D
0  -0.389710060851    0.864059364935   0.499405126285   0.457617711403
1   0.136417007517  -0.0650312534859  0.0745132664561    2.02466341236
2   0.842889708053   -0.370605269504  -0.626932398518  0.0440612725966
3  -0.403271275281    -1.37477622923  -0.499721883883   -1.55997893498
4    3.39420415568    0.152915014005   0.205876128883  -0.644183954321

In [279]: dfl = dfl.replace('.*40.*', 'boat', regex=True)

In [280]: dfl
Out[280]:
                 A                 B                C                D
0  -0.389710060851              boat             boat             boat
1   0.136417007517  -0.0650312534859  0.0745132664561    2.02466341236
2   0.842889708053   -0.370605269504  -0.626932398518             boat
3             boat    -1.37477622923  -0.499721883883   -1.55997893498
4    3.39420415568              boat   0.205876128883  -0.644183954321
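
The surrounding `.*` in the pattern matters: with regex=True, pandas substitutes only the part of the string that the pattern matches. The sketch below, again on made-up data, contrasts a bare `40` pattern with `.*40.*`, and also shows the keyword spelling `replace(regex=..., value=...)`, which should behave the same as passing regex=True.

```python
import pandas as pd

# Hypothetical single-column frame for illustration
df = pd.DataFrame({"D": ["0.400589", "0.214816"]})

# The pattern matches only the characters "40", so only they are substituted
partial = df.replace("40", "boat", regex=True)

# The pattern spans the whole string, so the whole cell is substituted
whole = df.replace(r".*40.*", "boat", regex=True)

# Keyword form: pass the pattern via regex= and the replacement via value=
keyword = df.replace(regex=r".*40.*", value="boat")

print(partial["D"].tolist())
print(whole["D"].tolist())
print(keyword.equals(whole))
```

If the goal is to replace the entire entry, as in the question, the pattern has to cover the entire string.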