str.replace "unbalanced parenthesis" error on a pandas DataFrame

Time: 2017-05-19 13:10:04

Tags: python pandas str-replace

I have two DataFrames. Here is what the data looks like:

Find: (image not available)

Replace: (image not available)

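I search each current_title in the Find DataFrame for every keyword, and when a keyword is found I replace it with the corresponding keywordLength from the Replace DataFrame. Here is my code.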

import pandas as pd
df_find = pd.read_csv(input_path_find)
df_replace = pd.read_csv(input_path_replace)

#replace
for i in range(df_replace.shape[0]):
    df_find.current_title=df_find.current_title.str.replace(df_replace.keyword.loc[i],df_replace.keywordLength.loc[i],case=False)

However, when I run the code, I get the following error:

error                                     Traceback (most recent call last)
<ipython-input-13-134bbf2a1cb4> in <module>()
      1 for i in range(df_replace.shape[0]):
----> 2     df_find.current_title=df_find.current_title.str.replace(df_replace.keyword.loc[i],df_replace.keywordLength.loc[i],case=False)

c:\python27\lib\site-packages\pandas\core\strings.pyc in replace(self, pat, repl, n, case, flags)
   1504     def replace(self, pat, repl, n=-1, case=True, flags=0):
   1505         result = str_replace(self._data, pat, repl, n=n, case=case,
-> 1506                              flags=flags)
   1507         return self._wrap_result(result)
   1508 

c:\python27\lib\site-packages\pandas\core\strings.pyc in str_replace(arr, pat, repl, n, case, flags)
    326         if not case:
    327             flags |= re.IGNORECASE
--> 328         regex = re.compile(pat, flags=flags)
    329         n = n if n >= 0 else 0
    330 

c:\python27\lib\re.pyc in compile(pattern, flags)
    192 def compile(pattern, flags=0):
    193     "Compile a regular expression pattern, returning a pattern object."
--> 194     return _compile(pattern, flags)
    195 
    196 def purge():

c:\python27\lib\re.pyc in _compile(*key)
    249         p = sre_compile.compile(pattern, flags)
    250     except error, v:
--> 251         raise error, v # invalid expression
    252     if not bypass_cache:
    253         if len(_cache) >= _MAXCACHE:

error: unbalanced parenthesis

Any help?

Edit: the error occurs when the value of str(df_replace.keywordLength.loc[i]) contains any of the special characters ( * ) + [ \

1 answer:

Answer 0: (score: 0)

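str.replace treats its first argument as a regular expression, so a keyword containing characters such as ( or ) is compiled as a pattern and can fail. You need to escape the pattern string with re.escape before passing it to str.replace.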
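
To make the cause concrete, here is a minimal sketch that uses only the standard re module. The keyword and title strings are hypothetical examples, not data from the question. It shows that a keyword with an unmatched parenthesis cannot be compiled as a regular expression, while the escaped keyword compiles and matches literally.

```python
import re

# Hypothetical keyword containing an unmatched ")" (not taken from the question's data).
keyword = "level 2) analyst"

# The raw keyword is not a valid regular expression, so compiling it raises re.error.
try:
    re.compile(keyword, flags=re.IGNORECASE)
    raw_compiles = True
except re.error:
    raw_compiles = False

# re.escape backslash-escapes the special characters, so the keyword is matched literally.
pattern = re.compile(re.escape(keyword), flags=re.IGNORECASE)
result = pattern.sub("KEYWORD", "Senior Level 2) Analyst wanted")

print(raw_compiles)
print(result)
```

Escaping changes only how the pattern is interpreted; case-insensitive matching via the flag still works as before.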

import pandas as pd
import re
df_find = pd.read_csv(input_path_find)
df_replace = pd.read_csv(input_path_replace)

# Escape each keyword so that characters such as ( ) * + [ \ are matched literally
for i in range(df_replace.shape[0]):
    df_find.current_title = df_find.current_title.str.replace(
        re.escape(df_replace.keyword.loc[i]),
        df_replace.keywordLength.loc[i],
        case=False
    )
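
For a version that can be run without the CSV files, the following sketch builds two small DataFrames inline. The sample rows are invented for illustration, and it assumes a pandas version recent enough to accept the regex argument of str.replace. Both values are passed through str() on the assumption that keywordLength may be read from CSV as a number.

```python
import re

import pandas as pd

# Hypothetical stand-ins for the two CSV files in the question.
df_find = pd.DataFrame({"current_title": ["Sr. Dev (C++)", "Data Analyst"]})
df_replace = pd.DataFrame(
    {"keyword": ["(c++)", "analyst"], "keywordLength": ["5", "7"]}
)

# Apply each keyword/replacement pair in turn, escaping the keyword first.
for keyword, replacement in zip(df_replace["keyword"], df_replace["keywordLength"]):
    df_find["current_title"] = df_find["current_title"].str.replace(
        re.escape(str(keyword)),  # literal match, even with ( ) + and similar
        str(replacement),         # the replacement must be a string
        case=False,
        regex=True,
    )

titles = df_find["current_title"].tolist()
print(titles)
```

Note that re.escape only protects the pattern. In regex mode, backslashes in the replacement string are still interpreted, so if the replacement text can contain a backslash, one option is to pass a callable that returns the text unchanged.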