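I have two pandas columns: one holds file paths and the other holds new folder names. I am trying to use a regex replacement to swap the folder name in each path for the new folder name: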
df['new_path'] = df.root.str.replace(r'A-[0-9]*-END', df.new_folder_name)
I get the error "repl must be a string or callable". Is it possible to replace the regex match with the value from the corresponding column?
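To illustrate where the error comes from, here is a minimal sketch, assuming a recent pandas version where `str.replace` takes a `regex` argument. The names `paths`, `names`, `error_message` and `same_for_all` are made up for this example. `str.replace` expects `repl` to be one string or one callable that is applied to every row, so a whole Series is rejected:

```python
import pandas as pd

# Hypothetical stand-ins for the two columns in the question
paths = pd.Series(["C:/A-1-END", "C:/A-12342-END", "D:/A-777-END"])
names = pd.Series(["newfolder1", "newfolder2", "newfolder3"])

# A Series is neither a str nor a callable, so pandas rejects it as repl
try:
    paths.str.replace(r"A-[0-9]*-END", names, regex=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A single string is accepted, but every row gets the same replacement
same_for_all = paths.str.replace(r"A-[0-9]*-END", "newfolder", regex=True)
```

A callable `repl` would not help here either, since it only receives the regex match object and has no access to the other column of the same row.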
Answer 0: (score: 0)
You can compile the pattern first and then use apply:
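import pandas as pd
import re

df = pd.DataFrame({"filename": [1, 2, 3],
                   "filepath": ["C:/A-1-END", "C:/A-12342-END", "D:/A-777-END"],
                   "new_folder_name": ["newfolder1", "newfolder2", "newfolder3"]})

# Compile once; IGNORECASE also matches lowercase variants such as "a-1-end"
pat = re.compile(r"A-[0-9]*-END", re.IGNORECASE)

# Row by row, replace the match in filepath with that row's new_folder_name
df["new_path"] = df[["filepath", "new_folder_name"]].apply(
    lambda x: pat.sub(repl=x["new_folder_name"], string=x["filepath"]), axis=1)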
Result:
   filename        filepath new_folder_name       new_path
0         1      C:/A-1-END      newfolder1  C:/newfolder1
1         2  C:/A-12342-END      newfolder2  C:/newfolder2
2         3    D:/A-777-END      newfolder3  D:/newfolder3
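As an alternative sketch of the same row-wise substitution (not part of the original answer), a list comprehension over `zip` pairs each path with its replacement without going through `apply`. The function passed as `repl` is an addition here: when `repl` is a function, `re` inserts its return value literally, so a folder name containing a backslash or something like `\1` would not be interpreted as an escape or group reference.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "filepath": ["C:/A-1-END", "C:/A-12342-END", "D:/A-777-END"],
    "new_folder_name": ["newfolder1", "newfolder2", "newfolder3"],
})

pat = re.compile(r"A-[0-9]*-END", re.IGNORECASE)

# Pair each path with its own replacement name.
# name=name binds the current value so each lambda returns the right folder.
df["new_path"] = [
    pat.sub(lambda m, name=name: name, path)
    for path, name in zip(df["filepath"], df["new_folder_name"])
]
```

For the sample data this should produce the same `new_path` values as the `apply` version.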