Currently I'm using the following code to do the replacements, which is a bit cumbersome:
df1['CompanyA'] = df1['CompanyA'].str.replace('.', '', regex=False)
df1['CompanyA'] = df1['CompanyA'].str.replace('-', '', regex=False)
df1['CompanyA'] = df1['CompanyA'].str.replace(',', '', regex=False)
df1['CompanyA'] = df1['CompanyA'].str.replace('ltd', 'limited', regex=False)
df1['CompanyA'] = df1['CompanyA'].str.replace('&', 'and', regex=False)
df1['Address1A'] = df1['Address1A'].str.replace('.', '', regex=False)
df1['Address1A'] = df1['Address1A'].str.replace('-', '', regex=False)
df1['Address1A'] = df1['Address1A'].str.replace('&', 'and', regex=False)
df1['Address1A'] = df1['Address1A'].str.replace(r'\brd\b', 'road', regex=True)
df1['Address2A'] = df1['Address2A'].str.replace('.', '', regex=False)
df1['Address2A'] = df1['Address2A'].str.replace('-', '', regex=False)
df1['Address2A'] = df1['Address2A'].str.replace('&', 'and', regex=False)
df1['Address2A'] = df1['Address2A'].str.replace(r'\brd\b', 'road', regex=True)
To make on-the-fly changes easier, my ideal would be something like:
df1['CompanyA'] = df1['CompanyA'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address1A'] = df1['Address1A'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address2A'] = df1['Address2A'].str.replace(('&','and'), ('.', ''), ('-','')....)
That way I could enter or change what gets replaced for a specific column without having to adjust multiple lines of code.
Is this possible?
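Answer 0 (score: 9)
You can create a dictionary and pass it to replace(), so you don't have to chain or call the function repeatedly. Pass regex=True so the keys are matched as substrings rather than as whole cell values; regex metacharacters such as '.' then need escaping.
replacers = {',': '', r'\.': '', '-': '', 'ltd': 'limited'}  # etc.
df1['CompanyA'] = df1['CompanyA'].replace(replacers, regex=True)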
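A minimal sketch of the dictionary approach, assuming the keys are meant as literal substrings. The sample data is hypothetical, and `re.escape` is used here so that characters like `.` are not interpreted as regex wildcards:

```python
import re
import pandas as pd

# Hypothetical sample data; the column name follows the question.
df1 = pd.DataFrame({"CompanyA": ["a.b-c, ltd", "x & y ltd."]})

# Literal substrings and what each should become.
replacers = {",": "", ".": "", "-": "", "ltd": "limited", "&": "and"}

# Escape every key so that regex metacharacters are matched literally.
pattern_map = {re.escape(old): new for old, new in replacers.items()}

# regex=True makes Series.replace substitute inside each string
# instead of only replacing cells that equal a key exactly.
df1["CompanyA"] = df1["CompanyA"].replace(pattern_map, regex=True)
```

Changing what gets replaced then only means editing the `replacers` dictionary.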
Answer 1 (score: 2)
You can chain the replacements:
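A minimal sketch of what chaining could look like, assuming `.str.replace` calls chained on a single column (the sample data is hypothetical, and this is an illustration rather than the answerer's exact code):

```python
import pandas as pd

# Hypothetical sample data; the column name follows the question.
df1 = pd.DataFrame({"Address1A": ["12 high rd.", "3-5 park & lane"]})

# Each .str.replace returns a new Series, so the calls can be chained.
df1["Address1A"] = (
    df1["Address1A"]
    .str.replace(".", "", regex=False)           # literal dot
    .str.replace("-", "", regex=False)           # literal hyphen
    .str.replace("&", "and", regex=False)        # literal ampersand
    .str.replace(r"\brd\b", "road", regex=True)  # whole word "rd" only
)
```

The column is assigned once, but every rule still needs its own call.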
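Answer 2 (score: 1)
replace() also accepts a dictionary of values. With regex=True the keys are regular expressions, so '.' has to be escaped, and the result has to be assigned back. You can do the following:
df1 = df1.replace({'CompanyA': {'&': 'and', r'\.': '', '-': ''}}, regex=True)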
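The nested-dictionary form can cover several columns in one call, which is close to what the question asks for. A minimal sketch, assuming hypothetical sample data and a hypothetical `common` set of rules shared by the columns:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df1 = pd.DataFrame({
    "CompanyA": ["smith & sons ltd."],
    "Address1A": ["1 main rd."],
})

# Rules shared by every column; keys are regular expressions.
common = {r"\.": "", "-": "", "&": "and"}

# Outer keys are column names, inner dicts map pattern -> replacement.
per_column = {
    "CompanyA": {**common, ",": "", r"\bltd\b": "limited"},
    "Address1A": {**common, r"\brd\b": "road"},
}

df1 = df1.replace(per_column, regex=True)
```

Columns that are not listed in `per_column` are left unchanged.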
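Answer 3 (score: 1)
You can use a dictionary to map the characters for each column:
to_replace = {'.': '',
              ',': '',
              'foo': 'bar'
              }
for k, v in to_replace.items():
    # regex=False treats each key as a literal string
    df1['CompanyA'] = df1['CompanyA'].str.replace(k, v, regex=False)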
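As an alternative to looping, the same mapping can be applied in a single pass by combining the keys into one compiled pattern and looking up each match in the dictionary. This is a different technique from the loop above, sketched under the assumption that all keys are literal strings; the sample data is hypothetical:

```python
import re
import pandas as pd

to_replace = {".": "", ",": "", "&": "and", "ltd": "limited"}

# Build one alternation of escaped keys. Longer keys come first so a
# longer literal is preferred over a shorter key it happens to start with.
keys = sorted(to_replace, key=len, reverse=True)
pattern = re.compile("|".join(re.escape(k) for k in keys))

# Hypothetical sample data.
s = pd.Series(["a.b, ltd", "x & y"])

# The callable receives each match object and returns its replacement.
result = s.str.replace(pattern, lambda m: to_replace[m.group(0)], regex=True)
```

Because the text is scanned once, the output of one replacement is never re-processed by another rule, which can differ from the sequential loop when rules overlap.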
Answer 4 (score: 1)
You are most likely working with a pd.DataFrame, so I'd suggest writing a generic remover:
def remover(row, replacers):
    # `row` is a single cell value, assumed to be a string
    for k, v in replacers.items():
        if k in row:
            row = row.replace(k, v)
    return row

replacers = {',': '',
             '.': '',
             '-': '',
             'ltd': 'limited'
             }
for column in df.columns:
    df[column] = df[column].apply(lambda row: remover(row, replacers))
Or you can specify the particular column names you want to modify.
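A minimal sketch of restricting the remover to chosen columns. The `columns_to_clean` list and the sample data are hypothetical, and the `isinstance` guard is an added assumption so that missing or non-string cells are passed through untouched:

```python
import pandas as pd

def remover(value, replacers):
    """Apply each literal replacement in turn; leave non-strings untouched."""
    if not isinstance(value, str):
        return value
    for old, new in replacers.items():
        value = value.replace(old, new)
    return value

replacers = {",": "", ".": "", "-": "", "ltd": "limited"}

# Hypothetical sample data, including a missing value and an extra column.
df = pd.DataFrame({
    "CompanyA": ["a-b ltd.", None],
    "Address1A": ["1, main st.", "2-4 side st"],
    "Untouched": ["keep.this", "as-is"],
})

# Only the columns named here are modified.
columns_to_clean = ["CompanyA", "Address1A"]
for column in columns_to_clean:
    df[column] = df[column].apply(lambda value: remover(value, replacers))
```

Adding or removing a column from the cleaning step is then a one-line change to `columns_to_clean`.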