Replacing strings in DataFrame column names using lists

Date: 2018-05-10 09:25:34

Tags: python pandas dataframe

>>> df1
  % score  (C)  D; start name
0     one    0   0        foo
1     one    1   2        bar
2     two    2   4        foo
3   three    3   6        bar
4     two    4   8        foo
5     two    5  10        bar
6     one    6  12        foo
7   three    7  14        foo
>>> char1 = ["\s+" , "(" , ")" , "%" , ";"]
>>> char2 = ["_" , "" , "", "percent" , ""]

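I have the DataFrame shown above. I want to rename its columns by replacing each special character listed in char1 with the corresponding entry in char2, i.e. char1[0] should be replaced with char2[0], char1[1] with char2[1], and so on. I would prefer to use df.columns.str.replace. How can this be done in a Pythonic way?

Thanks in advance.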

2 answers:

Answer 0: (score: 1)

In [23]: char1 = [r"\s+" , r"\(" , r"\)", r"%" , r";"]

In [24]: df.columns = df.columns.to_series().replace(char1, char2, regex=True).tolist()

In [25]: df
Out[25]:
  percent_score  C   D start_name
0           one  0   0        foo
1           one  1   2        bar
2           two  2   4        foo
3         three  3   6        bar
4           two  4   8        foo
5           two  5  10        bar
6           one  6  12        foo
7         three  7  14        foo
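
For the df.columns.str.replace route the question asked about, one option is to apply the pattern/replacement pairs one at a time in a loop. This is only a minimal sketch: the two-row frame below is a reconstruction of the question's data (assuming the four columns are "% score", "(C)", "D;" and "start name"), and the `regex=True` argument assumes a pandas version that accepts it (0.23 or later).

```python
import pandas as pd

# Reconstructed stand-in for the question's frame (assumed column names).
df = pd.DataFrame(
    [["one", 0, 0, "foo"], ["one", 1, 2, "bar"]],
    columns=["% score", "(C)", "D;", "start name"],
)

# Patterns are escaped by hand, as in the answer above.
char1 = [r"\s+", r"\(", r"\)", r"%", r";"]
char2 = ["_", "", "", "percent", ""]

# Apply each (pattern, replacement) pair in order to the column Index.
cols = df.columns
for pattern, repl in zip(char1, char2):
    cols = cols.str.replace(pattern, repl, regex=True)
df.columns = cols

print(list(df.columns))
# ['percent_score', 'C', 'D', 'start_name']
```

The order of the pairs matters: whitespace is turned into "_" first, so "% score" becomes "%_score" before "%" is expanded to "percent".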

Answer 1: (score: 1)

First, escape the literal strings in char1. Then build a mapping from char1 to char2 and pass it to pd.Series.replace on the columns.

import re

char1 = [r"\s+" , r"(" , r")" , r"%" , r";"]
char2 = ["_" , "" , "", "percent" , ""]

mapping = dict(zip((re.escape(c) if '\\' not in c else c for c in char1), char2))
# this next step is similar to MaxU's solution
df.columns = df.columns.to_series().replace(mapping, regex=True)

df

  percent_score  C   D start_name
0           one  0   0        foo
1           one  1   2        bar
2           two  2   4        foo
3         three  3   6        bar
4           two  4   8        foo
5           two  5  10        bar
6           one  6  12        foo
7         three  7  14        foo
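
To see what the escaping step produces, the mapping can be built and applied on its own. The following is a sketch rather than part of the original answer: it assumes the four column names are "% score", "(C)", "D;" and "start name", and it works on a plain Series of those names instead of the full frame. The names `patterns`, `columns` and `renamed` are introduced here for illustration.

```python
import re

import pandas as pd

char1 = [r"\s+", "(", ")", "%", ";"]
char2 = ["_", "", "", "percent", ""]

# Entries that already contain a backslash are kept as regexes;
# everything else is treated as a literal and escaped.
patterns = [c if "\\" in c else re.escape(c) for c in char1]
mapping = dict(zip(patterns, char2))
print(patterns)  # "(" and ")" gain a leading backslash; r"\s+" is unchanged

# Assumed column names, taken from the question's printed frame.
columns = pd.Series(["% score", "(C)", "D;", "start name"])
renamed = columns.replace(mapping, regex=True).tolist()
print(renamed)
```

Escaping matters because a bare "(" is not a valid regular expression, so passing it unescaped with regex=True would fail when the pattern is compiled.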