Question

pandas.Series.map and pandas.Series.replace seem to give the same result. Is there a reason to use one over the other? For example:

import pandas as pd
df = pd.Series(['Yes', 'No'])
df

0    Yes
1     No
dtype: object

df.replace(to_replace=['Yes', 'No'], value=[True, False])

0     True
1    False
dtype: bool

df.map({'Yes':True, 'No':False})

0     True
1    False
dtype: bool

df.replace(to_replace=['Yes', 'No'], value=[True, False]).equals(df.map({'Yes':True, 'No':False}))

True

Answer 1

Both methods are used to replace values.

From the Series.replace documentation:

Replace values given in to_replace with value.

From the Series.map documentation:

Used for substituting each value in a Series with another value, that may be derived from a function, a dict or a Series.

They differ in the following ways:

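replace accepts a str, regex, list, dict, Series, int, float, or None. map accepts a function, a dict, or a Series.
They handle null values differently: map takes an na_action argument for skipping NaN, which replace does not have.
For regular-expression replacements, replace uses re.sub under the hood, so the substitution rules are the same as for re.sub.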
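
The differences listed above can be sketched in a few lines. This is a minimal illustration, not taken from the original answer; the sample data and the regular expression are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
s = pd.Series(["cat_1", "dog_2", np.nan])

# map accepts a callable; na_action="ignore" passes NaN through
# without calling the function on it.
upper = s.map(lambda v: v.upper(), na_action="ignore")

# With regex=True, replace follows re.sub rules, so backreferences
# such as \1 and \2 are available in the replacement string.
swapped = s.replace(to_replace=r"^([a-z]+)_(\d)$", value=r"\2_\1", regex=True)

print(upper.tolist())
print(swapped.tolist())
```

Without na_action="ignore", the lambda would be called on NaN as well and fail, because a float has no upper method.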

Take the following example:

In [124]: s = pd.Series([0, 1, 2, 3, 4])    
In [125]: s
Out[125]: 
0    0
1    1
2    2
3    3
4    4
dtype: int64

In [126]: s.replace({0: 5})
Out[126]: 
0    5
1    1
2    2
3    3
4    4
dtype: int64

In [129]: s.map({0: 'kitten', 1: 'puppy'}) 
Out[129]: 
0    kitten
1     puppy
2       NaN
3       NaN
4       NaN
dtype: object

As you can see, with s.map any value not found in the dict is converted to NaN, unless the dict has a default value (e.g. a defaultdict).
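
To illustrate the defaultdict case, here is a small sketch. The "other" fallback label is an assumption chosen for this example.

```python
from collections import defaultdict

import pandas as pd

s = pd.Series([0, 1, 2, 3, 4])

# A plain dict would leave 2, 3 and 4 as NaN. A defaultdict supplies
# its default value for keys that are missing from the mapping.
labels = defaultdict(lambda: "other", {0: "kitten", 1: "puppy"})
mapped = s.map(labels)

print(mapped.tolist())  # ['kitten', 'puppy', 'other', 'other', 'other']
```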

s.replace, by contrast, replaces only the values listed for replacement and leaves the rest unchanged.
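
If map is preferred but the "leave everything else alone" behaviour of replace is wanted, one common approach is to fill the resulting NaN values back in from the original Series. The sketch below uses made-up data; note that fillna would also overwrite any value the dict deliberately maps to NaN.

```python
import pandas as pd

# Hypothetical data: "Maybe" has no entry in the mapping.
s = pd.Series(["Yes", "No", "Maybe"])
mapping = {"Yes": "Y", "No": "N"}

# replace leaves the unmatched "Maybe" untouched.
replaced = s.replace(mapping)

# map turns "Maybe" into NaN; fillna(s) restores the original value.
mapped = s.map(mapping).fillna(s)

print(replaced.tolist())
print(mapped.tolist())
```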
