Question

Suppose I have a DataFrame

import pandas
df = pandas.DataFrame({'a': [1,2], 'b': [3,4]}, ['foo', 'bar'])

     a  b
foo  1  3
bar  2  4

I want to add a column based on another Series:

s = pandas.Series({'foo': 10, 'baz': 20})

foo    10
baz    20
dtype: int64

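How can I assign the Series to a column of the DataFrame and supply a default value for any DataFrame index value that is not in the Series index?

I am looking for something of the form: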

df['c'] = s.withDefault(42)

This would result in the following DataFrame:

     a  b   c
foo  1  3  10
bar  2  4  42

#Note: bar got value 42 because it's not in s

Thanks in advance for your consideration and replies.

Answer 1

Use `map` with `get`

`get` takes a second argument that specifies the default value to return when the key is missing.

df.assign(c=df.index.map(lambda x: s.get(x, 42)))

     a  b   c
foo  1  3  10
bar  2  4  42
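
As a self-contained sketch of the `map`/`get` approach, reusing the `df` and `s` from the question (the names `default` and `result` are illustrative only). Note that `assign` returns a new DataFrame and leaves `df` itself untouched:

```python
import pandas

df = pandas.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['foo', 'bar'])
s = pandas.Series({'foo': 10, 'baz': 20})

default = 42  # illustrative name for the fallback value

# Series.get returns `default` when the label is missing from s
result = df.assign(c=df.index.map(lambda label: s.get(label, default)))

# df is unchanged here; rebind it (df = result) to keep the new column
print(result)
```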

Use `reindex` with `fill_value`

df.assign(c=s.reindex(df.index, fill_value=42))

     a  b   c
foo  1  3  10
bar  2  4  42
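
The `reindex` approach can also be written in the in-place `df['c'] = ...` shape the question asked for. A minimal sketch, assuming the same `df` and `s` as above (`aligned` is an illustrative name). Since `fill_value` is applied during reindexing, no NaN should be introduced and the column is expected to stay an integer type:

```python
import pandas

df = pandas.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['foo', 'bar'])
s = pandas.Series({'foo': 10, 'baz': 20})

# Align s to df's index: labels missing from s ('bar') get 42,
# and labels that exist only in s ('baz') are dropped
aligned = s.reindex(df.index, fill_value=42)

# In-place assignment, modifying df directly
df['c'] = aligned
print(df)
```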

Answer 2

You need a join between df and a DataFrame built from s, and then fill the resulting NaN values with the default (42 in your case).

df['c'] = df.join(pandas.DataFrame(s, columns=['c']))['c'].fillna(42).astype(int)

Output:

     a  b   c
foo  1  3  10
bar  2  4  42
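
A possible variation on the join idea, not the answerer's exact code: here `s.rename('c')` is swapped in for the `pandas.DataFrame(s, columns=['c'])` construction, since `DataFrame.join` accepts a named Series. The name `joined` is illustrative. The sketch also shows why the final cast is there: the join introduces NaN, which turns the column into floats.

```python
import pandas

df = pandas.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['foo', 'bar'])
s = pandas.Series({'foo': 10, 'baz': 20})

# Naming the Series lets join use it directly as column 'c'
# (left join on the index, so 'bar' gets NaN and 'baz' is dropped)
joined = df.join(s.rename('c'))

# NaN forces a float column, so fill the gap and cast back to int
df['c'] = joined['c'].fillna(42).astype(int)
print(df)
```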