How do I correctly use pandas Series.map() with a mapping dictionary?

Time: 2019-06-17 20:06:33

Tags: python pandas dataframe dictionary

The DataFrame smaller_df looks like this:

> smaller_df.head()
   MSA Code  Line   RPP
0     10180   1.0  91.2
1     10180   2.0  97.4
2     10180   3.0  78.7
3     10180   4.0  93.5
4     10420   1.0  90.4
...

smaller_df.dtypes gives:

MSA Code      int64
Line        float64
RPP         float64
Wages        object
dtype: object

wage_keys.head() gives:

   MSA Code  Average Wage
0     11260  94490.000000
1     21820  72080.000000
2     10180  71128.571429
3     13820  87338.396624
4     10420  76620.000000
...

wage_keys.dtypes is:

MSA Code          int64
Average Wage    float64
dtype: object

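Note that the same 'MSA Code' can appear multiple times in smaller_df, but only once in wage_keys.

I want to set a new column 'Wages' in smaller_df to the corresponding value from wage_keys.

So the new table should look like this: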

   MSA Code  Line   RPP Wages
0     10180   1.0  91.2   71128.571429
1     10180   2.0  97.4   71128.571429
2     10180   3.0  78.7   71128.571429
3     10180   4.0  93.5   71128.571429
4     10420   1.0  90.4   76620.000000
...

I have the following code, which builds a wages dictionary and maps with it:

wages = wage_keys.set_index('MSA Code').to_dict()
smaller_df['Wages'] = smaller_df['MSA Code'].map(wages)

The problem is that this results in:

   MSA Code  Line   RPP Wages
0     10180   1.0  91.2   NaN
1     10180   2.0  97.4   NaN
2     10180   3.0  78.7   NaN
3     10180   4.0  93.5   NaN
4     10420   1.0  90.4   NaN

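Clearly I am missing something. How do I get the values in the 'Wages' column set to the correct corresponding values from the wages dictionary (or the wage_keys DataFrame)?

1 answer:

Answer 0: (score: 1)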

Your mistake is in the conversion to a dictionary. You did (with df2 standing in for your wage_keys),

df2.set_index('MSA Code').to_dict()
# {
#     "Average Wage": {
#         10180: 71128.571429,
#         10420: 76620.0,
#         11260: 94490.0,
#         13820: 87338.396624,
#         21820: 72080.0
#     }
# }

This results in a dict-of-dicts: the only top-level key is the column name "Average Wage", so none of the MSA codes match it and map returns NaN for every row. What you should be doing is,

df2.set_index('MSA Code')['Average Wage']

MSA Code
11260    94490.000000
21820    72080.000000
10180    71128.571429
13820    87338.396624
10420    76620.000000
Name: Average Wage, dtype: float64

Or,

df2.set_index('MSA Code')['Average Wage'].to_dict()
# {11260: 94490.0, 21820: 72080.0, 10180: 71128.571429, 13820: 87338.396624, 10420: 76620.0}

Both of these produce a mapping format that map understands. Now your map call produces the expected output:

smaller_df['Wages'] = smaller_df['MSA Code'].map(wages)
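
For reference, here is a minimal self-contained sketch of both the failure and the fix, using a few rows copied from the question. The names nested and broken are illustrative only and do not come from the original post; the frames are small stand-ins for the real data.

```python
import pandas as pd

# Small stand-ins for the frames in the question (values copied from the post).
smaller_df = pd.DataFrame({
    "MSA Code": [10180, 10180, 10180, 10180, 10420],
    "Line": [1.0, 2.0, 3.0, 4.0, 1.0],
    "RPP": [91.2, 97.4, 78.7, 93.5, 90.4],
})
wage_keys = pd.DataFrame({
    "MSA Code": [11260, 21820, 10180, 13820, 10420],
    "Average Wage": [94490.0, 72080.0, 71128.571429, 87338.396624, 76620.0],
})

# DataFrame.to_dict() nests by column name, so the only top-level key is
# "Average Wage". No MSA code matches that key, so map() yields NaN everywhere.
nested = wage_keys.set_index("MSA Code").to_dict()
broken = smaller_df["MSA Code"].map(nested)

# Selecting the column first gives a Series indexed by MSA Code, which map()
# can use directly as a flat lookup (calling .to_dict() on it also works).
wages = wage_keys.set_index("MSA Code")["Average Wage"]
smaller_df["Wages"] = smaller_df["MSA Code"].map(wages)

print(list(nested))         # top-level keys of the nested dict
print(broken.isna().all())  # whether every mapped value is missing
print(smaller_df)
```

Passing the Series itself to map avoids building an intermediate dictionary; converting it with .to_dict() first should give the same result if a plain dict is preferred.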