Question

I have a pandas DataFrame. Df1 holds customer information:

Customer_Name    Demand
John               100
Mike               200
...

I also have a dictionary that maps customer names to customer codes:

Customer_Name    Customer_Code
John               1
Mike               2
...

I want to produce a new DataFrame like Df1, but with the customer code in place of the name:

Customer_Code    Demand
    1              100
    2               200
    ...

To do this, I use the following code:

df3=data.replace({"customer_code": mapp})
Raw=data_m[['Demand','customer_code']]

It gives me the correct result, but it is slow. Is there a more efficient way to do this kind of mapping and conversion?

Answer 1

merge should work fine.

df = df1.merge(df2)
df
  Customer_Name  Demand  Customer_Code
0          John     100              1
1          Mike     200              2
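
The bare df1.merge(df2) above joins on whatever column names the two frames share, and it defaults to an inner join. As a minimal sketch (the sample values are made up to mirror the question), naming the key and the join type makes the intent explicit, and validate guards against duplicate names in the mapping table:

```python
import pandas as pd

# Sample data mirroring the question (values are illustrative).
df1 = pd.DataFrame({"Customer_Name": ["John", "Mike"], "Demand": [100, 200]})
df2 = pd.DataFrame({"Customer_Name": ["John", "Mike"], "Customer_Code": [1, 2]})

# on= names the join key explicitly; how="left" keeps every row of df1;
# validate= raises MergeError if a name occurs more than once in df2.
merged = df1.merge(df2, on="Customer_Name", how="left", validate="many_to_one")
print(merged)
```

With how="left", a customer missing from df2 stays in the result with a missing Customer_Code instead of being dropped.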

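If you want to get rid of the first column, call df.drop('Customer_Name', axis=1) (recent pandas versions require axis to be passed by keyword):

df.drop('Customer_Name', axis=1)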
   Demand  Customer_Code
0     100              1
1     200              2

Or, select the columns by name:

df[['Customer_Code', 'Demand']]
   Customer_Code  Demand
0              1     100
1              2     200

Alternatively, you can use map on the name column:

df1['Customer_Code'] = df1.Customer_Name.map(\
             df2.set_index('Customer_Name').Customer_Code)

df1
  Customer_Name  Demand  Customer_Code
0          John     100              1
1          Mike     200              2
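
Since the question says the mapping is already a dictionary, Series.map can also take that dict directly, with no second DataFrame needed. A minimal sketch, assuming the dict maps name to code (name_to_code and the sample rows are hypothetical):

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Customer_Name": ["John", "Mike", "John"], "Demand": [100, 200, 50]}
)

# Hypothetical mapping dict: customer name -> customer code.
name_to_code = {"John": 1, "Mike": 2}

# Look up each name in the dict, then keep only the code and demand columns.
result = df1.assign(Customer_Code=df1["Customer_Name"].map(name_to_code))[
    ["Customer_Code", "Demand"]
]
print(result)
```

Names that are not keys of the dict come out as NaN, so it is worth checking for missing values afterwards if the mapping may be incomplete.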

Answer 2

You said your second dataset is a dictionary, so you need to convert it to a DataFrame before merging the two:

# Assumes dict_name maps name -> code, e.g. {'John': 1, 'Mike': 2}
df2 = pd.DataFrame(list(dict_name.items()), columns=['Customer_Name', 'Customer_Code'])

# Merge DataFrames and only keep Customer_Code and Demand
df3 = df1.merge(df2)[['Customer_Code', 'Demand']]

  Customer_Code    Demand
0             1       100
1             2       200
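
One thing to watch with the merge above: the default join is an inner join, so any customer missing from the dictionary silently disappears from df3. A rough sketch of how to spot such rows, assuming a name-to-code dict (the data here is made up, with "Anna" deliberately left out of the mapping):

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Customer_Name": ["John", "Mike", "Anna"], "Demand": [100, 200, 300]}
)
dict_name = {"John": 1, "Mike": 2}  # hypothetical mapping; "Anna" is absent

# Turn the dict into a two-column lookup table.
df2 = pd.DataFrame(
    list(dict_name.items()), columns=["Customer_Name", "Customer_Code"]
)

# how="left" keeps every df1 row; indicator=True adds a "_merge" column
# saying whether each row found a match in df2.
checked = df1.merge(df2, on="Customer_Name", how="left", indicator=True)
unmapped = checked.loc[checked["_merge"] == "left_only", "Customer_Name"].tolist()
print(unmapped)
```

If unmapped is empty, the inner merge and the left merge return the same rows.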

Transform a pandas DataFrame using a mapping