我在使用Python3的pandas数据框中要解决一个问题。我有两个数据框-第一个是as;
ID Name Linked Model 1 Linked Model 2 Linked Model 3
0 100 A 1111.0 1112.0 NaN
1 101 B 1112.0 1113.0 1115.0
2 102 C NaN NaN NaN
3 103 D 1114.0 NaN NaN
4 104 E 1114.0 1111.0 1112.0
第二个是
Model ID Name
0 1111 A
1 1112 A,B
2 1113 C
3 1114 D
4 1115 Q
5 1116 Z
6 1117 E
7 1118 W
因此,代码应在-例如,在 Linked Model 1 列中查找值,并在第二个数据框中的 Name 列中找到相应的值,以便ID可以像结果中所示那样用名称替换;
因此,如您在结果输出中看到的,None保持为None(可以替换为numpy N / As),并且第二个数据框中的名称现在被替换为它们对应的 Model ID 第一个数据帧。
我期待听到您的解决方案!
谢谢
答案 0 :(得分:3)
初始化替换词典,然后使用df.replace
将这些ID映射到Names。
m = df2.set_index('Model ID')['Name'].to_dict()
v = df.filter(like='Linked Model')
df[v.columns] = v.replace(m)
df
ID Name Linked Model 1 Linked Model 2 Linked Model 3
0 100 A A A,B NaN
1 101 B A,B C Q
2 102 C NaN NaN NaN
3 103 D D NaN NaN
4 104 E D A A,B
答案 1 :(得分:0)
首次尝试回答python问题,因此虽然这肯定比Coldspeed的回答更长,但使用melt
,merge
和pivot
功能对我来说更有意义。 / p>
import pandas as pd
import numpy as np
# make an object from the first dataset
df_1 = pd.DataFrame(
{"ID" : [100, 101, 102, 103, 104],
"Name" : ["A", "B", "C", "D", "E"],
"Linked Model 1" : [1111, 1112, np.nan, 1114, 1114],
"Linked Model 2" : [1112, 1113, np.nan, np.nan, 1111],
"Linked Model 3" : [np.nan, 1115, np.nan, np.nan, 1112]})
# make an object for the second data set
df_2 = pd.DataFrame(
{"Model ID" : [1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118],
"Name" : ["A", "A,B", "C", "D", "Q", "Z", "E", "W"]})
# tidy the data
df_1 = pd.melt(df_1, ["ID", "Name"])
# left join the second data set
df_1 = pd.merge(df_1, df_2, how='left', left_on='value', right_on='Model ID').reset_index()
#pivot the data back out to achieve the desired format
df_1 = df_1.pivot_table(index='ID',
columns='variable',
values='Name_y',
aggfunc='first',
dropna=False))
variable Linked Model 1 Linked Model 2 Linked Model 3
ID
100 A A,B NaN
101 A,B C Q
102 NaN NaN NaN
103 D NaN NaN
104 D A A,B