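I have two DataFrames that I want to compare using pandas syntax or methods, and then update the values in the smaller DataFrame with those from the larger one wherever the keys match.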
import numpy
import pandas as pd
temp = pd.read_csv('.\\..\\..\\test.csv')
temp2 = pd.read_excel('.\\..\\..\\main.xlsx')
lenOfFile = len(temp.iloc[:, 1])
lenOfFile2 = len(temp2.iloc[:, 1])
dict1 = {}
dict2 = {}
for i in range(lenOfFile):
    dict1[temp.iloc[i, 0]] = temp.iloc[i, 1]
for i in range(lenOfFile2):
    dict2[temp2.iloc[i, 0]] = temp2.iloc[i, 1]
for i in dict1:
    if i in dict2:
        dict1[i] = dict2[i]
    else:
        dict1[i] = "Not in dict2"
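I want the same behavior as the code I wrote.
Answer 0 (score: 0)
You should post a minimal, complete, and verifiable example. In the future, please make sure we can run your code just by pasting it into our IDE. I spent way too much time on this question, haha.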
import pandas as pd
temp = pd.DataFrame({'A' : [20, 4, 60, 4, 8], 'B' : [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'B' : [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})
print(temp)
print(temp2)
# A B
# 0 20 2
# 1 4 4
# 2 60 5
# 3 4 6
# 4 8 7
# A B
# 0 1 1
# 1 2 2
# 2 3 3
# 3 4 10
# 4 5 5
# 5 6 6
# 6 7 70
# 7 8 8
# 8 9 9
# 9 10 10
# Build a key -> value mapping from the second DataFrame (column A -> column B).
mapping = dict(zip(temp2['A'], temp2['B']))
# Apply the mapping to each key in temp['A']. If the key is found, use its value; otherwise fall back to a default label.
temp['B'] = temp['A'].apply(lambda x: mapping[x] if x in mapping else 'No matching')
print(temp)
# A B
# 0 20 No matching
# 1 4 10
# 2 60 No matching
# 3 4 10
# 4 8 8
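The same lookup can probably be written without a Python-level lambda by using `Series.map` with a lookup Series. This is only a sketch under a few assumptions: the key column is `A` and the value column is `B` as in the example above, the name `lookup` is made up for illustration, and duplicate keys in `temp2` should resolve to the last occurrence, which is what the dict-based version does.

```python
import pandas as pd

temp = pd.DataFrame({'A': [20, 4, 60, 4, 8], 'B': [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                      'B': [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})

# Hypothetical helper name: a Series of B values indexed by key A.
# Keep the last row per key so duplicates behave like the dict version.
lookup = temp2.drop_duplicates(subset='A', keep='last').set_index('A')['B']

# Series.map yields NaN for keys that are absent from the lookup index.
matched = temp['A'].map(lookup)

# Cast to object so numbers and the text label can share one column,
# then replace the missing entries with the label.
temp['B'] = matched.astype(object).where(matched.notna(), 'No matching')
print(temp)
```

Note that the matched numbers may come back as floats, because the intermediate result holds NaN for the missing keys. If mixing strings and numbers in one column is not required, leaving the NaN values in place and testing with `isna()` is usually easier to work with downstream.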