Question

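I have two DataFrames that I want to compare using pandas syntax or methods and, based on matching keys, update the smaller DataFrame with the values from the larger one.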

import numpy
import pandas as pd

temp = pd.read_csv('.\\..\\..\\test.csv')
temp2 = pd.read_excel('.\\..\\..\\main.xlsx')

lenOfFile = len(temp.iloc[:, 1])
lenOfFile2 = len(temp2.iloc[:, 1])
dict1 = {}
dict2 = {}

for i in range(lenOfFile):
    dict1[temp.iloc[i, 0]] = temp.iloc[i, 1]

for i in range(lenOfFile2):
    dict2[temp2.iloc[i, 0]] = temp2.iloc[i, 1]

for i in dict1:
    if i in dict2:
        dict1[i] = dict2[i]
    else:
        dict1[i] = "Not in dict2"

I want the same behavior as the code I wrote above.
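
To make the intended behavior concrete without the two files, here is a minimal sketch of the same lookup logic. The dictionary contents are made-up stand-ins for the key/value pairs read from test.csv and main.xlsx, and the name `updated` is illustrative only.

```python
# Hypothetical stand-ins for the first two columns of test.csv and main.xlsx.
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'a': 10, 'c': 30, 'd': 40}

# For every key in dict1, take dict2's value if the key exists there,
# otherwise fall back to the placeholder string.
updated = {key: dict2.get(key, "Not in dict2") for key in dict1}
print(updated)
# {'a': 10, 'b': 'Not in dict2', 'c': 30}
```

Keys that appear only in dict2 (here 'd') are ignored, just as in the loops above.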

Answer 1

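You should provide a minimal, complete, and verifiable example. In the future, please make sure we can run your code just by pasting it into our IDE. I spent too much time on this question, haha.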

import pandas as pd

temp = pd.DataFrame({'A' : [20, 4, 60, 4, 8], 'B' : [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'B' : [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})
print(temp)
print(temp2)
#     A  B
# 0  20  2
# 1   4  4
# 2  60  5
# 3   4  6
# 4   8  7

#     A   B
# 0   1   1
# 1   2   2
# 2   3   3
# 3   4  10
# 4   5   5
# 5   6   6
# 6   7  70
# 7   8   8
# 8   9   9
# 9  10  10

# Build a mapping from each key in temp2['A'] to its value in temp2['B'].
mapping = dict(zip(temp2['A'], temp2['B']))

# Apply the mapping to each key in temp['A']: use the mapped value if the key exists, otherwise a default.
temp['B'] = temp['A'].apply(lambda x: mapping.get(x, 'No matching'))
print(temp)
#     A            B
# 0  20  No matching
# 1   4           10
# 2  60  No matching
# 3   4           10
# 4   8            8
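
As a possible alternative to the dict and lambda, here is a minimal sketch that does the lookup with `Series.map`. It assumes the keys in temp2['A'] are unique, and it leaves unmatched rows as NaN rather than a string so that column B stays numeric. The names `lookup` and `result` are illustrative only.

```python
import pandas as pd

# Same example data as in the answer above.
temp = pd.DataFrame({'A': [20, 4, 60, 4, 8], 'B': [2, 4, 5, 6, 7]})
temp2 = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                      'B': [1, 2, 3, 10, 5, 6, 70, 8, 9, 10]})

# Index temp2's values by key. Assumption: keys in temp2['A'] are unique.
lookup = temp2.set_index('A')['B']

# Series.map looks up each key of temp['A'] in lookup's index;
# keys without a match come back as NaN.
result = temp.assign(B=temp['A'].map(lookup))
print(result)
```

If the placeholder string is preferred, `result['B'].fillna('No matching')` can be applied afterwards, at the cost of a mixed-type column in which the matched values appear as floats.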

How do I compare two DataFrames in pandas and update values based on a key?
