How to filter a pandas DataFrame using another, smaller pandas DataFrame

Time: 2020-09-02 18:30:04

Tags: python pandas numpy dataframe

I have two DataFrames. The first one looks like this:

df1:

    MONEY    Value
0    EUR      850
1    USD      750
2    CLP        1
3    DCN        1

df2:

      Money
0      USD
1      USD
2      USD
3      USD
4      EGP
...    ...
25984  USD
25985  DCN
25986  USD
25987  CLP
25988  USD

I want to drop the rows of df2 whose "Money" value does not exist in df1, and add a column to df2 holding the matching "Value" from df1:

  Money    Value
0      USD      720
1      USD      720
2      USD      720
3      USD      720
...    ...
25984  USD      720
25985  DCN        1
25986  USD      720
25987  CLP        1
25000  USD      720

1 Answer:

Answer 0: (score: 0)

Step by step:

df1.set_index("MONEY")["Value"]

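This sets the MONEY column as the index and selects the Value column, which gives a Series keyed by currency. The result is:

    print(df1.set_index("MONEY")["Value"])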

    MONEY
    EUR    850
    USD    150
    DCN      1

df2["Money"].map(df1.set_index("MONEY")["Value"])

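This looks up each value of df2["Money"] in that Series. Currencies that are not present in df1 come back as NaN. It returns the following: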

    0    150.0
    1      NaN
    2    850.0
    3      NaN
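
As a minimal sketch of the lookup step on its own, the snippet below rebuilds the answer's sample data and shows which rows fail to match. The variable names `lookup` and `mapped` are only illustrative, and it assumes the MONEY values in df1 are unique, since the Series is used as a lookup table.

```python
import pandas as pd

# Sample data taken from the answer's complete example
df1 = pd.DataFrame({"MONEY": ["EUR", "USD", "DCN"], "Value": [850, 150, 1]})
df2 = pd.DataFrame({"Money": ["USD", "GBP", "EUR", "CLP"]})

# Series indexed by currency code; assumed to have unique index labels
lookup = df1.set_index("MONEY")["Value"]

# Each currency in df2 is looked up in the index of `lookup`;
# currencies missing from df1 (GBP, CLP) become NaN
mapped = df2["Money"].map(lookup)

print(mapped)
print(mapped.isna().tolist())
```

Because NaN is a float, the mapped values are floats (150.0, 850.0) even though df1 stores integers.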

Now assign the mapped Series to a new column in df2 named Value. Putting it all together:
df2["Value"] = df2["Money"].map(df1.set_index("MONEY")["Value"])

df2 now looks like:

     Money  Value
    0   USD  150.0
    1   GBP    NaN
    2   EUR  850.0
    3   CLP    NaN

Only one thing is left to do: drop all rows that contain NaN values:
df2.dropna(inplace=True)
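
If df2 had other columns that may legitimately contain NaN, a possible variation is to restrict the drop to the Value column and then cast it back to integers. This is a sketch under that assumption, not part of the original answer; `cleaned` is an illustrative name.

```python
import numpy as np
import pandas as pd

# df2 as it looks after the mapping step
df2 = pd.DataFrame(
    {"Money": ["USD", "GBP", "EUR", "CLP"],
     "Value": [150.0, np.nan, 850.0, np.nan]}
)

# Drop only the rows where 'Value' is NaN, then restore an integer dtype
cleaned = df2.dropna(subset=["Value"]).astype({"Value": "int64"})

print(cleaned)
```

The original row labels (0 and 2) are kept; `reset_index(drop=True)` could be appended if a fresh 0..n-1 index is wanted.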

Complete code example:

import pandas as pd

# Create df1
x_1 = ["EUR", 850], ["USD", 150], ["DCN", 1]
df1 = pd.DataFrame(x_1, columns=["MONEY", "Value"])

# Create df2
x_2 = "USD", "GBP", "EUR", "CLP"
df2 = pd.DataFrame(x_2, columns=["Money"])

# Create new column in df2 called 'Value'
df2["Value"] = df2["Money"].map(df1.set_index("MONEY")["Value"])
# Drops any rows that have 'NaN' in column 'Value'
df2.dropna(inplace=True)
print(df2)

Output:

  Money  Value
0   USD  150.0
2   EUR  850.0
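
An alternative that was not part of this answer is an inner merge, which filters and adds the column in a single step. The sketch below assumes the same sample data and renames MONEY so that both frames share the key column name; `result` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({"MONEY": ["EUR", "USD", "DCN"], "Value": [850, 150, 1]})
df2 = pd.DataFrame({"Money": ["USD", "GBP", "EUR", "CLP"]})

# An inner join keeps only the rows of df2 whose currency exists in df1
result = df2.merge(
    df1.rename(columns={"MONEY": "Money"}),
    on="Money",
    how="inner",
)

print(result)
```

Unlike the map and dropna approach, the merge returns a new frame with a fresh index, and Value keeps its integer dtype because no NaN is ever introduced.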