I am working with the following data frames. The original data frames are quite large, with thousands of rows, so for illustration I am using much smaller ones.
My first df is the following:
   ID  value
0   3   7387
1   8   4784
2  11    675
3  21    900
And there is another, much larger df, say df2:
       x     y  final_id
0  -7.35  2.09         3
1  -6.00  2.76         3
2  -5.89  1.90         4
3  -4.56  2.67         5
4  -3.46  1.34         8
5  -4.67  1.23         8
6  -1.99  3.44         8
7  -5.67  2.40        11
8  -7.56  1.66        11
9  -9.00  3.12        21
10 -8.01  3.11        21
11 -7.90  3.19        22
Now, from the first df, I want to take only the "ID" column and match its values against the "final_id" column of the second data frame (df2).
I want to create another df that contains only the filtered rows of df2, i.e. only the rows whose "final_id" is 3, 8, 11 or 21 (the values in the "ID" column of the first df).
The resulting df would look like this:
      x     y  final_id
0 -7.35  2.09         3
1 -6.00  2.76         3
2 -3.46  1.34         8
3 -4.67  1.23         8
4 -1.99  3.44         8
5 -5.67  2.40        11
6 -7.56  1.66        11
7 -9.00  3.12        21
8 -8.01  3.11        21
As you can see, rows 2, 3 and 11 of df2 have been removed from the resulting df.
Please help.
Answer 0 (score: 2)
You can use isin to create a boolean mask and then use that mask to subset your df2:
mask = df2["final_id"].isin(df["ID"])
print(df2[mask])
       x     y  final_id
0  -7.35  2.09         3
1  -6.00  2.76         3
4  -3.46  1.34         8
5  -4.67  1.23         8
6  -1.99  3.44         8
7  -5.67  2.40        11
8  -7.56  1.66        11
9  -9.00  3.12        21
10 -8.01  3.11        21
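Note that the filtered frame keeps the original row labels of df2 (0, 1, 4, ... 10), whereas the output asked for in the question is numbered 0 to 8. A minimal self-contained sketch of the same approach, assuming the sample data from the question and a consecutive index in the result, might look like this. The name `result` is only illustrative and does not appear in the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"ID": [3, 8, 11, 21], "value": [7387, 4784, 675, 900]})
df2 = pd.DataFrame({
    "x": [-7.35, -6.00, -5.89, -4.56, -3.46, -4.67,
          -1.99, -5.67, -7.56, -9.00, -8.01, -7.90],
    "y": [2.09, 2.76, 1.90, 2.67, 1.34, 1.23,
          3.44, 2.40, 1.66, 3.12, 3.11, 3.19],
    "final_id": [3, 3, 4, 5, 8, 8, 8, 11, 11, 21, 21, 22],
})

# True for every row of df2 whose final_id occurs in df["ID"].
mask = df2["final_id"].isin(df["ID"])

# Keep the matching rows and renumber them from 0
# ("result" is a hypothetical name used for illustration).
result = df2[mask].reset_index(drop=True)
print(result)
```

If the original row labels of df2 are needed later, for example to trace rows back to the source, leave out the `reset_index(drop=True)` call.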