Question

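I have two pandas DataFrames that share some values.

I want to identify the rows in df1 that are not in df2 (based on a condition such as df1.x = df2.x) and delete them from df1.

Everything in df2 should stay unchanged.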

import pandas

df1 = pandas.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [10, 11, 12, 13, 14]})
df2 = pandas.DataFrame(data={'x': [4, 5, 6], 'z': [10, 13, 14]})

Answer 1

IIUC:

df1 = df1[df1['x'].isin(df2['x'])]
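
To make the effect of that one-liner concrete, here is a minimal self-contained sketch using the frames from the question. The names `mask` and `filtered` are chosen here for illustration and are not part of the original answer.

```python
import pandas

df1 = pandas.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [10, 11, 12, 13, 14]})
df2 = pandas.DataFrame(data={'x': [4, 5, 6], 'z': [10, 13, 14]})

# Boolean mask: True where df1's x value also occurs in df2's x column
mask = df1['x'].isin(df2['x'])

# Keep only the matching rows; df2 is never modified
filtered = df1[mask]
print(filtered)
#    x   y
# 3  4  13
# 4  5  14
```

The surviving rows keep their original index labels (3 and 4); calling `reset_index(drop=True)` afterwards would renumber them from 0 if that matters.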

Answer 2

    import pandas

    df1 = pandas.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [10, 11, 12, 13, 14]})
    df2 = pandas.DataFrame(data={'x': [4, 5, 6], 'z': [10, 13, 14]})
    # collect the rows of df1 whose 'x' value is also present in df2
    # (DataFrame.append was removed in pandas 2.0, so gather rows in a list instead):
    matches = []
    # check the 'x' column of each row of each dataframe to find matches:
    for i in range(len(df1)):
        for ii in range(len(df2)):
            if df1.iloc[i]['x'] == df2.iloc[ii]['x']:
                # if there's a match, store the df1 row:
                matches.append(df1.iloc[i])
    df3 = pandas.DataFrame(matches)
    # remove the matched rows (df3) from df1: rows that appear in both df1 and df3
    # are duplicates and get dropped, leaving the rows whose 'x' is not in df2:
    df1 = pandas.concat([df1, df3]).drop_duplicates(keep=False)
    print(df1)

Output:

   x   y
0  1  10
1  2  11
2  3  12
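
The nested loop above compares every pair of rows, which becomes slow on large frames. Assuming the goal is the same as the loop's (keep the rows of df1 whose x value does not occur in df2), a negated `isin` mask is a shorter sketch of the same idea. The name `not_in_df2` is chosen here for illustration.

```python
import pandas

df1 = pandas.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [10, 11, 12, 13, 14]})
df2 = pandas.DataFrame(data={'x': [4, 5, 6], 'z': [10, 13, 14]})

# ~ negates the mask: True where df1's x value does NOT occur in df2's x column
not_in_df2 = df1[~df1['x'].isin(df2['x'])]
print(not_in_df2)
#    x   y
# 0  1  10
# 1  2  11
# 2  3  12
```

On the sample data this selects the same three rows as the loop. One difference to be aware of: `drop_duplicates(keep=False)` would also drop rows that are repeated within df1 itself, whereas the mask leaves such rows alone.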

Answer 3

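Try the following:

import numpy as np
import pandas as pd

# left merge on the shared column; the 'Exist' indicator records where each row was found
df = pd.merge(df1, df2, on='x', how='left', indicator='Exist')
df['Exist'] = np.where(df.Exist == 'both', True, False)
df = df[df['Exist']==True].drop(['Exist','z'], axis=1)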
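
If only the rows whose key exists in both frames are needed, an inner merge against just the key column of df2 is one possible shortcut that avoids the indicator column altogether. This is a sketch under that assumption, not the answer's own method. The names `keys` and `result` are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [10, 11, 12, 13, 14]})
df2 = pd.DataFrame(data={'x': [4, 5, 6], 'z': [10, 13, 14]})

# Use only df2's key column so no extra columns (such as z) are pulled in;
# drop_duplicates guards against repeated keys in df2 multiplying rows of df1
keys = df2[['x']].drop_duplicates()

# Inner merge keeps only the rows of df1 whose x value appears in keys
result = df1.merge(keys, on='x', how='inner')
print(result)
#    x   y
# 0  4  13
# 1  5  14
```

Unlike boolean-mask filtering, a merge returns a fresh 0-based index, so the original row labels of df1 are not preserved.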

Pandas: delete rows in one DataFrame that are not in another DataFrame

3 answers: