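I have 2 data frames, each with a column named frame. The data frames hold data extracted from 2 videos recorded simultaneously from two participants. Because of tracking failures, some frames are missing from the data (different ones for each video). I want to take the intersection of the two data frames based on the integer frame value df['frame'].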
A similar question was posted here: Pandas - intersection of two data frames based on column entries, but the accepted answer there performs a join rather than an intersection.
import pandas as pd
df1 = pd.DataFrame(data={'frame': [1, 2, 3]})
df2 = pd.DataFrame(data={'frame': [2, 3, 4]})
Expected output (rows whose frame value is not in both df1['frame'] and df2['frame'] are removed):
>>> print(df1)
frame
1 2
2 3
>>> print(df2)
frame
0 2
1 3
(I'm fine with calling df1.reset_index(drop=True) afterwards.)
My idea was to first get the intersection of the frame columns of the two data frames:
df1_idx = df1['frame']
df2_idx = df2['frame']
intersection_idx = df1_idx.intersection(df2_idx)
Error:
File "/*python_path*/site-packages/pandas/core/generic.py", line 3081, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'intersection'
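A note on this error: as far as I can tell, intersection is a method of pandas Index objects rather than of Series, which would explain the AttributeError. Below is a minimal sketch of the same idea, assuming each column is first wrapped in pd.Index (the wrapping is an addition for illustration, not part of the original attempt):

```python
import pandas as pd

df1 = pd.DataFrame(data={'frame': [1, 2, 3]})
df2 = pd.DataFrame(data={'frame': [2, 3, 4]})

# Series has no .intersection, but Index does, so wrap each column first.
intersection_idx = pd.Index(df1['frame']).intersection(pd.Index(df2['frame']))

print(intersection_idx.tolist())  # [2, 3]
```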
After getting the frame values present in both data frames, I wanted to do something like the following (as mentioned in dropping rows from dataframe based on a "not in" condition):
df1 = df1.drop(df1[~df1['frame'].isin(intersection_idx)].index)
I am using Python 3.6.5 with pandas 0.22.0, installed through Anaconda.
Answer 0: (score: 1)
How about:
df1[df1.frame.isin(df2.frame)]
Out:
frame
1 2
2 3
df2[df2.frame.isin(df1.frame)]
Out:
frame
0 2
1 3
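If both data frames need the same treatment, the two isin filters above can be wrapped together. This is only a sketch: keep_common_frames is a hypothetical helper name, and the reset_index(drop=True) step is included because the question says a reset index is acceptable.

```python
import pandas as pd

def keep_common_frames(a, b, col='frame'):
    """Return a and b restricted to rows whose `col` value occurs in both.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Boolean mask: True where this row's value also appears in the other frame.
    a_common = a[a[col].isin(b[col])].reset_index(drop=True)
    b_common = b[b[col].isin(a[col])].reset_index(drop=True)
    return a_common, b_common

df1 = pd.DataFrame(data={'frame': [1, 2, 3]})
df2 = pd.DataFrame(data={'frame': [2, 3, 4]})

df1_common, df2_common = keep_common_frames(df1, df2)
print(df1_common)
print(df2_common)
```

Note that isin keeps every matching row, so if a frame number were repeated within one data frame, all of its rows would be retained.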