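I am trying to add a new column to a dataframe that, for each row, holds a list of the values that appear in both of two other columns. My dataframe: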
    capacity  preferences      applied
0   1         [150]            []
1   3         [150, 116, 9]    []
2   3         [150, 68, 55]    [0, 3, 4, 5, 20, 51, 54, 55, 56, 71, 102, 105,...
3   3         [150, 116, 68]   [41, 50, 92, 101, 143, 152, 194, 203]
4   3         [150, 116, 68]   [12, 48, 63, 99, 114, 150, 165, 201]
5   1         [150]            [39, 46, 90, 97, 141, 148, 192, 199]
6   1         [150]            []
7   2         [150, 116]       [36, 87, 138, 189]
8   2         [150, 116]       [8, 9, 26, 42, 59, 60, 77, 93, 110, 111, 128, ...
9   3         [150, 116, 9]    []
10  3         [150, 55, 111]   [11, 38, 62, 89, 113, 140, 164, 191]
I tried:
sup_df["accepted"] = list(set(sup_df['applied']).intersection(sup_df['preferences']))
but I get the error:
TypeError: unhashable type: 'list'
I also tried:
for i, r in sup_df.iterrows():
    r["accepted"] = list(set(r['applied']).intersection(r['preferences']))
but the dataframe is left unchanged.
Answer 0 (score: 0)
Here is one way to do it using apply with axis=1, so the lambda receives one row at a time:
df['accepted'] = df.apply(lambda x: list(set(x['preferences']).intersection(x['applied'])), axis=1)
   preferences  applied  accepted
0  [150]        []       []
1  [222, 9]     [222]    [222]
Sample data
import pandas as pd

df = pd.DataFrame({'preferences': [[150], [222, 9]],
                   'applied': [[], [222]]})
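As for why the two attempts in the question fail: set(sup_df['applied']) tries to hash every element of the column, and those elements are lists, hence the TypeError. In the loop, iterrows yields a copy of each row as a Series, so assigning to r["accepted"] changes only that copy. The sketch below shows two possible alternatives under those assumptions. The third sample row and the names common_in_order and accepted_loop are made up for illustration; they are not from the question.

```python
import pandas as pd

# Illustrative data: the first two rows match the sample above,
# the third row is an invented example with two matches.
df = pd.DataFrame({'preferences': [[150], [222, 9], [150, 116, 68]],
                   'applied': [[], [222], [68, 12, 150]]})


def common_in_order(prefs, applied):
    """Return the items of prefs that also occur in applied,
    keeping the order in which they appear in prefs."""
    applied_set = set(applied)  # build the set once per row
    return [p for p in prefs if p in applied_set]


# Alternative 1: a list comprehension over the two columns.
# Unlike set.intersection, this keeps the order of 'preferences'.
df['accepted'] = [common_in_order(p, a)
                  for p, a in zip(df['preferences'], df['applied'])]

# Alternative 2: keep the explicit loop from the question, but collect
# the results in a plain list and assign the whole column afterwards,
# instead of writing to the row copy that iterrows yields.
results = []
for _, r in df.iterrows():
    results.append(list(set(r['applied']).intersection(r['preferences'])))
df['accepted_loop'] = results

print(df)
```

Both alternatives produce the same matches per row. The element order within each list in accepted_loop is not guaranteed, because it comes from a set.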