Question

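I have a pandas DataFrame and a list of values. I want to keep every row of the original DataFrame whose value in a particular column appears in my list. However, the list I am selecting from contains duplicate values. Each time the same value occurs again in the list, I want the rows with that column value to be added to my new DataFrame again.

Say my frame is named with_prot_choice_df and my list is with_prot_choices.

If I run the following command: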

with_prot_choice_df = with_df[with_df[0].isin(with_prot_choices)]

then each matching row is kept only once (as if the list contained only unique values).
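A minimal sketch of that behaviour, using the small example frame and list given further down in this question (the names df, lst, mask and filtered are only illustrative). isin builds one True/False flag per row, so how many times a value occurs in the list cannot change the result:

```python
import pandas as pd

# Example frame and list from the question.
df = pd.DataFrame({"col1": ["a", "a", "b", "c", "d"],
                   "col2": [1, 6, 2, 3, 4]})
lst = ["a", "b", "a", "a"]

# isin yields a boolean mask with one entry per row of df;
# repeated entries in lst do not repeat any row.
mask = df["col1"].isin(lst)
filtered = df[mask]

print(filtered)  # three rows (a/1, a/6, b/2), not the seven that are wanted
```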

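I don't want to do this with a for loop, because I will repeat this process many times and that would be very slow. Any suggestions are greatly appreciated. Thanks.

I am adding an example here:

Say my DataFrame is: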

col1   col2
a      1
a      6
b      2
c      3
d      4

My list is: lst = ['a', 'b', 'a', 'a']

I want my new DataFrame, new_df, to be:

col1   col2
a      1
a      6
b      2
a      1
a      6
a      1
a      6

Answer 1

It looks like you need reindex:

df.set_index('col1').reindex(lst).reset_index()
Out[224]: 
  col1  col2
0    a     1
1    b     2
2    a     1
3    a     1
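One caveat worth sketching: reindex only repeats rows this way when the column used as the index has no duplicate labels. The snippet below assumes a variant of the frame with unique col1 values (unique_df, an assumption made to match the output above) and then shows that the frame from the question, where 'a' appears twice, makes reindex raise a ValueError instead:

```python
import pandas as pd

lst = ["a", "b", "a", "a"]

# Assumed variant of the example in which col1 has no repeated values.
unique_df = pd.DataFrame({"col1": ["a", "b", "c", "d"],
                          "col2": [1, 2, 3, 4]})
repeated = unique_df.set_index("col1").reindex(lst).reset_index()
print(repeated)

# The frame from the question has two rows with col1 == "a",
# so the index is not unique and reindex refuses to run.
dup_df = pd.DataFrame({"col1": ["a", "a", "b", "c", "d"],
                       "col2": [1, 6, 2, 3, 4]})
try:
    dup_df.set_index("col1").reindex(lst)
    reindex_failed = False
except ValueError as err:
    reindex_failed = True
    print("reindex failed:", err)
```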

Update (for a col1 that contains duplicate values, as in the example added to the question)

df.merge(pd.DataFrame({'col1':lst}).reset_index()).sort_values('index').drop(columns='index')
Out[236]: 
  col1  col2
0    a     1
3    a     6
6    b     2
1    a     1
4    a     6
2    a     1
5    a     6
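For repeated use, the same merge idea can be wrapped in a small helper. This is only a sketch: repeat_rows_by_list and the helper columns _pos and _row are hypothetical names, not part of pandas. It sorts on both the list position and the original row position, so the row order does not depend on how the sort breaks ties:

```python
import pandas as pd

def repeat_rows_by_list(df, column, values):
    """Return df's rows matching each entry of `values`, once per entry,
    in the order of `values` (hypothetical helper, not a pandas API)."""
    # Position of every list entry, duplicates included.
    order = pd.DataFrame({column: list(values), "_pos": range(len(values))})
    # Remember the original row order so ties are resolved deterministically.
    rows = df.assign(_row=range(len(df)))
    merged = rows.merge(order, on=column, how="inner")
    return (merged.sort_values(["_pos", "_row"])
                  .drop(columns=["_pos", "_row"])
                  .reset_index(drop=True))

df = pd.DataFrame({"col1": ["a", "a", "b", "c", "d"],
                   "col2": [1, 6, 2, 3, 4]})
lst = ["a", "b", "a", "a"]

new_df = repeat_rows_by_list(df, "col1", lst)
print(new_df)
```

Unlike the one-liner above, this resets the index of the result to 0..n-1; drop the final reset_index call if the merged row labels should be kept.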
