Question

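I have data in a pandas DataFrame. One column contains customer_id values, and these IDs are not unique. I also have a list of selected customer IDs (the list contains no duplicates). I want to create a new DataFrame based on the IDs in the list, containing all rows for each ID in the list. All the data is read in as str. Here is my code: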

for i in range(len(df_ALL)):
    if df_ALL.loc[i, "Customer_ID"] in ID_list:
        df_Sub = df_Sub.append(df_ALL.iloc[i, :])

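It runs when I try it on a small, simple file. However, when I run it on the real data, it returns "KeyError: 'the label [2666] is not in the [index]'". I have only been using Python/pandas for about 4-5 months. I tried to research a solution to this problem, but I couldn't find anything I understood. If there is a better way to achieve my goal, I am open to learning it.

Thanks in advance.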

Answer 1

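Since you haven't posted any data or code, I'll demonstrate how the following should work for you. You can pass a list to isin, which returns a boolean mask that you can use to filter your df; there is no need to loop and append the rows of interest. Your code is probably failing (I'm guessing, since I don't have your data) because you have run off the end of the index, or because your index doesn't contain that particular label value.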

In [147]:

import numpy as np
import pandas as pd

customer_list = ['Microsoft', 'Google', 'Facebook']
df = pd.DataFrame({'Customer': ['Microsoft', 'Microsoft', 'Google', 'Facebook', 'Google', 'Facebook', 'Apple', 'Apple'], 'data': np.random.randn(8)})
df
Out[147]:
    Customer      data
0  Microsoft  0.669051
1  Microsoft  0.392646
2     Google  1.534285
3   Facebook -1.204585
4     Google  1.050301
5   Facebook  0.492487
6      Apple  1.471614
7      Apple  0.762598
In [148]:

df['Customer'].isin(customer_list)
Out[148]:
0     True
1     True
2     True
3     True
4     True
5     True
6    False
7    False
Name: Customer, dtype: bool
In [149]:

df[df['Customer'].isin(customer_list)]
Out[149]:
    Customer      data
0  Microsoft  0.669051
1  Microsoft  0.392646
2     Google  1.534285
3   Facebook -1.204585
4     Google  1.050301
5   Facebook  0.492487
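
To illustrate the guess about the KeyError, here is a minimal sketch with made-up data (the frame contents, the index values, and the names df_all, id_list and df_sub are all assumptions for illustration, not the asker's real data). If the index has gaps, for example because rows were dropped or filtered earlier, then looping over range(len(df)) and calling .loc asks for labels that do not exist. The isin mask does not depend on the index labels, so it avoids the problem:

```python
import pandas as pd

# Hypothetical data: the index has gaps (labels 2, 3, 4 and 6 are missing),
# as can happen after rows were dropped or filtered earlier.
df_all = pd.DataFrame(
    {"Customer_ID": ["a1", "b2", "a1", "c3"], "value": ["10", "20", "30", "40"]},
    index=[0, 1, 5, 7],
)
id_list = ["a1", "c3"]

# range(len(df_all)) yields 0, 1, 2, 3, but .loc looks rows up by LABEL.
# Label 2 is not in the index, so this lookup raises KeyError.
try:
    df_all.loc[2, "Customer_ID"]
    raised = False
except KeyError:
    raised = True

# The boolean mask is built from the column values, so gaps in the index
# do not matter. The original index labels are kept on the result.
df_sub = df_all[df_all["Customer_ID"].isin(id_list)]

# Optional: renumber the rows 0..n-1 if positional-looking labels are wanted.
df_sub_reset = df_sub.reset_index(drop=True)
```

If you do need a loop for some other reason, iloc (position-based) rather than loc (label-based) would avoid this particular error, but the mask is shorter and much faster on large frames.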
