Apply a custom function over columns in a pandas DataFrame

Question

I want to do something equivalent to:

Select x,y,z from data where f(x, y);

f is my custom function; it looks at the values of specific columns in a row and returns True or False. I tried the following:

df = df.ix[_is_detection_in_window(df['Product'], df['CreatedDate'])== True]

But I got:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

I don't think it iterates over the rows. I also tried:

i = 0
for index, row in df.iterrows():
    if _is_detection_in_window(row['Product'], row['CreatedDate']):
        print 'in range '
        new_df.iloc[i] = row
        i += 1
df = new_df

But I get:

IndexError: single positional indexer is out-of-bounds

Answer 1

Your function does not seem to accept a Series, but you can adapt it with np.vectorize:

v = np.vectorize(_is_detection_in_window)
df = df.loc[v(df['Product'], df['CreatedDate'])]

Also, you should avoid df.ix, which has been deprecated since pandas 0.20 (and was removed in pandas 1.0).
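To make the np.vectorize approach concrete, here is a minimal sketch. The body of _is_detection_in_window, the WINDOW_START constant, and the sample data are all made up for illustration (the real function is not shown in the question); the only assumption that matters is that the function takes scalar values from one row and returns a bool.

```python
import numpy as np
import pandas as pd

# Hypothetical window start, only for this sketch.
WINDOW_START = "2017-01-15"

def _is_detection_in_window(product, created_date):
    # Hypothetical stand-in for the asker's function: it receives the
    # scalar values of one row and returns a plain bool.
    # ISO-formatted date strings compare correctly as text.
    return product == "A" and created_date >= WINDOW_START

df = pd.DataFrame({
    "Product": ["A", "B", "A"],
    "CreatedDate": ["2017-01-10", "2017-01-20", "2017-02-01"],
})

# np.vectorize wraps the scalar function so it can be called on whole
# columns; it still loops in Python, so it is a convenience, not a speedup.
v = np.vectorize(_is_detection_in_window)
mask = v(df["Product"], df["CreatedDate"])

# A boolean array passed to .loc keeps only the rows where it is True.
filtered = df.loc[mask]
print(filtered)
```

With this sample data only the last row passes both conditions, so filtered contains a single row.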

Answer 2

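I'm not sure what your function looks like, but I assume it returns a list of bools with one entry per row of df:

df = df.iloc[_is_detection_in_window(df['Product'], df['CreatedDate']), :]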
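As a rough sketch of what this answer assumes, the version of _is_detection_in_window below takes whole columns and returns a list of bools, one per row. The function body, the WINDOW_START constant, and the sample data are hypothetical and only serve to show the indexing.

```python
import pandas as pd

# Hypothetical window start, only for this sketch.
WINDOW_START = "2017-01-15"

def _is_detection_in_window(products, created_dates):
    # Hypothetical column-wise variant: walks both columns in parallel
    # and returns one bool per row as a plain Python list.
    return [
        p == "A" and d >= WINDOW_START
        for p, d in zip(products, created_dates)
    ]

df = pd.DataFrame({
    "Product": ["A", "B", "A"],
    "CreatedDate": ["2017-01-10", "2017-01-20", "2017-02-01"],
})

mask = _is_detection_in_window(df["Product"], df["CreatedDate"])

# .iloc accepts a boolean list whose length matches the number of rows.
filtered = df.iloc[mask, :]
print(filtered)
```

Note that .iloc takes a boolean list or array here; if the function returned a boolean Series instead, df.loc[mask] would be the usual way to index.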