Question

I have this real estate data:

neighborhood  type_property  type_negotiation  price
Smallville       house           rent         2000
Oakville       apartment       for sale       100000
Smallville       house         for sale       115000
King Bay         house         for sale       250000
King Bay       apartment         rent         1500
Oakville       apartment       for sale       95000
King Bay         house         for sale       300000
King Bay         house         for sale       175000
...

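I have this groupby, which picks out the listings in the dataset that are houses for sale and then returns, in a new dataframe called df_breakdown, the 10th and 90th percentiles and the count of those houses for each neighborhood. The code and its result look like this: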

new_df  = (Stock.loc[(Stock.tipo_negocio == 'Arriendo')
             & (Stock.tipo_propiedad == 'Casa')]
      .groupby('comuna')
          .describe(percentiles=[0.1,0.9])
          ['precio_uf'][['10%','90%','count']]
          .rename(columns={'count':'Quantity',
                           '10%':'tenthpercentile',
                           '90%':'ninetiethpercentile'}))

neighborhood tenthpercentile  ninetiethpercentile  Quantity
King Bay         150000.0             275000.0         3
Smallville        99000.0             120000.0         8
Oakville          45000.0             160000.0         6
...
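
For reference, here is a minimal sketch of the same kind of summary written against the English column names of the sample table above (neighborhood, type_property, type_negotiation, price). The helper names p10 and p90 and the variable names are made up for illustration, and the numbers it produces come only from the eight sample rows, so they will not match the breakdown shown above.

```python
import pandas as pd

# Sample rows copied from the question; the real dataset is larger.
df = pd.DataFrame({
    "neighborhood": ["Smallville", "Oakville", "Smallville", "King Bay",
                     "King Bay", "Oakville", "King Bay", "King Bay"],
    "type_property": ["house", "apartment", "house", "house",
                      "apartment", "apartment", "house", "house"],
    "type_negotiation": ["rent", "for sale", "for sale", "for sale",
                         "rent", "for sale", "for sale", "for sale"],
    "price": [2000, 100000, 115000, 250000, 1500, 95000, 300000, 175000],
})


def p10(s):
    """10th percentile of a price series (hypothetical helper)."""
    return s.quantile(0.10)


def p90(s):
    """90th percentile of a price series (hypothetical helper)."""
    return s.quantile(0.90)


# Keep only houses that are for sale.
houses = df[(df["type_property"] == "house")
            & (df["type_negotiation"] == "for sale")]

# One row per neighborhood: 10th/90th percentile and number of listings.
summary = houses.groupby("neighborhood")["price"].agg(
    tenthpercentile=p10,
    ninetiethpercentile=p90,
    Quantity="count",
)
print(summary)
```

Here neighborhood ends up as the index of summary; calling summary.reset_index() turns it back into an ordinary column.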

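I now want to bring this information back to my original real estate dataset and filter out every listing that is a house for sale priced above the 90th percentile or below the 10th percentile calculated for its neighborhood. For example, I would like to filter out a house in King Bay priced at 300000, and also a house in Smallville priced below 45000, so that the original dataset looks like this: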

neighborhood  type_property  type_negotiation  price
Smallville       house         for sale       115000
King Bay         house         for sale       250000
King Bay         house         for sale       175000
...
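
One possible way to do this, sketched under assumptions rather than offered as a definitive answer: merge the per-neighborhood bounds back onto the listings and keep the rows whose price falls inside them. The sketch assumes the English column names of the sample table, rebuilds df_breakdown by hand from the values shown above with neighborhood as a regular column, and assumes (based on the desired output) that only houses for sale should remain.

```python
import pandas as pd

# Sample rows copied from the question; the real dataset is larger.
df = pd.DataFrame({
    "neighborhood": ["Smallville", "Oakville", "Smallville", "King Bay",
                     "King Bay", "Oakville", "King Bay", "King Bay"],
    "type_property": ["house", "apartment", "house", "house",
                      "apartment", "apartment", "house", "house"],
    "type_negotiation": ["rent", "for sale", "for sale", "for sale",
                         "rent", "for sale", "for sale", "for sale"],
    "price": [2000, 100000, 115000, 250000, 1500, 95000, 300000, 175000],
})

# Breakdown values as shown in the question (assumed to come from the
# groupby/describe step run on the full dataset).
df_breakdown = pd.DataFrame({
    "neighborhood": ["King Bay", "Smallville", "Oakville"],
    "tenthpercentile": [150000.0, 99000.0, 45000.0],
    "ninetiethpercentile": [275000.0, 120000.0, 160000.0],
    "Quantity": [3, 8, 6],
})

# Keep only houses that are for sale.
houses = df[(df["type_property"] == "house")
            & (df["type_negotiation"] == "for sale")]

# Attach each neighborhood's bounds to its listings.
bounds = df_breakdown[["neighborhood", "tenthpercentile", "ninetiethpercentile"]]
merged = houses.merge(bounds, on="neighborhood", how="left")

# Keep listings priced between the 10th and 90th percentile (inclusive).
in_range = ((merged["price"] >= merged["tenthpercentile"])
            & (merged["price"] <= merged["ninetiethpercentile"]))
filtered = merged.loc[in_range, list(df.columns)].reset_index(drop=True)
print(filtered)
```

If df_breakdown still has neighborhood as its index, as it would straight after a groupby, calling df_breakdown.reset_index() before the merge should make the on="neighborhood" join work. An alternative that skips the intermediate table is to compute the bounds per row with groupby(...).transform and compare against those directly.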

Thanks in advance for your help.

How do I filter out entries with different values in a dataframe?

0 answers: