I have this real estate data:
neighborhood  type_property  type_negotiation  price
Smallville    house          rent              2000
Oakville      apartment      for sale          100000
Smallville    house          for sale          115000
King Bay      house          for sale          250000
King Bay      apartment      rent              1500
Oakville      apartment      for sale          95000
King Bay      house          for sale          300000
King Bay      house          for sale          175000
...
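I have this groupby, which selects the rows in the dataset that are houses for sale and then returns, for each neighborhood, the 10th and 90th percentiles of price and the number of such houses, in a new dataframe called df_breakdown (assigned to new_df in the code). The result looks like this: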
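# Column names come from the real (Spanish-language) dataset:
# comuna = neighborhood, tipo_propiedad = type_property,
# tipo_negocio = type_negotiation, precio_uf = price.
# 'Arriendo' means rent and 'Casa' means house.
new_df = (Stock.loc[(Stock.tipo_negocio == 'Arriendo')
                    & (Stock.tipo_propiedad == 'Casa')]
          .groupby('comuna')
          .describe(percentiles=[0.1, 0.9])
          ['precio_uf'][['10%', '90%', 'count']]
          .rename(columns={'count': 'Quantity',
                           '10%': 'tenthpercentile',
                           '90%': 'ninetiethpercentile'}))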
neighborhood  tenthpercentile  ninetiethpercentile  Quantity
King Bay      150000.0         275000.0             3
Smallville    99000.0          120000.0             8
Oakville      45000.0          160000.0             6
...
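I now want to take this information back to my original real estate dataset and filter out every listing that is a house for sale priced above the 90th percentile or below the 10th percentile computed for its neighborhood. For example, I want to filter out a house in King Bay priced at 300000, and also a house in Smallville priced below 45000. The original dataset should then look like this: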
neighborhood  type_property  type_negotiation  price
Smallville    house          for sale          115000
King Bay      house          for sale          250000
King Bay      house          for sale          175000
...
Thanks in advance for your help.
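
To make the goal concrete, here is a minimal sketch of one way the filter could be written. It is not taken from the code above: instead of merging the breakdown dataframe back in, it computes the per-neighborhood percentiles with groupby(...).transform so each row carries its own bounds. It assumes the English column names from the sample table (neighborhood, type_property, type_negotiation, price) and uses only the eight sample rows, so the bounds differ from the breakdown table, which was computed on the full dataset. The names is_target, low, high, keep and filtered are made up for this illustration.

```python
import pandas as pd

# The eight sample rows from the question (English column names assumed).
df = pd.DataFrame(
    {
        "neighborhood": ["Smallville", "Oakville", "Smallville", "King Bay",
                         "King Bay", "Oakville", "King Bay", "King Bay"],
        "type_property": ["house", "apartment", "house", "house",
                          "apartment", "apartment", "house", "house"],
        "type_negotiation": ["rent", "for sale", "for sale", "for sale",
                             "rent", "for sale", "for sale", "for sale"],
        "price": [2000, 100000, 115000, 250000, 1500, 95000, 300000, 175000],
    }
)

# Rows the percentile filter applies to: houses that are for sale.
is_target = (df["type_property"] == "house") & (df["type_negotiation"] == "for sale")
houses = df.loc[is_target]

# Per-neighborhood 10th and 90th percentiles, broadcast back to every row
# of its group so they line up with the listings by index.
grouped_prices = houses.groupby("neighborhood")["price"]
low = grouped_prices.transform(lambda s: s.quantile(0.10))
high = grouped_prices.transform(lambda s: s.quantile(0.90))

# Keep only the houses whose price lies within its neighborhood's bounds
# (both ends inclusive).
keep = houses["price"].between(low, high)
filtered = houses[keep]

print(filtered)
```

If the rentals and apartments should stay in the result untouched, the mask could be extended to the full frame instead, for example df[~is_target | keep.reindex(df.index, fill_value=False)].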