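Creating a calculator from the features in a pandas DataFrame

Time: 2019-06-19 12:13:27

Tags: python pandas

I want to build a calculator that returns the average price of an Airbnb room when given the neighbourhood and the number of beds, bathrooms, and bedrooms as input. Neighbourhood, beds, bedrooms, bathrooms, and price are all features in the dataset. Please help.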

2 answers:

Answer 0: (score: 0)

It would help if you provided more details and asked a specific question.

The average price can be computed with pandas as follows:

import pandas as pd

df = pd.read_csv("path_to_file.csv")  # assuming the file has all the relevant fields

def calculate_price(row):
    return row['price_per_room'] * row['number_of_rooms'] * row['number_of_nights']

# axis=1 passes each row (not each column) to calculate_price
df['price'] = df.apply(calculate_price, axis=1)

average_price = df['price'].mean()

print(f"The average price is {average_price}")

## use group by to aggregate across categories
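The group-by hint in the comment above could be sketched as follows. The column names `price_per_room`, `number_of_rooms`, and `number_of_nights` come from the answer; the `neighbourhood` column and the sample values are assumptions made up for illustration, and the row-wise `apply` is swapped for an equivalent vectorised multiplication.

```python
import pandas as pd

# Hypothetical sample data: the three pricing columns follow the answer above,
# the 'neighbourhood' column is an assumed category to group by.
df = pd.DataFrame({
    "neighbourhood": ["north", "north", "south"],
    "price_per_room": [50.0, 70.0, 40.0],
    "number_of_rooms": [2, 1, 3],
    "number_of_nights": [3, 2, 1],
})

# Vectorised equivalent of applying calculate_price to every row.
df["price"] = df["price_per_room"] * df["number_of_rooms"] * df["number_of_nights"]

# Average total price within each neighbourhood.
mean_by_neighbourhood = df.groupby("neighbourhood")["price"].mean()
print(mean_by_neighbourhood)
```

Selecting the `price` column before calling `mean()` keeps the aggregation restricted to the one numeric column of interest.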

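Hope this helps!

Answer 1: (score: 0)

I'm not sure this is exactly what you need (you should describe your problem better: add sample data, the desired output, your code...), but groupby could be useful, for example: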

import pandas as pd

df = pd.DataFrame({
    'neighbourhood': ['nice', 'not so nice', 'nice', 'awesome', 'not so nice'],
    'room_type': ['a', 'a', 'b', 'b', 'a'],
    'beds': [7,2,1,6,6],
    'bedrooms': [3,1,1,3,2],
    'bathrooms': [2,1,1,1,1],
    'price': [220,100,125,320,125]
})

# select the price column before aggregating so non-numeric columns are ignored
print('Mean of all prices:\n', df['price'].mean())
print('\nMean grouped by neighbourhood:\n', df.groupby('neighbourhood')['price'].mean())
print('\nMean grouped by more cols:\n', df.groupby(['neighbourhood', 'beds', 'bedrooms'])['price'].mean())

Output:

Mean of all prices:
 178.0

Mean grouped by neighbourhood:
 neighbourhood
awesome        320.0
nice           172.5
not so nice    112.5

Mean grouped by more cols:
 neighbourhood  beds  bedrooms
awesome         6     3           320
nice            1     1           125
                7     3           220
not so nice     2     1           100
                6     2           125

You can also filter the DataFrame before applying the grouping, for example:

# select the requested rows with loc[...] and then apply groupby
df_filtered = df.loc[(df['neighbourhood']=='nice') & (df['beds']==1)]
df_filtered.groupby('neighbourhood')['price'].mean()
# or the same on one line:
df.loc[(df['neighbourhood']=='nice') & (df['beds']==1)].groupby('neighbourhood')['price'].mean()

Your function (from the last comment) might look like this:

def calculate_price(air_df):
    # input() already returns a string, so only the numeric fields need converting
    a = input("Enter the Neighbourhood : ")
    b = input("Enter the Room Type : ")
    c = float(input("Enter number of Beds : "))
    d = float(input("Enter number of Bedrooms : "))
    e = float(input("Enter number of Bathrooms : "))
    # keep only the rows matching every answer, then average their prices
    return air_df.loc[
        (air_df['neighbourhood']==a) &
        (air_df['room_type']==b) &
        (air_df['beds']==c) &
        (air_df['bedrooms']==d) &
        (air_df['bathrooms']==e)
    ].groupby('neighbourhood')['price'].mean()
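A variant of the function above that takes its filters as arguments instead of reading them with `input()` may be easier to test and reuse. This is only a sketch under assumptions: the name `average_price` and its keyword-argument interface are hypothetical, the sample data is the one from this answer, and returning `None` when nothing matches is a design choice rather than anything the question requires.

```python
import pandas as pd

def average_price(air_df, **criteria):
    """Return the mean price of rows matching every column=value pair, or None."""
    # Start with every row selected, then narrow the mask one criterion at a time.
    mask = pd.Series(True, index=air_df.index)
    for column, value in criteria.items():
        mask &= air_df[column] == value
    matches = air_df.loc[mask, "price"]
    # An empty selection has no meaningful mean, so report it explicitly.
    return None if matches.empty else float(matches.mean())

# Same sample data as in the answer above.
df = pd.DataFrame({
    "neighbourhood": ["nice", "not so nice", "nice", "awesome", "not so nice"],
    "room_type": ["a", "a", "b", "b", "a"],
    "beds": [7, 2, 1, 6, 6],
    "bedrooms": [3, 1, 1, 3, 2],
    "bathrooms": [2, 1, 1, 1, 1],
    "price": [220, 100, 125, 320, 125],
})

print(average_price(df, neighbourhood="not so nice", room_type="a"))  # 112.5
print(average_price(df, neighbourhood="awesome", room_type="a"))      # None
```

Because the filters are optional keyword arguments, the same function answers both narrow queries (all five features) and broad ones (neighbourhood only).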