I have many data frames, and I want to apply the same filter to all of them without copy-pasting the filter condition every time.
This is my code so far:
df_list_2019 = [df_spain_2019, df_amsterdam_2019, df_venice_2019, df_sicily_2019]
for data in df_list_2019:
    data = data[['host_since','host_response_time','host_response_rate',
                 'host_acceptance_rate','host_is_superhost','host_total_listings_count',
                 'host_has_profile_pic','host_identity_verified',
                 'neighbourhood','neighbourhood_cleansed','zipcode','latitude','longitude','property_type','room_type',
                 'accommodates','bathrooms','bedrooms','beds','amenities','price','weekly_price',
                 'monthly_price','cleaning_fee','guests_included','extra_people','minimum_nights','maximum_nights',
                 'minimum_nights_avg_ntm','has_availability','availability_30','availability_60','availability_90',
                 'availability_365','number_of_reviews','number_of_reviews_ltm','review_scores_rating',
                 'review_scores_checkin','review_scores_communication','review_scores_location', 'review_scores_value',
                 'instant_bookable','is_business_travel_ready','cancellation_policy','reviews_per_month'
                 ]]
However, this does not apply the filter to the data frames. How can I change the code so that it does?
Thanks
Answer 0 (score: 1)
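The filter (column selection) is in fact applied to every DataFrame; you simply throw the result away by rebinding the name data on each iteration of the loop.
You need to store the results somewhere, for example in a list.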
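The code that accompanied this answer does not appear here, so the following is only a minimal sketch of the idea of storing the results in a list. The toy DataFrames and the names df_a, df_b, df_list, columns_to_keep and filtered are hypothetical stand-ins for the question's data, not the answerer's original code.

```python
import pandas as pd

# Hypothetical stand-ins for the question's DataFrames and column list.
df_a = pd.DataFrame({"price": [100, 120], "beds": [1, 2], "notes": ["x", "y"]})
df_b = pd.DataFrame({"price": [80], "beds": [3], "notes": ["z"]})
df_list = [df_a, df_b]
columns_to_keep = ["price", "beds"]

# Build a new list holding the column-filtered results.
# The original DataFrames (df_a, df_b) are left unchanged.
filtered = [df[columns_to_keep] for df in df_list]
```

Note that the original names (here df_a and df_b) still refer to the unfiltered frames; the filtered versions are only reachable through the new list.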
Answer 1 (score: 0)
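When you write var = new_value, you do not change the original object; you only make the variable refer to a new object.
If you want to change the data frames held in df_list_2019, you have to use a method that accepts inplace=True. Here you could use drop: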
keep = set(['host_since','host_response_time','host_response_rate',
            'host_acceptance_rate','host_is_superhost','host_total_listings_count',
            'host_has_profile_pic','host_identity_verified',
            'neighbourhood','neighbourhood_cleansed','zipcode','latitude','longitude','property_type','room_type',
            'accommodates','bathrooms','bedrooms','beds','amenities','price','weekly_price',
            'monthly_price','cleaning_fee','guests_included','extra_people','minimum_nights','maximum_nights',
            'minimum_nights_avg_ntm','has_availability','availability_30','availability_60','availability_90',
            'availability_365','number_of_reviews','number_of_reviews_ltm','review_scores_rating',
            'review_scores_checkin','review_scores_communication','review_scores_location', 'review_scores_value',
            'instant_bookable','is_business_travel_ready','cancellation_policy','reviews_per_month'
            ])
for data in df_list_2019:
    data.drop(columns=[col for col in data.columns if col not in keep], inplace=True)
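But beware: pandas experts recommend preferring the df = df. ... idiom over df...(..., inplace=True), because it allows operations to be chained. So you should ask yourself whether you could use @timgeb's answer instead. In any case, this should meet your requirements.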
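To make the distinction drawn in this answer concrete, here is a small sketch on toy data. The frame and the names frames, rebind_columns and inplace_columns are hypothetical and chosen only for illustration; the column "notes" plays the role of an unwanted column.

```python
import pandas as pd

# Hypothetical toy frame standing in for one of the question's DataFrames.
frames = [pd.DataFrame({"price": [100], "notes": ["x"]})]

# Rebinding the loop variable: the list still holds the original object,
# so the selection result is lost at the end of each iteration.
for data in frames:
    data = data[["price"]]
rebind_columns = list(frames[0].columns)

# In-place drop: the object held by the list is itself modified.
for data in frames:
    data.drop(columns=["notes"], inplace=True)
inplace_columns = list(frames[0].columns)
```

After the first loop the frame in the list still has both columns; only after the in-place drop does it lose "notes".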