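The code below lets me determine the most common main dish, and the most common way of preparing that dish, for each US region. It uses data from 'thanksgiving-2015-poll-data.csv', available at https://github.com/fivethirtyeight/data/tree/master/thanksgiving-2015. I believe pivot_table might offer a more efficient way to get the same information, but I can't figure out how to do it. Can anyone offer any insight? Here is the code I used to get this information, but I feel it isn't the best (fastest) approach.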
import pandas as pd
data = pd.read_csv('thanksgiving-2015-poll-data.csv', encoding="Latin-1")
regions = data['US Region'].value_counts().keys()
main_dish = data['What is typically the main dish at your Thanksgiving dinner?']
main_dish_prep = data['How is the main dish typically cooked?']
regional_entire_meal_data_rows = []
for region in regions:
    # Boolean mask selecting the respondents from this region
    is_in_region = data['US Region'] == region
    # value_counts() sorts by frequency, so the first key is the most common dish
    most_common_regional_dish = main_dish[is_in_region].value_counts().keys().tolist()[0]
    is_region_and_most_common_dish = (is_in_region) & (main_dish == most_common_regional_dish)
    # Most common preparation among respondents who serve that dish in this region
    most_common_regional_dish_prep_type = main_dish_prep[is_region_and_most_common_dish].value_counts().keys().tolist()[0]
    regional_entire_meal_data_rows.append((region, most_common_regional_dish, most_common_regional_dish_prep_type))
labels = ['US Region', 'Most Common Main Dish', 'Most Common Prep Type for Main Dish']
regional_main_dish_data = pd.DataFrame(regional_entire_meal_data_rows, columns=labels)
full_meal_message = '''\n\nThe table below shows a breakdown of the most common
full Thanksgiving meal broken down by region.\n'''
print(full_meal_message)
print(regional_main_dish_data)
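
For comparison, here is a minimal sketch of how the loop might be replaced with tabulation. It uses pd.crosstab, a close relative of pivot_table, rather than pivot_table itself. The helper name most_common_meal_by_region and the REGION/DISH/PREP constants are my own, introduced only for illustration; the column names are the ones from the code above. I have not timed it against the loop, so treat any speed benefit as an assumption.

```python
import pandas as pd

# Column names taken from the question's code.
REGION = 'US Region'
DISH = 'What is typically the main dish at your Thanksgiving dinner?'
PREP = 'How is the main dish typically cooked?'


def most_common_meal_by_region(data):
    """Hypothetical helper: top main dish and its top prep type per region."""
    # Rows are regions, columns are dishes, cells are respondent counts.
    # idxmax(axis=1) picks the dish with the highest count in each row.
    top_dish = pd.crosstab(data[REGION], data[DISH]).idxmax(axis=1)

    # Keep only respondents whose dish is the top dish of their own region.
    is_top = data[DISH] == data[REGION].map(top_dish)

    # Same tabulation, now region x prep type, restricted to those respondents.
    top_prep = pd.crosstab(data.loc[is_top, REGION],
                           data.loc[is_top, PREP]).idxmax(axis=1)

    result = top_dish.to_frame('Most Common Main Dish').join(
        top_prep.rename('Most Common Prep Type for Main Dish'))
    return result.reset_index()

# Intended usage with the frame loaded above:
# print(most_common_meal_by_region(data))
```

Two differences from the loop are worth noting: the rows come out sorted by region name rather than by how many respondents each region has, and ties are broken by idxmax (first column label) rather than by the ordering of value_counts, so results could differ where two dishes or prep types are tied.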