Question

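I am building an Excel file for a web analytics dashboard, and my current problem is: how do I group countries into regions? For example: Europe, Middle East, Africa, Asia-Pacific, the Americas.

I have two Excel files. The first has the columns: account_id, external/internal, and country_list.

The second file contains a list of countries along with the corresponding region (EMEA, APAC, etc.), in the columns countries and regions.

I want to compare the country_list column in file 1 with the countries column in file 2 and, where the values match, take the value from the region column. For example: if country_list and countries both contain "Germany", the value should be EMEA.

So far, I have started as follows: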

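import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from pandas import ExcelWriter
from pandas import ExcelFile

# Note: the keyword is sheet_name; the older spelling sheetname was renamed in pandas
accounts = pd.read_excel('accountids_with_properties.xlsx', sheet_name='accountids_with_properties')
CountryGroups = pd.read_excel('country_list.xlsx', sheet_name='country_list')

def groupCountry(col):
    # col is the country_list column from file 1
    for country in col:
        # Check whether this country also appears in file 2
        if country in CountryGroups['countries'].values:
            pass  # stuck here: how do I pick up the matching region?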

Answer 1

accounts.merge(CountryGroups, how='left', left_on='country_list', right_on='countries')
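
To see what this left merge produces, here is a minimal sketch on small in-memory frames standing in for the two Excel files. The sample rows (including the made-up country "Atlantis") and the final drop of the duplicate column are assumptions for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the two Excel files
accounts = pd.DataFrame({
    "account_id": [1, 2, 3],
    "external/internal": ["external", "internal", "external"],
    "country_list": ["Germany", "Japan", "Atlantis"],
})
CountryGroups = pd.DataFrame({
    "countries": ["Germany", "Japan", "Brazil"],
    "regions": ["EMEA", "APAC", "Americas"],
})

# how='left' keeps every account row; unmatched countries get NaN in regions
merged = accounts.merge(
    CountryGroups, how="left", left_on="country_list", right_on="countries"
)

# After the merge, countries just repeats country_list, so drop it
merged = merged.drop(columns="countries")
print(merged)
```

Accounts whose country has no match in the second file keep a missing value in regions, so they can be listed afterwards with merged[merged["regions"].isna()].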

If you are only interested in the country: region pairs, you can also do:

CountryGroups[CountryGroups.countries.isin(set(accounts.country_list))]
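
A small sketch of what this filter returns, again on hypothetical sample rows with the question's column names. Note that it selects rows of CountryGroups; it does not attach a region to each account.

```python
import pandas as pd

# Hypothetical sample data; Germany appears twice in the accounts file
accounts = pd.DataFrame({"country_list": ["Germany", "Japan", "Germany"]})
CountryGroups = pd.DataFrame({
    "countries": ["Germany", "Japan", "Brazil"],
    "regions": ["EMEA", "APAC", "Americas"],
})

# Boolean mask: True where the country occurs anywhere in accounts.country_list
mask = CountryGroups["countries"].isin(set(accounts["country_list"]))

# Keep only the country/region rows that are actually used by some account
pairs = CountryGroups[mask]
print(pairs)
```

Each country shows up once in the result, however many accounts refer to it, and Brazil is dropped because no account uses it.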

Answer 2

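Thanks for the suggestions. I ended up reading the country file as a list, comparing it against the larger accounts file, and appending the region to a new column in the accounts file.

The code is as follows: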

# Column names follow the question: country_list, countries, regions
accounts['Region'] = None
for acc_index, row in accounts.iterrows():
    for _, entry in CountryGroups.iterrows():
        if row['country_list'] == entry['countries']:
            print(entry['regions'])
            # Write through .at: assigning to `row` only changes a copy
            accounts.at[acc_index, 'Region'] = entry['regions']
            break  # stop at the first matching country
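
For larger files, the nested loop can be slow. As an alternative sketch (not the approach used in this answer), the same new column can be filled with Series.map. It assumes each country appears only once in the country file; the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with the column names from the question
accounts = pd.DataFrame({
    "account_id": [1, 2, 3],
    "country_list": ["Germany", "Brazil", "Atlantis"],
})
CountryGroups = pd.DataFrame({
    "countries": ["Germany", "Japan", "Brazil"],
    "regions": ["EMEA", "APAC", "Americas"],
})

# Lookup Series: index is the country, value is its region
# (assumes countries are unique in CountryGroups)
region_by_country = CountryGroups.set_index("countries")["regions"]

# map looks up every account's country; unknown countries become NaN
accounts["Region"] = accounts["country_list"].map(region_by_country)
print(accounts)
```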

Grouping countries by region
