Question

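I'm trying to update certain rows in a column based on a set of conditions. In the example below, I'm using a chain of if statements to replace the names in the "Country" column with shorter versions. Is there a better way to do this?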

energy['Country'] = energy['Country'].apply(lambda x: 'South Korea' if x=='Republic of Korea' 
                        else('United States' if x=='United States of America20' 
                        else('United Kingdom' if x=='United Kingdom of Great Britain and Northern Ireland'
                        else('Hong Kong' if x=='China, Hong Kong Special Administrative Region' 
                             else x))))

Answer 1

Use pd.Series.map:

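# Map each long name to the short name the question asks for.
country_map = {'Republic of Korea': 'South Korea',
               'United States of America20': 'United States',
               'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
               'China, Hong Kong Special Administrative Region': 'Hong Kong'}

# map() returns NaN for values missing from the dict, so fall back to the original name.
energy['Country'] = energy['Country'].map(country_map).fillna(energy['Country'])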
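
One thing to keep in mind with a dict lookup: Series.map returns NaN for any value that is not a key in the dict, whereas Series.replace leaves such values untouched. A minimal sketch on made-up data (the sample frame and its 'Japan' row are assumptions for illustration, not taken from the question's dataset):

```python
import pandas as pd

# Hypothetical sample data; 'Japan' has no entry in the mapping.
energy = pd.DataFrame({'Country': ['Republic of Korea',
                                   'Japan',
                                   'United States of America20']})

country_map = {'Republic of Korea': 'South Korea',
               'United States of America20': 'United States'}

# map(): values missing from the dict become NaN.
mapped = energy['Country'].map(country_map)

# replace(): values missing from the dict are kept as they are.
replaced = energy['Country'].replace(country_map)

print(mapped.tolist())
print(replaced.tolist())
```

If most names in the column should stay as they are, replace is probably the more convenient of the two.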

Answer 2

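Avoid pandas apply wherever you can; it is a hidden loop. Consider a vectorized approach such as numpy.select, where you pass whole vectors (NumPy arrays or pandas Series) to the function instead of feeding it scalar elements one at a time: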

import numpy as np

country = energy['Country']

# Each condition tests for a long name; the choice at the same position is its short form.
energy['Country'] = np.select([country == 'Republic of Korea',
                               country == 'United States of America20',
                               country == 'United Kingdom of Great Britain and Northern Ireland',
                               country == 'China, Hong Kong Special Administrative Region'],
                              ['South Korea',
                               'United States',
                               'United Kingdom',
                               'Hong Kong'],
                              default=country)  # rows matching no condition keep their name
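
To try the numpy.select approach end to end, here is a small sketch on made-up data. The sample frame is an assumption, and passing the original column as default is just one way to leave unmatched names alone (the NumPy docs describe default as a scalar, but an array that broadcasts to the result shape is accepted as well):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'Japan' matches none of the conditions.
energy = pd.DataFrame({'Country': ['Republic of Korea',
                                   'Japan',
                                   'China, Hong Kong Special Administrative Region']})

country = energy['Country']

# One boolean mask per long name, evaluated over the whole column at once.
conditions = [country == 'Republic of Korea',
              country == 'China, Hong Kong Special Administrative Region']

# The replacement at the same position as its condition.
choices = ['South Korea', 'Hong Kong']

# Rows where no condition is True take their value from `default`.
energy['Country'] = np.select(conditions, choices, default=country)

print(energy['Country'].tolist())
```

The condition list and the choice list must have the same length, so it is worth checking that no comma is missing between adjacent string literals (Python would silently join them into one string).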

How do I apply multiple if/else conditions in a DataFrame using apply and lambda?
