Using np.where()

Time: 2016-07-27 20:20:01

Tags: python pandas contains assign extraction

I am trying to assign state names to a list of college names:

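import numpy as np
import pandas as pd

df = pd.DataFrame({'College': pd.Series(['University of Michigan', 'University of Florida', 'Iowa State'])})
State = ['Michigan', 'Iowa']
# Marks every matching row with the literal string 'state', not the matched name
df['State'] = np.where(df['College'].str.contains('|'.join(State)),
                       'state', '--')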

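I want to replace the 'state' value that appears on a match with the actual name of the state. Example: University of Michigan -> Michigan (instead of 'state'). Eventually, `State` will contain all 50 states, so I can't write 50 separate `np.where` statements, one per state name.

Thanks for your help.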

2 Answers:

Answer 0: (score: 3)

You can use str.extract here instead of np.where:

In [290]: df['State'] = df['College'].str.extract('({})'.format('|'.join(State)), expand=True)

In [291]: df
Out[291]: 
                  College     State
0  University of Michigan  Michigan
1   University of Florida       NaN
2              Iowa State      Iowa
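
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same approach. Two details are additions that are not part of the answer above: `re.escape` guards against state names containing regex metacharacters, and `expand=False` returns a Series directly when there is a single capture group. The final `fillna('--')` is optional and only restores the placeholder used in the question.

```python
import re

import pandas as pd

df = pd.DataFrame({'College': ['University of Michigan',
                               'University of Florida',
                               'Iowa State']})
State = ['Michigan', 'Iowa']

# Build one alternation pattern; escape each name so it is matched literally.
pattern = '({})'.format('|'.join(re.escape(s) for s in State))

# With a single capture group, expand=False yields a Series of the matches
# (NaN where no state name is found).
df['State'] = df['College'].str.extract(pattern, expand=False)

# Optional: fall back to the question's '--' placeholder for non-matches.
df['State'] = df['State'].fillna('--')

print(df)
```

Because the pattern is built from the list, growing `State` to all 50 names requires no further code changes.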

Answer 1: (score: 1)

# Commas are required between names; without them Python concatenates
# adjacent string literals into a single string.
States = [
            'Washington', 'Wisconsin', 'West Virginia', 'Florida', 'Wyoming',
            'New Hampshire', 'New Jersey', 'New Mexico', 'National', 'North Carolina',
            'North Dakota', 'Nebraska', 'New York', 'Rhode Island', 'Nevada', 'Guam',
            'Colorado', 'California', 'Georgia', 'Connecticut', 'Oklahoma', 'Ohio', 'Kansas',
            'South Carolina', 'Kentucky', 'Oregon', 'South Dakota', 'Delaware',
            'District of Columbia', 'Hawaii', 'Puerto Rico', 'Texas', 'Louisiana',
            'Tennessee', 'Pennsylvania', 'Virginia', 'Virgin Islands', 'Alaska', 'Alabama',
            'American Samoa', 'Arkansas', 'Vermont', 'Illinois', 'Indiana', 'Iowa',
            'Arizona', 'Idaho', 'Maine', 'Maryland', 'Massachusetts', 'Utah', 'Missouri',
            'Minnesota', 'Michigan', 'Montana', 'Northern Mariana Islands', 'Mississippi',
]

state_str = '|'.join(States)
df.update(df.College.str.extract(r'(?P<State>{})'.format(state_str), expand=True))

df

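
One point worth spelling out about the `df.update` approach: `update` only writes into columns that already exist in `df`, and it skips NaN values in the extracted frame. The sketch below illustrates this under two assumptions that are mine, not the answerer's: the `State` column is pre-filled with the `'--'` placeholder from the question, and the list is shortened to two names so that one row has no match.

```python
import pandas as pd

df = pd.DataFrame({'College': ['University of Michigan',
                               'University of Florida',
                               'Iowa State']})

# update() never adds columns, so 'State' must exist beforehand.
# (Assumption: start from the question's '--' placeholder.)
df['State'] = '--'

# Shortened list for illustration only; Florida is left out on purpose.
States = ['Michigan', 'Iowa']
state_str = '|'.join(States)

# The named group (?P<State>...) makes the extracted column line up
# with df['State'] by name.
extracted = df['College'].str.extract(r'(?P<State>{})'.format(state_str),
                                      expand=True)

# Non-NaN matches overwrite the placeholder; NaN rows are left untouched.
df.update(extracted)

print(df)
```

Here the Florida row keeps `'--'` because its extracted value is NaN, while the other two rows receive the matched state name.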