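I'm just getting up to speed with pandas and can't solve one problem. I have a list of counties in New York State. If a county is one of the 5 boroughs, I want to change the county name to New York; otherwise I leave it as is. The code below gives the idea, but it is not correct.
Edit: so if the County column in the first few rows is Albany, Allegheny, Bronx before the change, it should be Albany, Allegheny, New York after the change.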
# clean up county names
# 5 boroughs must be combined to New York City
# eliminate the word county
nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]
nypopdf['County'] = ['New York' for nypopdf['County'] in nyCounties else
nypopdf['County']]
Answer 0 (score: 1)
A small mock-up:
In [44]: c = ['c', 'g']
In [45]: df = pd.DataFrame({'county': list('abccdefggh')})
In [46]: df['county'] = df['county'].where(~df['county'].isin(c), 'N')
In [47]: df
Out[47]:
  county
0      a
1      b
2      N
3      N
4      d
5      e
6      f
7      N
8      N
9      h
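So this uses pd.Series.where. The condition ~df['county'].isin(c) is True for the rows whose value is not in the list c (the leading ~ is the "not" operation). where keeps the values where the condition is True, and the second argument is the replacement used where the condition is False.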
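To make the pieces of that expression easier to see, here is a minimal sketch that builds the mask step by step on the same toy frame. The intermediate names `in_list`, `keep`, and `result` are introduced here for illustration only.

```python
import pandas as pd

c = ["c", "g"]
df = pd.DataFrame({"county": list("abccdefggh")})

# True where the value appears in the list c
in_list = df["county"].isin(c)

# ~ flips the mask: True where the value is NOT in c
keep = ~in_list

# where() keeps values where the condition is True and
# substitutes the second argument where it is False
result = df["county"].where(keep, "N")

print(result.tolist())
```

Printing `in_list` and `keep` side by side is a quick way to check that the right rows are being selected before anything is overwritten.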
Applied to your example:
nypopdf['County'] = nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York')
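or, modifying the column in place (note that this chained in-place form may not update the DataFrame in newer pandas versions that use Copy-on-Write, so the assignment form is the safer choice):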
nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York', inplace=True)
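The same replacement can probably be written without the negation. The sketch below shows two alternatives, `Series.mask` and boolean indexing with `.loc`. The sample rows and the names `is_borough` and `via_mask` are made up for illustration.

```python
import pandas as pd

nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]
# Illustrative sample rows, not the asker's real data
nypopdf = pd.DataFrame({"County": ["Albany", "Kings", "Bronx", "Erie"]})

# True for rows whose county is one of the five boroughs
is_borough = nypopdf["County"].isin(nyCounties)

# mask() is the mirror image of where(): it replaces values
# where the condition is True, so no ~ is needed
via_mask = nypopdf["County"].mask(is_borough, "New York")

# .loc with a boolean mask assigns directly into the selected rows
nypopdf.loc[is_borough, "County"] = "New York"

print(via_mask.tolist())
print(nypopdf["County"].tolist())
```

Both forms should give the same column as the where version. The `.loc` assignment writes into the DataFrame itself, so it does not rely on inplace=True.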
Full example:
import pandas as pd

nypopdf = pd.DataFrame({'County': ['Albany', 'Allegheny', 'Bronx']})
nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]
print(nypopdf)
      County
0     Albany
1  Allegheny
2      Bronx
# keep names that are not boroughs, replace the rest with 'New York'
nypopdf['County'] = nypopdf['County'].where(~nypopdf['County'].isin(nyCounties), 'New York')
print(nypopdf)
      County
0     Albany
1  Allegheny
2   New York
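The question's code comments also mention eliminating the word "county". A minimal sketch of that cleanup step follows, assuming the raw names look like "Albany County"; the real data may be formatted differently. The sample rows and the name `cleaned` are hypothetical.

```python
import pandas as pd

nyCounties = ["Kings", "Queens", "Bronx", "Richmond", "New York"]
# Hypothetical raw names; the real data may be formatted differently
nypopdf = pd.DataFrame(
    {"County": ["Albany County", "Kings County", "New York County", "Bronx"]}
)

# Drop a trailing " County" (if present) and trim stray whitespace
cleaned = nypopdf["County"].str.replace(r"\s+County$", "", regex=True).str.strip()

# Then collapse the five boroughs into a single label
nypopdf["County"] = cleaned.where(~cleaned.isin(nyCounties), "New York")

print(nypopdf["County"].tolist())
```

Stripping the suffix first matters here, because isin compares whole strings and "Kings County" would not match "Kings".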