Ignoring KeyError when subsetting a pandas DataFrame

Time: 2017-04-25 18:31:51

Tags: python pandas

I have a pandas DataFrame df with the columns city1, city2, city3, city4 and city5, and a list my_cities = ["city1", "city3", "city10"]. I want to subset df to the columns named in my_cities. When I do this:

my_cities = ["city1", "city3", "city10"]

df_my_cities = df[my_cities]

I get the error KeyError: "['city10'] not in index"

How can I tell the code to carry on when an element of my_cities is not a column of df?

1 answer:

Answer 0 (score: 4)

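You can use Index.intersection between the DataFrame's columns and the list, so that only the names present in both are selected: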

df_my_cities = df[df.columns.intersection(my_cities)]

Sample:

import pandas as pd

df = pd.DataFrame({'city1':['s', 'e'],
                   'city2':['e','f'],
                   'city3':['f','g'],
                   'city4':['r','g'],
                   'city5':['t','m']})

print(df)
  city1 city2 city3 city4 city5
0     s     e     f     r     t
1     e     f     g     g     m

my_cities = ["city1","city3","city10"]
df_my_cities = df[df.columns.intersection(my_cities)]
print(df_my_cities)
  city1 city3
0     s     f
1     e     g
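
If you also want to know which names were skipped, or to keep the columns in the order given in my_cities, a plain list comprehension is one option. This is a minimal sketch rather than part of the original answer; the variable names present and missing are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'city1': ['s', 'e'],
                   'city2': ['e', 'f'],
                   'city3': ['f', 'g'],
                   'city4': ['r', 'g'],
                   'city5': ['t', 'm']})

# Requested columns, deliberately not in the DataFrame's column order.
my_cities = ["city3", "city1", "city10"]

# 'present' and 'missing' are illustrative names, not pandas API.
# Split the request into names that exist in df and names that do not,
# keeping the order in which they appear in my_cities.
present = [c for c in my_cities if c in df.columns]
missing = [c for c in my_cities if c not in df.columns]

df_my_cities = df[present]
print(df_my_cities)
print("skipped:", missing)
```

Here the selection follows the order of my_cities, and missing can be logged or checked instead of being dropped silently.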

Alternatively, use numpy.intersect1d, which returns the common names as sorted unique values:

import numpy as np

df_my_cities = df[np.intersect1d(df.columns, my_cities)]
print(df_my_cities)
  city1 city3
0     s     f
1     e     g
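
Because numpy.intersect1d sorts its result, the selected columns can come back in a different order from the one in the DataFrame. The sketch below uses a small made-up frame (the column names b, a, c and the list wanted are assumptions chosen only to make the ordering visible), and also shows DataFrame.filter(items=...), which skips labels that are absent instead of raising.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns are not in alphabetical order.
df = pd.DataFrame({'b': [1, 2], 'a': [3, 4], 'c': [5, 6]})
wanted = ['c', 'b', 'z']  # 'z' is not a column of df

# intersect1d returns the common labels as sorted unique values.
common_sorted = np.intersect1d(df.columns, wanted)
via_numpy = df[common_sorted]

# filter(items=...) keeps only the labels that exist; no KeyError for 'z'.
via_filter = df.filter(items=wanted)

print(list(via_numpy.columns))
print(sorted(via_filter.columns))
```

If column order matters, check it after selecting rather than assuming it matches the original DataFrame.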