Setting values while iterating over a DataFrame

Time: 2015-07-21 17:27:16

Tags: python numpy pandas

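I have a dictionary of states (e.g. IA: Idaho). I have loaded the dictionary into a DataFrame, byState_df.

I then import a CSV with deaths by state, and as I read through its rows I want to add the values to byState_df: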

byState_df = pd.DataFrame(states.items())
byState_df['Deaths'] = 0
df['Deaths'] = df['Deaths'].convert_objects(convert_numeric=True)
print byState_df
for index, row in df.iterrows():
    if row['Area'] in states:
        byState_df[(byState_df[0] == row['Area'])]['Deaths'] = row['Deaths']

print byState_df

But the Deaths column of byState_df is still 0 afterwards:

      0                         1  Deaths
 0   WA                Washington       0
 1   WI                 Wisconsin       0
 2   WV             West Virginia       0
 3   FL                   Florida       0
 4   WY                   Wyoming       0
 5   NH             New Hampshire       0
 6   NJ                New Jersey       0
 7   NM                New Mexico       0
 8   NA                  National       0

I checked row['Deaths'] while iterating and it holds the correct values, so the problem seems to be in how the byState_df value is being set.

1 Answer:

Answer 0: (score: 1)

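You can try the following code, which uses .loc instead of chained [][] indexing. With byState_df[mask]['Deaths'] = value, the first [] returns a copy of the selected rows, so the assignment changes that copy and byState_df itself is never modified. .loc selects the rows and the column in a single operation on the original DataFrame.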

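# Build a two-column frame (column 0: abbreviation, column 1: name) from the dict.
byState_df = pd.DataFrame(states.items())
byState_df['Deaths'] = 0
# Coerce the CSV column to numbers. convert_objects was deprecated and later
# removed from pandas; pd.to_numeric(df['Deaths'], errors='coerce') replaces it.
df['Deaths'] = df['Deaths'].convert_objects(convert_numeric=True)
print byState_df
for index, row in df.iterrows():
    if row['Area'] in states:
        # One .loc call picks rows and column together, so the write
        # lands in byState_df rather than in a temporary copy.
        byState_df.loc[byState_df[0] == row['Area'], 'Deaths'] = row['Deaths']

print byState_df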
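
For readers on current pandas and Python 3, here is a minimal sketch of the same fix. The `states` entries and the rows of `df` are made-up stand-ins for the asker's dictionary and CSV, and the names `by_state`, `deaths_by_area` and `by_state_mapped` are hypothetical. It also shows one possible loop-free variant using `Series.map`, which assumes each `Area` appears only once in the CSV.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dict and CSV contents.
states = {"WA": "Washington", "WI": "Wisconsin", "FL": "Florida"}
df = pd.DataFrame({"Area": ["WA", "FL", "TX"], "Deaths": ["10", "25", "7"]})

# pd.to_numeric takes the place of convert_objects(convert_numeric=True).
df["Deaths"] = pd.to_numeric(df["Deaths"], errors="coerce")

# Columns are labelled 0 (abbreviation) and 1 (name), as in the question.
by_state = pd.DataFrame(list(states.items()))
by_state["Deaths"] = 0

# Row mask and column label go into a single .loc call, so the assignment
# writes into by_state itself instead of a temporary copy.
for _, row in df.iterrows():
    if row["Area"] in states:
        by_state.loc[by_state[0] == row["Area"], "Deaths"] = row["Deaths"]

# Loop-free variant (assumes unique Area values): look up each abbreviation
# in a Series indexed by Area, and use 0 where the CSV has no matching row.
deaths_by_area = df.set_index("Area")["Deaths"]
by_state_mapped = pd.DataFrame(list(states.items()))
by_state_mapped["Deaths"] = (
    by_state_mapped[0].map(deaths_by_area).fillna(0).astype(int)
)

print(by_state)
print(by_state_mapped)
```

Both frames should end up with the same Deaths column. The `map` variant avoids `iterrows` entirely, which tends to matter once the CSV grows beyond a few thousand rows.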