Replacing iteration with df.where?

Time: 2018-03-22 15:59:33

Tags: python pandas dataframe

Hello, I am using an iteration to multiply the values of one column by a certain multiplier if they match a specific value in another column. For this I already have a normal iteration:

for index, row in street_cal.iterrows():
    street_cal.loc[street_cal['street_typ'] == 'motorway', 'v_length'] = street_cal['cal_length'] * 130
    street_cal.loc[street_cal['street_typ'] == 'motorway_link', 'v_length'] = street_cal['cal_length'] * 130
    street_cal.loc[street_cal['street_typ'] == 'trunk', 'v_length'] = street_cal['cal_length'] * 80
    street_cal.loc[street_cal['street_typ'] == 'trunk_link', 'v_length'] = street_cal['cal_length'] * 80
    street_cal.loc[street_cal['street_typ'] == 'primary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'primary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'secondary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'secondary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'tertiary', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'tertiary_link', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'road', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'unclassified', 'v_length'] = street_cal['cal_length'] * 50
    street_cal.loc[street_cal['street_typ'] == 'residential', 'v_length'] = street_cal['cal_length'] * 30
    street_cal.loc[street_cal['street_typ'] == 'living_street', 'v_length'] = street_cal['cal_length'] * 15

Unfortunately this iteration takes quite a long time, so I tried to come up with another way to do it, and I found df.where.

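Quoted from https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.where.html

"Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other. [...]

other : scalar, NDFrame, or callable

Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the NDFrame and should return scalar or NDFrame. The callable must not change input NDFrame (though pandas doesn't check it).

New in version 0.18.1: A callable can be used as other."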
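As a small illustration of the quoted behaviour, here is a minimal sketch on a made-up Series (the data and the names s, kept_small and scaled are hypothetical, not from the question). It only shows the rule "keep where cond is True, take other where cond is False".

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
s = pd.Series([1, 2, 3, 4])

# Entries where the condition is True are kept;
# entries where it is False are replaced by `other`.
kept_small = s.where(s < 3, other=0)

# `other` may also be a Series aligned on the same index.
scaled = s.where(s < 3, other=s * 10)
```

Note that the condition names the rows to keep, so replacing rows that match a value means testing with != rather than ==.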

Based on this I thought I could use df.where like this to do the same operation as above:

street_cal['v_length'] = None
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'motorway', (street_cal['cal_length'] * v_mot), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'motorway_link', (street_cal['cal_length'] * v_mot), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'trunk', (street_cal['cal_length'] * v_tru), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'trunk_link', (street_cal['cal_length'] * v_tru), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'primary', (street_cal['cal_length'] * v_pri), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'primary_link', (street_cal['cal_length'] * v_pri), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'secondary', (street_cal['cal_length'] * v_sec), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'secondary_link', (street_cal['cal_length'] * v_sec), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'tertiary', (street_cal['cal_length'] * v_ter), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'tertiary_link', (street_cal['cal_length'] * v_ter), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'road', (street_cal['cal_length'] * v_roa), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'unclassified', (street_cal['cal_length'] * v_unc), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'residential', (street_cal['cal_length'] * v_res), axis='index')
street_cal['v_length'] = street_cal.where(street_cal['street_typ'] != 'living_street', (street_cal['cal_length'] * v_liv), axis='index')

However, when I do it this way only 'living_street' comes out right; all the others contain numbers that are far too high in the 'v_length' column. I guess the values of the others get multiplied more than once, which is why they are so high. But I don't understand why. In this case df.where checks whether the 'street_typ' column does not contain e.g. 'motorway', so the rows that have 'motorway' in the 'street_typ' column should get the other value written into them, in this case (street_cal['cal_length'] * v_mot), right? I think I am a bit confused about how df.where works.

1 answer:

Answer 0: (score: 3)

Here is another suggestion: create a scaling map and apply it with pd.Series.map / replace

scaler = { 'motorway' : 130, 'motorway_link' : 130, ... }    
street_cal['v_length'] = (
      street_cal['cal_length'] * street_cal['street_typ'].map(scaler).fillna(1)
)
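
For reference, a runnable sketch of the map approach above. The sample rows are hypothetical (including the 'footway' type, which is deliberately absent from the map); the multipliers are the ones from the .loc assignments in the question, and the intermediate name factor is only introduced here for readability.

```python
import pandas as pd

# Hypothetical sample rows; the real street_cal is assumed to have
# the same two columns but many more rows.
street_cal = pd.DataFrame({
    'street_typ': ['motorway', 'trunk', 'residential', 'living_street', 'footway'],
    'cal_length': [2.0, 1.0, 3.0, 4.0, 5.0],
})

# Multipliers copied from the question's .loc assignments.
scaler = {
    'motorway': 130, 'motorway_link': 130,
    'trunk': 80, 'trunk_link': 80,
    'primary': 50, 'primary_link': 50,
    'secondary': 50, 'secondary_link': 50,
    'tertiary': 50, 'tertiary_link': 50,
    'road': 50, 'unclassified': 50,
    'residential': 30,
    'living_street': 15,
}

# Look up one multiplier per row; types missing from the map become NaN,
# which fillna(1) turns into a neutral factor.
factor = street_cal['street_typ'].map(scaler).fillna(1)

# A single vectorized multiplication replaces the row-by-row loop.
street_cal['v_length'] = street_cal['cal_length'] * factor
```

With fillna(1), street types missing from the map keep their cal_length unchanged in v_length. Dropping fillna(1) leaves NaN for those rows instead, which may be closer to the original loop, where such rows were never assigned.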