How do I replace empty cells in a Python pandas DataFrame with values derived from another column?

Asked: 2017-04-20 09:58:21

Tags: python pandas dataframe where

The operation I want is similar to this MySQL statement:

   UPDATE a.tract_201704 SET val_2000=0.91516427*val_2001 WHERE val_2001 IS NOT NULL AND val_2000 IS NULL.

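I have a df with many columns, one of which is named val_2000. Wherever this column is null, I want to replace the value with 0.91516427 * val_2001 (a scalar multiple of the neighbouring cell in the same row).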

What I have so far (val_2000 is either 100 or empty):

    df = pd.read_csv("singleDataFile_header.csv")

    df_val2001_null = (df[df['val_2000'] != '100.000000000000']['val_2001'])
    df_val2000_null = (df[df['val_2000'] != '100.000000000000']['val_2000'])
    df_val2000_null = 0.91516427*df_val2001_null

But how do I write the values in df_val2000_null back into the original df for the rows where df['val_2000'] has no value?

2 answers:

Answer 0: (score: 2)

fillna is what you are looking for: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.fillna.html

df.loc[:, 'val_2000'] = df.val_2000.fillna(0.91516427 * df.val_2001)
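
As a minimal, self-contained sketch of the line above: the sample frame here is hypothetical (only the column names and the factor come from the question). Because fillna aligns the fill values on the index, each missing val_2000 takes the scaled val_2001 from its own row, and a row where val_2001 is also missing simply stays NaN, which matches the `val_2001 IS NOT NULL` condition in the SQL.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "val_2000": [np.nan, 100.0, np.nan],
    "val_2001": [4.0, 5.0, np.nan],
})

# The fill values are a Series, so they are matched to val_2000 row by row.
# Existing values (row 1) are left alone; row 2 stays NaN because its
# val_2001 is missing too.
df["val_2000"] = df["val_2000"].fillna(0.91516427 * df["val_2001"])
print(df)
```
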

Answer 1: (score: 1)

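You can use combine_first, which keeps the existing values and takes the value from the other Series only where the original is NaN: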

import numpy as np
import pandas as pd

df = pd.DataFrame({'val_2000':[np.nan,2,3],
                   'val_2001':[4,5,6]})

print (df)
   val_2000  val_2001
0       NaN         4
1       2.0         5
2       3.0         6

df['val_2000'] = df['val_2000'].combine_first(0.91516427 * df['val_2001'])
print (df)
   val_2000  val_2001
0  3.660657         4
1  2.000000         5
2  3.000000         6

Edit:

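The problem may be that the nan values are strings rather than real NaN, or that the column contains other invalid strings. In that case, convert the column with pd.to_numeric and errors='coerce' first, which turns anything that cannot be parsed into NaN.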

df = pd.DataFrame({'val_2000':['nan',100,'gggg'],
                   'val_2001':[1,1,1]})

print (df)
  val_2000  val_2001
0      nan         1
1      100         1
2     gggg         1

df['val_2000'] = pd.to_numeric(df['val_2000'], errors='coerce')
print (df)
   val_2000  val_2001
0       NaN         1
1     100.0         1
2       NaN         1

df['val_2000'] = df['val_2000'].combine_first(0.91516427 * df['val_2001'])
print (df)
     val_2000  val_2001
0    0.915164         1
1  100.000000         1
2    0.915164         1

If the only non-numeric values are the string 'nan', casting with astype(float) is enough:

df = pd.DataFrame({'val_2000':['nan',100,100],
                   'val_2001':[1,1,1]})

print (df)
  val_2000  val_2001
0      nan         1
1      100         1
2      100         1

df['val_2000'] = df['val_2000'].astype(float)
print (df)
   val_2000  val_2001
0       NaN         1
1     100.0         1
2     100.0         1
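
For comparison, here is a sketch that mirrors the SQL statement from the question more literally, using an explicit boolean mask with .loc instead of fillna or combine_first. The sample values are hypothetical, and the assumption is that val_2000 was read as strings (as the comparison with '100.000000000000' in the question suggests), so it is converted to numbers first.

```python
import pandas as pd

# Hypothetical sample data: val_2000 holds strings, as it might after read_csv.
df = pd.DataFrame({
    "val_2000": ["nan", "100.000000000000", "gggg"],
    "val_2001": [2.0, 3.0, None],
})

# Turn the column into floats; anything unparseable becomes NaN.
df["val_2000"] = pd.to_numeric(df["val_2000"], errors="coerce")

# WHERE val_2001 IS NOT NULL AND val_2000 IS NULL
mask = df["val_2000"].isna() & df["val_2001"].notna()

# SET val_2000 = 0.91516427 * val_2001, only on the selected rows
df.loc[mask, "val_2000"] = 0.91516427 * df.loc[mask, "val_2001"]
print(df)
```

The mask form is more verbose than fillna, but it makes the update condition explicit and is easy to extend with further conditions.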