How to use fillna() with numpy.where() for a column in a pandas DataFrame?

Asked: 2017-12-27 02:55:35

Tags: python pandas numpy dataframe fillna

Here is an example pandas DataFrame:

import pandas as pd
import numpy as np

dict1 = {'file': ['filename2', 'filename2', 'filename3', 'filename4', 'filename4', 'filename3'],
         'amount': [3, 4, 5, 1, 2, 1],
         'front': [21889611, 36357723, 196312, 11, 42, 1992],
         'back': [21973805, 36403870, 277500, 19, 120, 3210],
         'type': ['A', 'A', 'A', 'B', 'B', 'C']}
df1 = pd.DataFrame(dict1)
print(df1)
        file  amount     front      back type
0  filename2       3  21889611  21973805    A
1  filename2       4  36357723  36403870    A
2  filename3       5    196312    277500    A
3  filename4       1        11        19    B
4  filename4       2        42       120    B
5  filename3       1      1992      3210    C

I define a new column end using numpy.where():

df1['end'] = np.where(df1['type']=='B', df1['front'], df1['front'] + df1['back'])
print(df1)
   amount      back       file     front type       end
0       3  21973805  filename2  21889611    A  43863416
1       4  36403870  filename2  36357723    A  72761593
2       5    277500  filename3    196312    A    473812
3       1        19  filename4        11    B        11
4       2       120  filename4        42    B        42
5       1      3210  filename3      1992    C      5202

If the end column already partially exists, I would like to use the same approach to fill in the NaN values. For example, here is a DataFrame new_df where end exists as a column but contains many NaN values. (Edit: the non-NA values could be entirely unique.)

new_df
   amount      back       file     front type  end
0       3  21973805  filename2  21889611    A  NaN
1       4  36403870  filename2  36357723    A  NaN
2       5    277500  filename3    196312    A   12
3       1        19  filename4        11    B  NaN
4       2       120  filename4        42    B   49
5       1      3210  filename3      1992    C  NaN

I thought pandas.DataFrame.fillna() could do this, but it raises an error:

df1['end'].fillna(np.where(df1['type']=='B', df1['front'], df1['front'] + df1['back']), inplace=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pandas/core/series.py", line 2434, in fillna
    **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 3631, in fillna
    type(value))
ValueError: invalid fill value with a <class 'numpy.ndarray'>

Question: how can I effectively apply np.where() only to the NaN values in a specific column?

1 Answer:

Answer 0: (score: 1)

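fillna aligns on the index, so store the np.where() result as a column and pass that column (a Series) as the fill value. Here df is the DataFrame whose end column is only partially filled: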

# Compute the candidate values for every row and keep them in a helper column of df itself.
df['New'] = np.where(df['type'] == 'B', df['front'], df['front'] + df['back'])
df
Out[125]: 
   amount      back       file     front type       end       New
0       3  21973805  filename2  21889611    A       NaN  43863416
1       4  36403870  filename2  36357723    A       NaN  72761593
2       5    277500  filename3    196312    A  473812.0    473812
3       1        19  filename4        11    B       NaN        11
4       2       120  filename4        42    B      42.0        42
5       1      3210  filename3      1992    C       NaN      5202
df.end.fillna(df.New)
Out[126]: 
0    43863416.0
1    72761593.0
2      473812.0
3          11.0
4          42.0
5        5202.0
Name: end, dtype: float64
df.end=df.end.fillna(df.New)
df
Out[128]: 
   amount      back       file     front type         end       New
0       3  21973805  filename2  21889611    A  43863416.0  43863416
1       4  36403870  filename2  36357723    A  72761593.0  72761593
2       5    277500  filename3    196312    A    473812.0    473812
3       1        19  filename4        11    B        11.0        11
4       2       120  filename4        42    B        42.0        42
5       1      3210  filename3      1992    C      5202.0      5202
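
To illustrate what "aligns on the index" means, here is a minimal sketch on toy data. The Series s and fill and their labels are made up for this example and do not come from the question. The fill values are deliberately given in a different order, and fillna still matches them by label rather than by position:

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical, not from the question) with a non-default index.
s = pd.Series([np.nan, 2.0, np.nan], index=['a', 'b', 'c'])

# Fill values listed in a different order than s.
fill = pd.Series([30.0, 10.0, 20.0], index=['c', 'a', 'b'])

# Each NaN in s is replaced by the value in fill that has the same label;
# the existing value at 'b' is left untouched.
filled = s.fillna(fill)
print(filled.tolist())  # [10.0, 2.0, 30.0]
```

This is why a DataFrame column works as a fill value: it carries the index that fillna uses to line the values up.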

Update (values already present in end are kept; only the NaN rows are filled):

# Same helper column as above, computed from df's own columns.
df['New'] = np.where(df['type'] == 'B', df['front'], df['front'] + df['back'])
df.end=df.end.fillna(df.New)
df
Out[133]: 
   amount      back       file     front type         end       New
0       3  21973805  filename2  21889611    A  43863416.0  43863416
1       4  36403870  filename2  36357723    A  72761593.0  72761593
2       5    277500  filename3    196312    A        12.0    473812
3       1        19  filename4        11    B        11.0        11
4       2       120  filename4        42    B        49.0        42
5       1      3210  filename3      1992    C      5202.0      5202
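
If the helper column New is not wanted, one possible variant is to wrap the np.where() result in a Series that shares the DataFrame's index and pass that to fillna directly. The sketch below rebuilds new_df from the question's data. The variable name computed is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# new_df from the question: 'end' is only partly filled in.
new_df = pd.DataFrame({
    'file': ['filename2', 'filename2', 'filename3',
             'filename4', 'filename4', 'filename3'],
    'amount': [3, 4, 5, 1, 2, 1],
    'front': [21889611, 36357723, 196312, 11, 42, 1992],
    'back': [21973805, 36403870, 277500, 19, 120, 3210],
    'type': ['A', 'A', 'A', 'B', 'B', 'C'],
    'end': [np.nan, np.nan, 12, np.nan, 49, np.nan],
})

# np.where returns a plain ndarray, so wrap it in a Series with
# new_df's index before handing it to fillna.
computed = pd.Series(
    np.where(new_df['type'] == 'B',
             new_df['front'],
             new_df['front'] + new_df['back']),
    index=new_df.index,
)

# Only the NaN rows of 'end' take values from computed.
new_df['end'] = new_df['end'].fillna(computed)
print(new_df['end'].tolist())
# [43863416.0, 72761593.0, 12.0, 11.0, 49.0, 5202.0]
```

The existing values 12 and 49 survive, and the column is float64 because it held NaN before the fill.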