Conditionally changing cell values in a pandas DataFrame

Time: 2018-05-13 14:03:42

Tags: python pandas dataframe

I want to replace the NaN values in open, high and low with the value of close. However, this should only be applied when change is 0.00.

Here is my code:

try:
    url = 'https://api.iextrading.com/1.0/stock/AAME/chart/1y'
    q_data = pd.read_json(url)
    if q_data.change == 0.00:
        q_data.open = q_data.close
        q_data.high = q_data.close
        q_data.low = q_data.close
except Exception:
    print "No data"
    continue

The problem is that the try block is bypassed and execution goes to the except block. How can I change the data correctly?

1 Answer:

Answer 0: (score: 2)

I suggest a non-loop solution: use mask with a chained boolean mask built with numpy broadcasting:

import numpy as np
import pandas as pd

df = pd.DataFrame({'close':[100] * 6,
                   'open':[4,5,4,5,np.nan,4],
                   'high':[np.nan,8,9,4,2,3],
                   'low':[1,3,5,7,np.nan,np.nan],
                   'change':[0,3,6,9,0,4],
                   'col':[np.nan]*6})

print (df)
   change  close  col  high  low  open
0       0    100  NaN   NaN  1.0   4.0
1       3    100  NaN   8.0  3.0   5.0
2       6    100  NaN   9.0  5.0   4.0
3       9    100  NaN   4.0  7.0   5.0
4       0    100  NaN   2.0  NaN   NaN
5       4    100  NaN   3.0  NaN   4.0

cols = ['open', 'high', 'low']
m  =  df[cols].isnull().values & (df['change'] == 0).values[:, None]

df[cols] = df[cols].mask(m, df['close'], axis=0)
#numpy alternative
#df[cols] = np.where(m, df['close'].values[:, None], df[cols])

print (df)
   change  close  col   high    low   open
0       0    100  NaN  100.0    1.0    4.0
1       3    100  NaN    8.0    3.0    5.0
2       6    100  NaN    9.0    5.0    4.0
3       9    100  NaN    4.0    7.0    5.0
4       0    100  NaN    2.0  100.0  100.0
5       4    100  NaN    3.0    NaN    4.0
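
A side note on the question's code: `q_data.change == 0.00` produces a boolean Series, and using a Series as an `if` condition raises `ValueError` because its truth value is ambiguous. That is a likely reason control jumped to the `except` branch, though the thread does not confirm it. The sketch below reproduces that error and wraps the masking approach in a helper. The name `fill_from_close` and the small sample frame are hypothetical stand-ins for the downloaded data.

```python
import numpy as np
import pandas as pd


def fill_from_close(df, cols=("open", "high", "low")):
    """Return a copy where NaNs in `cols` become `close` on rows with change == 0."""
    cols = list(cols)
    out = df.copy()
    # (n, k) NaN mask combined with an (n, 1) row condition via broadcasting
    m = out[cols].isna().to_numpy() & (out["change"] == 0).to_numpy()[:, None]
    out[cols] = out[cols].mask(m, out["close"], axis=0)
    return out


# Hypothetical sample standing in for the frame read from the URL
q_data = pd.DataFrame({"close": [10.0, 11.0, 12.0],
                       "open": [np.nan, 10.5, np.nan],
                       "high": [np.nan, 11.5, 12.5],
                       "low": [np.nan, 10.0, 11.5],
                       "change": [0.0, 1.0, 1.0]})

error_message = None
try:
    if q_data["change"] == 0.00:   # a Series has no single truth value
        pass
except ValueError as exc:
    error_message = str(exc)

fixed = fill_from_close(q_data)
```

Catching a bare `Exception` and printing "No data" hides this kind of error, so narrowing the `except` clause to the exceptions expected from the download makes the real cause visible.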

Explanation

The problem is that chaining a boolean DataFrame with a boolean Series raises an error:

m  =  df[cols].isnull() & (df['change'] == 0)

ValueError: operands could not be broadcast together with shapes (18,) (3,) 

The solution is numpy broadcasting:

print (df[cols].isnull().values)
[[False  True False]
 [False False False]
 [False False False]
 [False False False]
 [ True False  True]
 [False False  True]]

print ((df['change'] == 0).values)
[ True False False False  True False]

So it is necessary to create an N x 1 array:

print ((df['change'] == 0).values[:, None])
[[ True]
 [False]
 [False]
 [False]
 [ True]
 [False]]

m  =  df[cols].isnull().values & (df['change'] == 0).values[:, None]
print (m)
[[False  True False]
 [False False False]
 [False False False]
 [False False False]
 [ True False  True]
 [False False False]]
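
The shape rule behind the steps above can be checked with plain numpy. This is a minimal sketch on a made-up 2 x 3 mask; the names `nan_mask` and `row_cond` are illustrative and not part of the answer.

```python
import numpy as np

nan_mask = np.array([[False, True, False],
                     [True, False, True]])   # shape (2, 3): rows x columns
row_cond = np.array([True, False])           # shape (2,): one flag per row

try:
    nan_mask & row_cond                      # (2, 3) vs (2,): trailing sizes 3 and 2 differ
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Adding a trailing axis gives shape (2, 1), which stretches across the 3 columns
combined = nan_mask & row_cond[:, None]
print(combined.shape)
```

If a frame happened to have as many rows as selected columns, the one-dimensional version would broadcast without an error but along the wrong axis, so the explicit `[:, None]` is the safer habit.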