For a DataFrame df, I'm trying to fill column b with the value 2017-01-01 wherever the value in column a is neither NaN nor Others.
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': ['Coffee', 'Muffin', 'Donut', 'Others', np.nan, np.nan]})
a
0 Coffee
1 Muffin
2 Donut
3 Others
4 NaN
5 NaN
My expected result looks like this:
a b
0 Coffee 2017-01-01
1 Muffin 2017-01-01
2 Donut 2017-01-01
3 Others NaN
4 NaN NaN
5 NaN NaN
What I have tried, which does not exclude the NaNs:
df.loc[~df['a'].isin(['nan', 'Others']), 'b'] = '2017-01-01'
a b
0 Coffee 2017-01-01
1 Muffin 2017-01-01
2 Donut 2017-01-01
3 Others NaN
4 NaN 2017-01-01
5 NaN 2017-01-01
Thanks!
Answer 0: (score: 2)
Use np.nan instead of the string 'nan':
df.loc[~df['a'].isin([np.nan, 'Others']), 'b'] = '2017-01-01'
Or replace the missing values with Others before comparing:
df.loc[~df['a'].fillna('Others').eq('Others'), 'b'] = '2017-01-01'
print (df)
a b
0 Coffee 2017-01-01
1 Muffin 2017-01-01
2 Donut 2017-01-01
3 Others NaN
4 NaN NaN
5 NaN NaN
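To see why the original attempt missed the NaN rows, here is a minimal sketch on a small stand-in Series (the name s and its values are just for illustration). The string 'nan' is an ordinary string and never equals a missing value, whereas np.nan in the list passed to isin is matched against missing entries.

```python
import numpy as np
import pandas as pd

# Small stand-in for column a: one normal value, 'Others', and one missing value.
s = pd.Series(['Coffee', 'Others', np.nan])

# The string 'nan' is an ordinary string, so it does not match the missing value.
matches_string = s.isin(['nan', 'Others'])

# np.nan in the lookup list is matched against missing values by isin.
matches_nan = s.isin([np.nan, 'Others'])

print(matches_string.tolist())  # [False, True, False]
print(matches_nan.tolist())     # [False, True, True]
```

Negating the second mask with ~ therefore selects only the rows that should receive the date.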
Answer 1: (score: 1)
Check this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': ['Coffee', 'Muffin', 'Donut', 'Others', np.nan, np.nan]})
conditions = [
(df['a'] == 'Others'),
(df['a'].isnull())
]
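choices = [np.nan, np.nan]
# An object-dtype default lets NaN and the date string coexist in one result array;
# mixing a float NaN with a plain str default has no clean common dtype.
df['b'] = np.select(conditions, choices, default=np.array('2017-01-01', dtype=object))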
print(df)
Answer 2: (score: 0)
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': ['Coffee', 'Muffin', 'Donut', 'Others', np.nan, np.nan]})
df.loc[df['a'].replace('Others',np.nan).notnull(),'b'] = '2017-01-01'
print(df)
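As a quick sanity check, the row masks used in the answers above can be compared side by side. This is only a sketch on the question's sample data; the names mask_isin, mask_fillna, mask_replace and mask_explicit are made up here, and mask_explicit is an extra spelled-out variant that does not come from any of the answers.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['Coffee', 'Muffin', 'Donut', 'Others', np.nan, np.nan]})

# Mask from the isin approach: exclude NaN and 'Others'.
mask_isin = ~df['a'].isin([np.nan, 'Others'])

# Mask from the fillna approach: treat NaN as 'Others', then exclude 'Others'.
mask_fillna = ~df['a'].fillna('Others').eq('Others')

# Mask from the replace approach: treat 'Others' as NaN, then keep non-null rows.
mask_replace = df['a'].replace('Others', np.nan).notnull()

# Spelled-out variant (not from the answers): not missing and not 'Others'.
mask_explicit = df['a'].notna() & df['a'].ne('Others')

# Apply one of the masks; rows outside the mask are left as NaN in the new column.
df.loc[mask_isin, 'b'] = '2017-01-01'
print(df)
```

On this data all four masks select the same rows (0, 1 and 2), so which one to use is mostly a matter of readability.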