Create new dataframe column with 0 and 1 values according to given series

Time: 2018-04-20 21:28:54

Tags: python pandas dataframe boolean nan

I have a dataframe as shown below:

df = 
                     value
2014-05-21 10:00:00    0.0
2014-05-21 11:00:00    3.4
2014-05-21 12:00:00    nan
2014-05-21 13:00:00    0.0
2014-05-21 14:00:00    nan
2014-05-21 15:00:00    1.0
..............

I would like to add two columns.

The first one, named "active", should be 1 if df.value >= 0 and 0 if df.value is NaN. The second one, named "unactive", should be 0 if df.value >= 0 and -1 if df.value is NaN. The new dataframe would look like this:

df_new = 
                     value   active  unactive
2014-05-21 10:00:00    0.0        1         0
2014-05-21 11:00:00    3.4        1         0
2014-05-21 12:00:00    nan        0        -1
2014-05-21 13:00:00    0.0        1         0
2014-05-21 14:00:00    nan        0        -1
2014-05-21 15:00:00    1.0        1         0
............

I tried using a for loop, but it takes too much time when the time series is long. Does anyone know a better way to do it? Thanks in advance!

2 answers:

Answer 0 (score: 2)

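You can compare with df.value >= 0 and convert the boolean result with astype(int). A comparison against NaN is always False, so the NaN rows become 0.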

In [44]: df['active'], df['inactive'] = (df.value >= 0).astype(int), -(~(df.value >= 0)).astype(int)

In [45]: df
Out[45]:
                     value  active  inactive
2014-05-21 10:00:00    0.0       1         0
           11:00:00    3.4       1         0
           12:00:00    NaN       0        -1
           13:00:00    0.0       1         0
           14:00:00    NaN       0        -1
           15:00:00    1.0       1         0
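
For readers who want to reproduce this outside an IPython session, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the question's data, and the intermediate name is_active is an assumption introduced only for readability; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question.
index = pd.to_datetime([
    "2014-05-21 10:00:00",
    "2014-05-21 11:00:00",
    "2014-05-21 12:00:00",
    "2014-05-21 13:00:00",
    "2014-05-21 14:00:00",
    "2014-05-21 15:00:00",
])
df = pd.DataFrame({"value": [0.0, 3.4, np.nan, 0.0, np.nan, 1.0]}, index=index)

# NaN >= 0 evaluates to False, so NaN rows are flagged as not active.
is_active = df["value"] >= 0  # hypothetical helper name

df["active"] = is_active.astype(int)        # True -> 1, False -> 0
df["inactive"] = -(~is_active).astype(int)  # True -> 0, False -> -1

print(df)
```

Both columns are computed on the whole Series at once, which is what avoids the slow row-by-row loop mentioned in the question.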

Answer 1 (score: 1)

df['active'] = df['value'].notnull().astype(int)

df['unactive'] = -df['value'].isnull().astype(int)

(Also, you didn't specify what 'active' should be when df.value < 0 and is not NaN. Should 'active' be 1, -1, or don't you care?)
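
The two answers only disagree on negative, non-NaN values, which the question leaves unspecified. A small sketch with a made-up Series (the values and variable names below are illustrative assumptions, not data from the question) shows the difference:

```python
import numpy as np
import pandas as pd

# Illustrative values: one negative, one zero, one missing.
s = pd.Series([-2.0, 0.0, np.nan])

# Comparison-based flag (Answer 0): negatives count as not active.
by_comparison = (s >= 0).astype(int)

# Null-check flag (Answer 1): any non-missing value counts as active.
by_notnull = s.notnull().astype(int)

print(pd.DataFrame({"value": s, "by_comparison": by_comparison, "by_notnull": by_notnull}))
```

If the series can never contain negative values, the two approaches give the same result, and the choice is mostly a matter of which one states the intent more clearly.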
