I have a dataframe as shown below:
df =
value
2014-05-21 10:00:00 0.0
2014-05-21 11:00:00 3.4
2014-05-21 12:00:00 nan
2014-05-21 13:00:00 0.0
2014-05-21 14:00:00 nan
2014-05-21 15:00:00 1.0
..............
I would like to add two columns.
The first one, named "active", should be 1 if df.value >= 0 and 0 if df.value is nan. The second one, named "unactive", should be 0 if df.value >= 0 and -1 if df.value is nan. The new dataframe would look like this:
df_new =
value active unactive
2014-05-21 10:00:00 0.0 1 0
2014-05-21 11:00:00 3.4 1 0
2014-05-21 12:00:00 nan 0 -1
2014-05-21 13:00:00 0.0 1 0
2014-05-21 14:00:00 nan 0 -1
2014-05-21 15:00:00 1.0 1 0
............
I tried using a for loop, but it takes too much time when the time series is long. Does anyone know a better way to do it? Thanks in advance!
Answer 0 (score: 2)
You can compare with df.value >= 0 and convert the boolean result with astype(int):
In [44]: df['active'], df['inactive'] = (df.value >= 0).astype(int), -(~(df.value >= 0)).astype(int)
In [45]: df
Out[45]:
value active inactive
2014-05-21 10:00:00 0.0 1 0
11:00:00 3.4 1 0
12:00:00 NaN 0 -1
13:00:00 0.0 1 0
14:00:00 NaN 0 -1
15:00:00 1.0 1 0
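For anyone who wants to reproduce this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand from the six rows shown in the question, and the name `is_valid` is only an illustrative helper, not something from the original answer. It relies on the fact that comparing NaN with `>= 0` gives False.

```python
import numpy as np
import pandas as pd

# Rebuild the six sample rows from the question (assumed hourly timestamps).
idx = pd.to_datetime([
    "2014-05-21 10:00:00",
    "2014-05-21 11:00:00",
    "2014-05-21 12:00:00",
    "2014-05-21 13:00:00",
    "2014-05-21 14:00:00",
    "2014-05-21 15:00:00",
])
df = pd.DataFrame({"value": [0.0, 3.4, np.nan, 0.0, np.nan, 1.0]}, index=idx)

# NaN >= 0 evaluates to False, so NaN rows fall into the "not valid" group.
is_valid = df["value"] >= 0

df["active"] = is_valid.astype(int)         # 1 where value >= 0, else 0
df["inactive"] = -(~is_valid).astype(int)   # 0 where value >= 0, else -1

print(df)
```

Both columns are computed on the whole column at once, which is what avoids the slow row-by-row loop.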
Answer 1 (score: 1)
df['active'] = df['value'].notnull().astype(int)
and
df['unactive'] = -df['value'].isnull().astype(int)
(Also, you haven't specified what 'active' should be when df.value < 0 and is not nan. Should 'active' be 1, -1, or don't care?)
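To make the open question about negative values concrete, here is a small sketch comparing the two approaches on a series that contains a negative number. The series `s` and the variable names are made up for illustration; the question's data has no negative values, so which behaviour is wanted is still an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one negative value and one NaN.
s = pd.Series([0.0, -2.0, np.nan, 1.0])

# Comparison approach: negative values are treated like NaN (active = 0).
by_comparison = (s >= 0).astype(int)

# notnull approach: any non-NaN value, including negatives, is active = 1.
by_notnull = s.notnull().astype(int)

print(by_comparison.tolist())  # [1, 0, 0, 1]
print(by_notnull.tolist())     # [1, 1, 0, 1]
```

The two only disagree on the negative entry, so if the real data can never go below zero either version should give the same columns.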