If a column's value is NaT, update it by subtracting two dates

Time: 2017-11-29 10:36:32

Tags: python pandas numpy

I have a dataframe that looks like this:

[image: screenshot of the dataframe]

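What I want to do is check, using numpy and pandas, whether days_diff is NaT and, if it is, update it by subtracting "2016-01-01" from outofservicedatetime. After running the following code: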

df[['days_diff']] = np.where(pd.isnull(df[['days_diff']]), df[['outofservicedatetime']] - np.datetime64('2016-01-01'), df[['days_diff']])

the output I get looks like this:

[image: screenshot of the resulting output]

How can I get the days_diff values in days? Alternatively, if anyone can suggest an easier way to achieve this, that would be just as helpful.

1 answer:

Answer 0: (score: 0)

By using pd.Series.fillna, you can get more than a twofold speedup over df.loc[df['days_diff'].isnull()..., optionally passing the parameter inplace=True to replicate the in-place behavior of df.loc.
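A minimal sketch of the fillna approach mentioned in the answer, assuming days_diff is a timedelta column and outofservicedatetime is a datetime column. The sample rows and the days_diff_days column name are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "outofservicedatetime": pd.to_datetime(
        ["2016-01-11 00:00:00", "2016-03-01 12:00:00", "2016-01-02 00:00:00"]
    ),
    "days_diff": pd.to_timedelta(["5 days", pd.NaT, pd.NaT]),
})

# Difference to use wherever days_diff is missing (NaT).
fallback = df["outofservicedatetime"] - pd.Timestamp("2016-01-01")

# fillna aligns on the index and fills only the NaT entries,
# so existing values and the timedelta dtype are preserved.
df["days_diff"] = df["days_diff"].fillna(fallback)

# Comparable assignment with df.loc, shown for reference only:
# mask = df["days_diff"].isnull()
# df.loc[mask, "days_diff"] = fallback[mask]

# Whole days as integers (any fraction of a day is dropped).
# "days_diff_days" is an illustrative column name.
df["days_diff_days"] = df["days_diff"].dt.days

print(df)
```

Keeping the computation inside pandas leaves days_diff as a timedelta column, and the .dt.days accessor then gives the day count the question asks for.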