I have a joined data frame that looks like this:
DF <- structure(list(OpenUser = c(11111, 11111, 11111, 11111, 11111,
11111), OpenFirstName = c("Sigal", "Sigal", "Sigal", "Sigal",
"Sigal", "Sigal"), OpenLastName = c("segal", "segal", "segal",
"segal", "segal", "segal"), CRMEventStartDate = structure(c(1430524800,
1430524800, 1435881600, 1435881600, 1425168000, 1425168000), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), CustomerID = c(7033, 7033, 7033, 7033,
9040, 9040), Application = c("Incoming Call", "Incoming Call",
"Incoming Call", "Incoming Call", "Incoming Call", "Incoming Call"
), CustomerType = c("Private", "Private", "Private", "Private",
"Private", "Private"), CampaignStrategyID = c(121212, 512345,
121212, 512345, 512345, 516345), ResponseDate = structure(c(1435881600,
1430524800, 1435881600, 1430524800, 1425168000, 1430870400), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), ResponseCode = c(3, 1, 3, 1, 3, 1),
days = c(62, 0, 0, -62, 0, 66)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -6L))
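This data frame has two problems:
1) The time difference between two identical dates returns 0, and I need it to return 1.
2) This is a joined data frame. Somehow my join returned unwanted rows in which "CRMEventStartDate" is after "ResponseDate". The response date should always be on the same day or later, never earlier. Why does this happen, and how can I prevent it?
The two data frames that were joined are: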
Calls <- structure(list(OpenUser = c(11111, 11111, 11111, 11111, 11111,
11111), OpenFirstName = c("Sigal", "Sigal", "Sigal", "Sigal",
"Sigal", "Sigal"), OpenLastName = c("segal", "segal", "segal",
"segal", "segal", "segal"), CRMEventStartDate = structure(c(1430524800,
1435881600, 1425168000, 1438473600, 1417478400, 1435881600), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), CustomerID = c(7033, 7033, 9040, 17472,
35099, 39778), Application = c("Incoming Call", "Incoming Call",
"Incoming Call", "Incoming Call", "Incoming Call", "Incoming Call"
), CustomerType = c("Private", "Private", "Private", "Private",
"Private", "Private")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
and
Response <- structure(list(CampaignStrategyID = c(512345, 512345, 512345,
121212, 512345, 121212), CustomerID = c(836, 1070, 1390, 2970,
3479, 3646), ResponseDate = structure(c(1441065600, 1441065600,
1431129600, 1435881600, 1420502400, 1417392000), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), ResponseCode = c(1, 1, 1, 3, 2, 1)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
The code used to join the tables and compute the time difference is:
library(dplyr)  # provides inner_join(), mutate() and %>%
DF <- inner_join(Calls, Response, by = "CustomerID") %>%
  mutate(days = as.numeric(difftime(ResponseDate, CRMEventStartDate, units = "days")))
Answer 0 (score: 0)
Choose either of these. They will change from 0 to 1.
df1$New_Column <- names(df1)[max.col(df1, "first")]
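
On the two points in the question: joining only on CustomerID pairs every call of a customer with every response of that customer, so two calls and two responses produce four rows, including pairs where the response predates the call. A minimal sketch of one way to handle both issues follows. It assumes that "return 1 for the same day" means counting days inclusively (so every difference grows by 1, not just the zeros), and it uses small hypothetical tables (`calls`, `responses`) that mirror customer 7033 rather than the full data.

```r
library(dplyr)

# Hypothetical mini-tables mirroring customer 7033 from the question
calls <- data.frame(
  CustomerID = c(7033, 7033),
  CRMEventStartDate = as.POSIXct(c("2015-05-02", "2015-07-03"), tz = "UTC")
)
responses <- data.frame(
  CustomerID = c(7033, 7033),
  CampaignStrategyID = c(121212, 512345),
  ResponseDate = as.POSIXct(c("2015-07-03", "2015-05-02"), tz = "UTC")
)

# The join yields 2 x 2 = 4 rows because each call matches each response
# of the same customer (newer dplyr versions may warn about this).
result <- inner_join(calls, responses, by = "CustomerID") %>%
  # Keep only responses on or after the call date
  filter(ResponseDate >= CRMEventStartDate) %>%
  # Assumption: inclusive day count, so same-day pairs give 1
  mutate(days = as.numeric(difftime(ResponseDate, CRMEventStartDate,
                                    units = "days")) + 1)

print(result)
```

If only the zero differences should become 1 while other values stay unchanged, replacing the `+ 1` with something like `ifelse(days == 0, 1, days)` in a second `mutate()` step would be the alternative.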