Merging xts objects in R: character values are converted to NA

Time: 2019-02-25 09:52:07

Tags: r xts

I have three xts objects:

logged <- xts::xts(x = loggedInUsers$loggedInUsers, order.by = Sys.time())
loadValue <- xts::xts(x = loadAvg, order.by = Sys.time())
hostname <- xts::xts(x = loadHost, order.by = Sys.time())

dput(hostname)
dput(loadValue)
dput(logged)

dput gives the following results:

 structure("deliverforgoodportal", .Dim = c(1L, 1L), index = structure(1551088127.27724, tzone = "", tclass = c("POSIXct",
    "POSIXt")), class = c("xts", "zoo"), .indexCLASS = c("POSIXct",
    "POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "")

structure(0, .Dim = c(1L, 1L), .Dimnames = list(NULL, "load"), index = structure(1551088127.27676, tzone = "", tclass = c("POSIXct",
"POSIXt")), .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct",
"POSIXt"), .indexTZ = "", tzone = "", class = c("xts", "zoo"))

structure(1, .Dim = c(1L, 1L), index = structure(1551088127.27637, tzone = "", tclass = c("POSIXct",
"POSIXt")), class = c("xts", "zoo"), .indexCLASS = c("POSIXct",
"POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "", tzone = "")

When I merge the three objects and print the result, the hostname is converted to NA:

tmp <- merge.xts(hostname, logged, loadValue, all = TRUE)
print(tmp)

The output is (hostname is NA):

                    hostname logged  load
2019-02-25 09:48:47       NA      1    NA
2019-02-25 09:48:47       NA     NA    0
2019-02-25 09:48:47       NA     NA    NA

Why does the hostname come out as NA?

1 Answer:

Answer 0 (score: 1)

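Keep in mind that an xts object is a time series stored as a matrix. A matrix can hold only one type of value, for example character or numeric, but not both. Your merge tries to combine a matrix of character values (hostname) with matrices of numeric values (logged and load). As a result, the hostname values are coerced to NA.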
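The coercion described above can be reproduced in base R without xts. The following is only a minimal sketch of the general mechanism, not of the internals of merge.xts; the variable names m and host_as_number are made up for illustration.

```r
# A matrix has a single storage type, so mixing a string with numbers
# converts everything to character.
m <- cbind("deliverforgoodportal", 1, 0)
typeof(m)  # "character"

# Going the other way, a non-numeric string cannot become a number:
# as.numeric() returns NA and emits a coercion warning.
host_as_number <- suppressWarnings(as.numeric("deliverforgoodportal"))
is.na(host_as_number)  # TRUE
```

This is why the hostname column ends up as NA once it has to share a numeric matrix with logged and load.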

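If you want to join this data, you have to use a data.frame (or data.table). Also note that your time values are not equal: they differ at the sub-second level. So if you want the values to end up on the same row when joining, first round the timestamps down with floor_date from the lubridate package. See the two examples below, one without and one with lubridate. I use the timetk package to convert the xts objects to tibbles, but depending on your source data that may not be necessary.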
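To see why the timestamps do not match, the index values from the dput output in the question can be compared directly. This is a small base R sketch; it uses floor() on the numeric epoch values as a stand-in for lubridate::floor_date, and the names idx, n_raw and n_floored are assumptions made for illustration.

```r
# Index values copied from the dput output in the question
# (seconds since 1970-01-01, with a fractional part).
idx <- c(hostname = 1551088127.27724,
         load     = 1551088127.27676,
         logged   = 1551088127.27637)

# The raw timestamps are all different, so a join on them matches nothing.
n_raw <- length(unique(idx))             # 3

# Dropping the fractional seconds makes the three keys identical.
n_floored <- length(unique(floor(idx)))  # 1

# Convert back to a date-time for display (UTC chosen here for illustration).
as.POSIXct(floor(idx), origin = "1970-01-01", tz = "UTC")
```

With three distinct keys a full join produces three rows; with one shared key the values land on a single row.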

With full_join, without lubridate:

library(timetk)
library(dplyr)

# Convert each xts object to a tibble; the timestamps go into an "index" column
hostname <- tk_tbl(hostname)
loadValue <- tk_tbl(loadValue)
logged <- tk_tbl(logged)

hostname %>% 
  full_join(loadValue) %>% 
  full_join(logged, 
            by = "index", 
            suffix = c("_hostname", "_logged"))

Joining, by = "index"
# A tibble: 3 x 4
  index               value_hostname        load value_logged
  <dttm>              <chr>                <dbl>        <dbl>
1 2019-02-25 10:48:47 deliverforgoodportal    NA           NA
2 2019-02-25 10:48:47 NA                       0           NA
3 2019-02-25 10:48:47 NA                      NA            1

With lubridate and left_join:

hostname %>% 
  mutate(index = lubridate::floor_date(index, unit = "seconds")) %>% 
  left_join(loadValue %>% mutate(index = lubridate::floor_date(index, unit = "seconds"))) %>% 
  left_join(logged %>% mutate(index = lubridate::floor_date(index, unit = "seconds")), 
            by = "index", 
            suffix = c("_hostname", "_logged"))    

Joining, by = "index"
# A tibble: 1 x 4
  index               value_hostname        load value_logged
  <dttm>              <chr>                <dbl>        <dbl>
1 2019-02-25 10:48:47 deliverforgoodportal     0            1
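If timetk, dplyr and lubridate are not available, the same idea can be sketched with plain data frames and base R merge(). This is an illustrative alternative under stated assumptions, not the approach used above: the data frames hostname_df, load_df and logged_df and the helper floor_index are hypothetical, built by hand from the values shown in the question.

```r
# Hypothetical data frames mirroring the three objects in the question.
hostname_df <- data.frame(index = 1551088127.27724,
                          hostname = "deliverforgoodportal",
                          stringsAsFactors = FALSE)
load_df   <- data.frame(index = 1551088127.27676, load = 0)
logged_df <- data.frame(index = 1551088127.27637, logged = 1)

# Hypothetical helper: drop fractional seconds so the join keys line up.
floor_index <- function(df) {
  df$index <- floor(df$index)
  df
}
dfs <- lapply(list(hostname_df, load_df, logged_df), floor_index)

# Merge pairwise on "index"; all = TRUE keeps unmatched rows (a full join).
combined <- Reduce(function(a, b) merge(a, b, by = "index", all = TRUE), dfs)

# Turn the numeric key back into a date-time (UTC chosen for illustration).
combined$index <- as.POSIXct(combined$index, origin = "1970-01-01", tz = "UTC")
print(combined)
```

Because a data frame stores each column separately, hostname stays a character column while load and logged stay numeric.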