I am merging two data.frames, dat1 and dat2, by temp, and the merge does not bring in all of the values from dat2. Why are the values from dat2 not merged correctly?
Sample data
dat1 <- data.frame(temp = seq(0, 33.2, 0.1))
dat2 <- structure(list(temp = c(6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7,
7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8, 8.1, 8.2, 8.3,
8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6,
9.7, 9.8, 9.9, 10, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7,
10.8, 10.9, 11, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8,
11.9, 12, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 12.7, 12.8, 12.9,
13, 13.1, 13.2), approx = c(193.53, 626.8, 1055.04, 1478.24,
1896.41, 2309.55, 2717.64, 3120.69, 3518.7, 3911.66, 4299.58,
4682.45, 5060.26, 5433.03, 5800.74, 6163.39, 6520.99, 6873.53,
7221.01, 7563.43, 7900.78, 8233.07, 8560.3, 8882.46, 9199.56,
9511.59, 9818.55, 10120.44, 10417.27, 10709.03, 10995.71, 11277.33,
11553.88, 11825.36, 12091.78, 12353.13, 12609.41, 12860.63, 13106.78,
13347.87, 13583.89, 13814.86, 14040.76, 14261.61, 14477.41, 14688.14,
14893.83, 15094.47, 15290.05, 15480.59, 15666.09, 15846.55, 16021.96,
16192.34, 16357.68, 16517.98, 16673.26, 16823.51, 16968.73, 17108.93,
17244.1, 17374.25, 17499.38, 17619.5, 17734.6, 17844.68, 17949.76,
18049.82, 18144.87, 18234.91)), row.names = c(NA, 70L), class = "data.frame")
Merge
library(dplyr)  # provides left_join()
dat <- left_join(dat1, dat2, by = "temp")
Output
dat[65:70, ]
temp approx
65 6.4 626.80
66 6.5 1055.04
67 6.6 NA
68 6.7 1896.41
69 6.8 NA
70 6.9 2717.64
Answer 0 (score: 3)
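Interestingly, identical(dat2$temp[4], 6.6) returns TRUE, but identical(dat1$temp[67], 6.6) returns FALSE. The values generated by seq() are not exactly the same doubles as the values typed in as literals, so the join, which compares keys for exact equality, finds no match for those rows.
This floating-point behavior is a known issue; see "Why are these numbers not equal?" or "floating point issue in R?" among the many similar posts.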
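To make the mismatch visible, here is a minimal sketch that prints the two values with more digits than R shows by default. The variable names from_seq and literal are only for illustration and do not appear in the question.

```r
# The 67th element of the sequence is computed arithmetically by seq(),
# whereas 6.6 below is parsed directly from the literal.
from_seq <- seq(0, 33.2, 0.1)[67]
literal  <- 6.6

print(from_seq)                  # default printing shows both as 6.6
print(literal)

# 17 significant digits are enough to tell any two distinct doubles apart
print(sprintf("%.17g", from_seq))
print(sprintf("%.17g", literal))

print(from_seq == literal)       # FALSE: exact comparison fails
```

Because a join compares keys exactly, a difference this small is enough to produce NA in the joined column.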
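If you build dat1 as dat1 <- data.frame(temp = round(seq(0, 33.2, 0.1), 2)), that should solve the problem. Also take a look at ?all.equal: all.equal(dat1$temp[67], 6.6) is TRUE, because it compares within a tolerance rather than exactly.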
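A rough sketch of the rounding fix follows. To keep it runnable without extra packages it uses base R's merge(..., all.x = TRUE) in place of left_join(), and dat2_small is a hypothetical four-row stand-in for dat2, not the full data from the question. Rounding both key columns is an extra precaution on my part; the answer above only rounds dat1.

```r
dat1 <- data.frame(temp = seq(0, 33.2, 0.1))

# Hypothetical small stand-in for dat2 (values copied from the question)
dat2_small <- data.frame(
  temp   = c(6.5, 6.6, 6.7, 6.8),
  approx = c(1055.04, 1478.24, 1896.41, 2309.55)
)

# Tolerant comparison succeeds even before rounding
print(isTRUE(all.equal(dat1$temp[67], 6.6)))   # TRUE

# Round both keys so equal temperatures become identical doubles
dat1$temp       <- round(dat1$temp, 2)
dat2_small$temp <- round(dat2_small$temp, 2)

# all.x = TRUE keeps every row of dat1, like a left join
dat <- merge(dat1, dat2_small, by = "temp", all.x = TRUE)
print(dat[!is.na(dat$approx), ])
```

After rounding, all four rows of dat2_small should find a partner in dat1, including 6.6 and 6.8, which were NA in the original output.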
Answer 1 (score: 2)
I converted the temp column in both data frames to a factor and then joined them. It works!
dat1$temp <- as.factor(dat1$temp)
dat2$temp <- as.factor(dat2$temp)
dat <- left_join(dat1, dat2, by = "temp")
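My reading of why this works, offered as a sketch rather than a definitive explanation: as.factor() labels numeric values through their character representation, which keeps at most 15 significant digits, so the two slightly different doubles end up with the same label. The catch is that temp is then no longer numeric. The names x, f and back below are illustrative only.

```r
x <- seq(0, 33.2, 0.1)[67]

print(identical(x, 6.6))                              # FALSE: the doubles differ
print(identical(as.character(x), as.character(6.6)))  # TRUE: same label

# A factor built from such values carries the text labels
f <- as.factor(c(x, 7))
print(levels(f))

# Convert via character to recover numbers; as.numeric(f) alone
# would return the integer level codes instead of the temperatures.
back <- as.numeric(as.character(f))
print(back)
```

If later code needs temp as a number (for plotting or arithmetic), converting back through as.character() as shown is one option.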