Merging two datasets based on intervals

Date: 2017-06-29 02:45:10

Tags: merge

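I would like to know how to merge these two datasets by a and b. Column a in dataset f is the lower bound of an interval, so I need to merge 1.5 from g where f is 1, 4.4 from g where f is 4, 9.8 from g where f is 9, and so on.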

a<-seq(1:10)
b<-c("a","b","a","b","a","a","a","b","b","a")
f<-data.frame(a,b)

a<-c(1.5,1.4,2.3,2.2,4.4,4,5,6.6,9.8,4.1,4.6,5.5)
b<-c("a","b","b","b","a","b","a","b","a","b","a","b")
m<-seq(1:12)
g<-data.frame(a,b,m)

1 answer:

Answer 0 (score: 0)

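I'm not sure exactly what you're after here, but the floor() function should do what you need. You may also want to look at the tidyverse, and dplyr in particular, for data manipulation.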
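As a minimal sketch of what floor() does here, assuming the interval lower bounds are consecutive integers as in the question: each value from g is mapped to the lower bound of the interval it falls in. Base R's findInterval() is shown alongside as an alternative that does not rely on the bounds being integers. The names lower_bounds, values, key_floor and key_interval are illustrative only and do not appear in the question.

```r
# Lower bounds of the intervals (column a of f in the question)
lower_bounds <- 1:10
# Values to assign to an interval (column a of g in the question)
values <- c(1.5, 1.4, 2.3, 2.2, 4.4, 4, 5, 6.6, 9.8, 4.1, 4.6, 5.5)

# floor() rounds down, which equals the lower bound when bounds are integers
key_floor <- floor(values)

# findInterval() returns the index of the interval each value falls in;
# it requires lower_bounds to be sorted in increasing order
key_interval <- lower_bounds[findInterval(values, lower_bounds)]

# Check that both approaches give the same keys for this data
identical(as.numeric(key_interval), key_floor)
```

With non-integer or unevenly spaced bounds, floor() would no longer give the right key, while the findInterval() lookup would still apply as long as the bounds are sorted.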

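It's not entirely clear what output you expect (the b columns differ slightly after the merge). Do you only want matching records? If you don't care about non-matching records, remove the all.x and all.y arguments. I'd also suggest that renaming the columns may be in order: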

a <- seq(1:10)
b <- c("a", "b", "a", "b", "a", "a", "a", "b", "b", "a")
f <- data.frame(a, b)

a <- c(1.5, 1.4, 2.3, 2.2, 4.4, 4, 5, 6.6, 9.8, 4.1, 4.6, 5.5)
b <- c("a", "b", "b", "b", "a", "b", "a", "b", "a", "b", "a", "b")
m <- seq(1:12)
g <- data.frame(a, b, m)

## floor function takes care of rounding down
g$c <- floor(g$a)

merge(f, g, by.x = "a", by.y = "c", all.x = TRUE, all.y = TRUE)
#> Warning in merge.data.frame(f, g, by.x = "a", by.y = "c", all.x = TRUE, :
#> column name 'a' is duplicated in the result
#>     a b.x   a  b.y  m
#> 1   1   a 1.5    a  1
#> 2   1   a 1.4    b  2
#> 3   2   b 2.3    b  3
#> 4   2   b 2.2    b  4
#> 5   3   a  NA <NA> NA
#> 6   4   b 4.4    a  5
#> 7   4   b 4.0    b  6
#> 8   4   b 4.6    a 11
#> 9   4   b 4.1    b 10
#> 10  5   a 5.5    b 12
#> 11  5   a 5.0    a  7
#> 12  6   a 6.6    b  8
#> 13  7   a  NA <NA> NA
#> 14  8   b  NA <NA> NA
#> 15  9   b 9.8    a  9
#> 16 10   a  NA <NA> NA
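The duplicated column name warning in the output above can be avoided by renaming the columns before merging. The following is a sketch only, assuming that just the matching records are wanted (so all.x and all.y are dropped). The column names lower, value, b_f and b_g are made up for illustration.

```r
# Same data as in the question, with distinct column names
f <- data.frame(
  lower = 1:10,
  b_f = c("a", "b", "a", "b", "a", "a", "a", "b", "b", "a")
)
g <- data.frame(
  value = c(1.5, 1.4, 2.3, 2.2, 4.4, 4, 5, 6.6, 9.8, 4.1, 4.6, 5.5),
  b_g = c("a", "b", "b", "b", "a", "b", "a", "b", "a", "b", "a", "b"),
  m = 1:12
)

# Assign each value in g to the lower bound of its interval
g$lower <- floor(g$value)

# Inner join on the shared key column; rows of f without a match are dropped
merged <- merge(f, g, by = "lower")
merged
```

If rows should additionally match on b, the two b columns could keep the same name and be added to by, though the question does not make clear whether that is intended.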