Look up values in one data.frame and transfer values from another column

Time: 2014-12-05 15:28:12

Tags: r

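I'm not sure I can explain this properly, but what I want to achieve is quite simple.

This is the first data.frame. What matters to me is the first column, "V1":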

    > dput(Data1)
structure(list(V1 = c(10L, 5L, 3L, 9L, 1L, 2L, 6L, 4L, 8L, 7L
), V2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "NA", class = "factor"), 
    V3 = c(18L, 17L, 13L, 20L, 15L, 12L, 16L, 11L, 14L, 19L)), .Names = c("V1", 
"V2", "V3"), row.names = c(NA, -10L), class = "data.frame")

The second data.frame:

   > dput(Data2)
structure(list(Names = c(9L, 10L, 6L, 4L, 2L, 7L, 5L, 3L, 1L, 
8L), Herat = c(30L, 29L, 21L, 25L, 24L, 22L, 28L, 27L, 23L, 26L
), Grobpel = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L), .Label = "NA", class = "factor"), Hassynch = c(19L, 12L, 
15L, 20L, 11L, 13L, 14L, 16L, 18L, 17L)), .Names = c("Names", 
"Herat", "Grobpel", "Hassynch"), row.names = c(NA, -10L), class = "data.frame")

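The values in V1 of the first data.frame can be found in column 1 (Names) of the second data.frame. For each of them, I want to copy the corresponding value from column 4 (Hassynch) and put it in the second column (V2) of the first data.frame.

What is the fastest way to do this?

2 Answers:

Answer 0: (score: 1)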

library(dplyr)
left_join(Data1, Data2, by=c("V1"="Names"))
#    V1 V2 V3 Herat Grobpel Hassynch
# 1  10 NA 18    29      NA       12
# 2   5 NA 17    28      NA       14
# 3   3 NA 13    27      NA       16
# 4   9 NA 20    30      NA       19
# 5   1 NA 15    23      NA       18
# 6   2 NA 12    24      NA       11
# 7   6 NA 16    21      NA       15
# 8   4 NA 11    25      NA       20
# 9   8 NA 14    26      NA       17
# 10  7 NA 19    22      NA       13

# if you don't want V2 and V3, you could
left_join(Data1, Data2, by=c("V1"="Names")) %>%
  select(-V2, -V3)
#    V1 Herat Grobpel Hassynch
# 1  10    29      NA       12
# 2   5    28      NA       14
# 3   3    27      NA       16
# 4   9    30      NA       19
# 5   1    23      NA       18
# 6   2    24      NA       11
# 7   6    21      NA       15
# 8   4    25      NA       20
# 9   8    26      NA       17
# 10  7    22      NA       13
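If the goal is only to fill `V2` of `Data1` with the matching `Hassynch` value, rather than appending every column of `Data2`, a base R lookup with `match()` is one possible approach. The sketch below rebuilds the question's data in simplified form (the all-"NA" factor columns are left out, which is an assumption made for brevity) and assumes each value in `Names` occurs only once.

```r
# Simplified copies of the question's data (the "NA" factor columns are omitted)
Data1 <- data.frame(
  V1 = c(10L, 5L, 3L, 9L, 1L, 2L, 6L, 4L, 8L, 7L),
  V3 = c(18L, 17L, 13L, 20L, 15L, 12L, 16L, 11L, 14L, 19L)
)
Data2 <- data.frame(
  Names    = c(9L, 10L, 6L, 4L, 2L, 7L, 5L, 3L, 1L, 8L),
  Hassynch = c(19L, 12L, 15L, 20L, 11L, 13L, 14L, 16L, 18L, 17L)
)

# match() returns, for each V1, the row position of that value in Names
idx <- match(Data1$V1, Data2$Names)

# Index Hassynch by those positions; a key with no match would yield NA
Data1$V2 <- Data2$Hassynch[idx]

# Restore the V1, V2, V3 column order used in the question
Data1 <- Data1[, c("V1", "V2", "V3")]
```

Unlike a join followed by `select()`, this keeps the rows of `Data1` in their original order and touches only the one column of interest.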

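Answer 1: (score: 0)

Here is a toy example I used a while ago to illustrate merge. left_join from dplyr works well too, and data.table almost certainly offers another option.

You can subset the reference data frame so that it contains only the key variable and the value variable; that way you don't end up with an unmanageable data frame.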

id<-as.numeric((1:5))
m<-c("a","a","a","","")
n<-c("","","b","b","b")
dfm<-data.frame(cbind(id,m))
head(dfm)
  id m
1  1 a
2  2 a
3  3 a
4  4  
5  5  
dfn<-data.frame(cbind(id,n))
head(dfn)
  id n
1  1  
2  2  
3  3 b
4  4 b
5  5 b

dfm$id<-as.numeric(dfm$id)
dfn$id<-as.numeric(dfn$id)

dfm<-subset(dfm,id<4)
head(dfm)
  id m
1  1 a
2  2 a
3  3 a

dfn<-subset(dfn,id!=1 & id!=2)
head(dfn)
  id n
3  3 b
4  4 b
5  5 b

df.all<-merge(dfm,dfn,by="id",all=TRUE)
head(df.all)
  id    m    n
1  1    a <NA>
2  2    a <NA>
3  3    a    b
4  4 <NA>    b
5  5 <NA>    b

df.all.m<-merge(dfm,dfn,by="id",all.x=TRUE)
head(df.all.m)
  id m    n
1  1 a <NA>
2  2 a <NA>
3  3 a    b

df.all.n<-merge(dfm,dfn,by="id",all.y=TRUE)
head(df.all.n)
  id    m n
1  3    a b
2  4 <NA> b
3  5 <NA> b
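Applied to the question's data, the suggestion to subset the reference data frame before merging might look like the sketch below. The data is rebuilt in simplified form (the all-"NA" factor columns are omitted, an assumption made for brevity), and the names `lookup` and `result` are illustrative only.

```r
# Simplified copies of the question's data (the "NA" factor columns are omitted)
Data1 <- data.frame(
  V1 = c(10L, 5L, 3L, 9L, 1L, 2L, 6L, 4L, 8L, 7L),
  V3 = c(18L, 17L, 13L, 20L, 15L, 12L, 16L, 11L, 14L, 19L)
)
Data2 <- data.frame(
  Names    = c(9L, 10L, 6L, 4L, 2L, 7L, 5L, 3L, 1L, 8L),
  Herat    = c(30L, 29L, 21L, 25L, 24L, 22L, 28L, 27L, 23L, 26L),
  Hassynch = c(19L, 12L, 15L, 20L, 11L, 13L, 14L, 16L, 18L, 17L)
)

# Keep only the key column and the value column of the reference data
lookup <- Data2[, c("Names", "Hassynch")]

# Left join on V1 == Names; all.x = TRUE keeps every row of Data1
result <- merge(Data1, lookup, by.x = "V1", by.y = "Names", all.x = TRUE)
```

Note that `merge()` sorts the result by the key column by default, so the rows of `result` are ordered by `V1` rather than in the original order of `Data1`.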