Using one data.frame to update another (character vectors)

Asked: 2016-03-01 17:44:26

Tags: r indexing dataframe

I have read the answers to this question:

Using one data.frame to update another

The answer there seems to work, but only for integers. I am trying to update a data frame called "original":

original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") )

using a replacement data.frame to overwrite some of its values:

replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" , Value2 = "Orange")

The end result should look like this:

goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") )

As with the previous question, the solution should work for tables of arbitrary length and/or size. Any ideas?

1 answer:

Answer 0: (score: 0)

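Merge with all.x = TRUE, then overwrite the original value columns (Value1.x, Value2.x) in the rows where both merged-in replacement columns (Value1.y, Value2.y) are non-NA: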

goal <- merge(original, replacement, by=c("Name", "Id") ,all.x=TRUE)
goal
#----
    Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A  1   Yellow     <NA>     <NA>     <NA>
2 Drug B  2     <NA>     Blue      Red   Orange

goal[!is.na(goal$Value1.y) & !is.na(goal$Value2.y), c("Value1.x", "Value2.x")] <-
  goal[!is.na(goal$Value1.y) & !is.na(goal$Value2.y), c("Value1.y", "Value2.y")]
goal
#------
    Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A  1   Yellow     <NA>     <NA>     <NA>
2 Drug B  2      Red   Orange      Red   Orange

goal[-(5:6)]
#------
    Name Id Value1.x Value2.x
1 Drug A  1   Yellow     <NA>
2 Drug B  2      Red   Orange
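
As a complement to the merge approach above, here is a minimal sketch of a match-based update that works for any number of value columns and keeps the original column names. This is a different technique from the answer's merge, not the answer author's method. The helper name update_rows and its keys argument are hypothetical, introduced only for illustration. It assumes character (not factor) columns and key combinations that are unique in the original table.

```r
# Hypothetical helper (not from the original answer): overwrite values in
# `original` with the non-NA values from matching rows of `replacement`.
# Assumes character columns and keys that are unique within `original`.
update_rows <- function(original, replacement, keys = c("Name", "Id")) {
  # Collapse the key columns into one string per row so that
  # multi-column keys can be matched with match()
  key_of <- function(df) do.call(paste, c(df[keys], sep = "\r"))
  idx <- match(key_of(replacement), key_of(original))
  found <- !is.na(idx)  # replacement rows that exist in `original`

  # Only update non-key columns that are present in both tables
  value_cols <- intersect(setdiff(names(replacement), keys), names(original))
  for (col in value_cols) {
    new_vals <- replacement[[col]][found]
    keep <- !is.na(new_vals)  # never overwrite an existing value with NA
    original[[col]][idx[found][keep]] <- new_vals[keep]
  }
  original
}

original <- data.frame(Name = c("Drug A", "Drug B"), Id = c(1, 2),
                       Value1 = c("Yellow", NA), Value2 = c(NA, "Blue"),
                       stringsAsFactors = FALSE)
replacement <- data.frame(Name = "Drug B", Id = 2, Value1 = "Red",
                          Value2 = "Orange", stringsAsFactors = FALSE)

result <- update_rows(original, replacement)
print(result)
```

Unlike the row filter in the merge version, which requires both replacement columns to be non-NA, this sketch decides column by column, so a replacement row that fills only Value1 would still update Value1 and leave Value2 untouched. Whether that is the desired behaviour depends on your data.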

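As noted, this needs to be done with character vectors rather than factors. My data setup avoids this newbie trap (after I fell into it countless times myself) by using stringsAsFactors = FALSE: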

 (original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , 
                         Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") , 
                         stringsAsFactors=FALSE))

(replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" , 
                           Value2 = "Orange" , stringsAsFactors=FALSE))

(goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , 
                    Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") , 
                    stringsAsFactors=FALSE ))

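The same setting can be applied globally (before any data.frame or other data-input calls). Some R shops run with this option set at all times, and if you do data management in R, I think it is wise to make it your standard way of working: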

  options(stringsAsFactors=FALSE)
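
To illustrate why factors are the trap here, the sketch below shows what happens when a new string is assigned into a factor versus a character vector. The variable names (f, f2, s, df) are illustrative and not from the original post. Note that since R 4.0.0, data.frame() defaults to stringsAsFactors = FALSE, so the pitfall mainly affects older R versions or code that creates factors explicitly.

```r
# A factor only accepts values that are already among its levels.
f <- factor(c("Yellow", NA))
levels(f)                      # "Yellow" is the only level

# Assigning an unknown level produces NA (R also emits a warning,
# suppressed here so the script runs quietly).
f2 <- f
suppressWarnings(f2[2] <- "Red")
is.na(f2[2])

# A character vector has no such restriction.
s <- as.character(f)
s[2] <- "Red"
s

# Converting every factor column of an existing data frame to character:
df <- data.frame(Value1 = factor(c("Yellow", NA)), Id = c(1, 2))
df[] <- lapply(df, function(x) if (is.factor(x)) as.character(x) else x)
class(df$Value1)
```

This is likely why the linked answer appeared to work "only for integers": numeric columns are never converted to factors, while string columns were converted by default in older R versions.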