I have read the replies to this question:
Using one data.frame to update another
The answer there seems to work, but only for integers. I am trying to update a data frame called "original":
original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") )
using a replacement data.frame to overwrite some of its values:
replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" , Value2 = "Orange")
The end result should look like this:
goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") )
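As in the previous question, the solution should work for tables of arbitrary length and/or size. Any ideas?
Answer 0 (score: 0)
Merge with all.x = TRUE, then overwrite the original values in the rows where the replacement value columns (Value1.y and Value2.y) are both non-NA: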
goal <- merge(original, replacement, by=c("Name", "Id") ,all.x=TRUE)
goal
#----
Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A 1 Yellow <NA> <NA> <NA>
2 Drug B 2 <NA> Blue Red Orange
goal [ !is.na(goal$Value1.y)&!is.na(goal$Value2.y), c("Value1.x", "Value2.x")] <-
goal [ !is.na(goal$Value1.y)&!is.na(goal$Value2.y), c("Value1.y", "Value2.y")]
goal
#------
Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A 1 Yellow <NA> <NA> <NA>
2 Drug B 2 Red Orange Red Orange
goal[-(5:6)]
#------
Name Id Value1.x Value2.x
1 Drug A 1 Yellow <NA>
2 Drug B 2 Red Orange
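The merge code above names the two value columns explicitly. For tables with an arbitrary number of value columns, one possible sketch is to match rows on the key columns and loop over whatever value columns the two data frames share. The helper name update_df and its keys argument are illustrative assumptions, not part of the original answer. Note that it behaves slightly differently from the merge code: it overwrites cell by cell, so any non-NA replacement cell wins, rather than requiring every replacement column in a row to be non-NA.

```r
# Hypothetical helper (not from the original answer): overwrite cells of
# `original` with the non-NA cells of `replacement`, matching rows on `keys`.
update_df <- function(original, replacement, keys) {
  # Value columns are the shared columns that are not keys
  value_cols <- setdiff(intersect(names(original), names(replacement)), keys)

  # Build one key string per row so multi-column keys can be matched
  key_orig <- do.call(paste, c(unname(as.list(original[keys])), sep = "\r"))
  key_repl <- do.call(paste, c(unname(as.list(replacement[keys])), sep = "\r"))

  idx   <- match(key_repl, key_orig)  # target row for each replacement row
  found <- !is.na(idx)                # replacement rows that have a match

  for (col in value_cols) {
    new_vals <- replacement[[col]]
    use <- found & !is.na(new_vals)   # only copy non-NA replacement cells
    original[[col]][idx[use]] <- new_vals[use]
  }
  original
}

original <- data.frame(Name = c("Drug A", "Drug B"), Id = c(1, 2),
                       Value1 = c("Yellow", NA), Value2 = c(NA, "Blue"),
                       stringsAsFactors = FALSE)
replacement <- data.frame(Name = "Drug B", Id = 2,
                          Value1 = "Red", Value2 = "Orange",
                          stringsAsFactors = FALSE)

result <- update_df(original, replacement, keys = c("Name", "Id"))
result
```

Like the merge approach, this assumes the value columns are character vectors rather than factors, and that the key columns identify each row of original uniquely.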
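As noted, this needs to be done with character vectors rather than factors. My data setup avoided this newbie trap (after I fell into it for the umpteenth time) by using stringsAsFactors = FALSE: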
(original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) ,
Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") ,
stringsAsFactors=FALSE))
(replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" ,
Value2 = "Orange" , stringsAsFactors=FALSE))
(goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) ,
Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") ,
stringsAsFactors=FALSE ))
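To illustrate why factors get in the way, here is a minimal sketch with made-up variable names (with_factor and as_char are assumptions for this example only). Assigning a value that is not an existing level to a factor column yields NA with an "invalid factor level" warning, whereas converting factor columns to character first lets the assignment go through.

```r
# Factor column: "Red" is not an existing level, so the assignment
# produces NA together with an "invalid factor level" warning.
with_factor <- data.frame(Value1 = c("Yellow", NA), stringsAsFactors = TRUE)
with_factor$Value1[2] <- "Red"
is.na(with_factor$Value1[2])

# If a data frame already contains factors, convert them to character
# before updating; other column types are left untouched.
as_char <- data.frame(Value1 = c("Yellow", NA), stringsAsFactors = TRUE)
as_char[] <- lapply(as_char, function(x) if (is.factor(x)) as.character(x) else x)
as_char$Value1[2] <- "Red"
as_char$Value1
```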
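This can also be set globally (before any data.frame or other data-input calls). Some R shops run with this option all the time, and if you do data management in R, I think it is wise to adopt it as your standard way of working: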
options(stringsAsFactors=FALSE)
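A quick way to confirm the effect of the option is to build a data frame and check its columns for factors, as in the sketch below (the variable names old and df are assumptions for this example). Since R 4.0.0 the default for stringsAsFactors is already FALSE, so on current versions the check should pass even without setting the option.

```r
# Set the option, remembering the previous value so it can be restored
old <- options(stringsAsFactors = FALSE)

df <- data.frame(Name = "Drug B", Id = 2, Value1 = "Red", Value2 = "Orange")

# TRUE for any column that was silently converted to a factor
sapply(df, is.factor)

# Restore the previous setting
options(old)
```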