I have two data frames, one of which contains a missing value. The first data frame looks like this:
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002))
# Name Year
# DEX 2000
# DEX 2001
# REX 2000
# REX 2001
# REX 2002
# LEX 2001
# LEX 2002
# NEX NA
# NEX 2001
# NEX 2002
The second data frame:
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002))
# Name Year
# NEX 2000
# NEX 2001
# NEX 2002
I want to replace the missing value in data frame data with the appropriately placed value from data frame data1.
The result should be:
# Name Year
# DEX 2000
# DEX 2001
# REX 2000
# REX 2001
# REX 2002
# LEX 2001
# LEX 2002
# NEX 2000
# NEX 2001
# NEX 2002
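It would be fine either to replace the 3 rows called NEX in data with the 3 rows from data1, or to merge the two data frames somehow so that the missing value is filled in from the appropriate row of data1. However, I don't know how to do this.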
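Answer 0: (score: 0)
You can combine left_join and anti_join (both from dplyr).
First, I am loading the data with Name as character rather than factor, because factors with different levels can cause conflicts when the rows are bound together.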
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex", "Nex","Nex","Nex"),
Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002 ),
stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
stringsAsFactors = FALSE)
Now something like:
library(dplyr)
data %>%
filter(is.na(Year)) %>%
select(-Year) %>%
left_join(data1, by = "Name") %>%
anti_join(data, by = c("Name", "Year")) %>%
bind_rows(filter(data, !is.na(Year)))
# Name Year
# 1 Nex 2000
# 2 Dex 2000
# 3 Dex 2001
# 4 Rex 2000
# 5 Rex 2001
# 6 Rex 2002
# 7 Lex 2001
# 8 Lex 2002
# 9 Nex 2001
# 10 Nex 2002
The row order is a bit off because I did not re-sort anything, but you can easily fix that with arrange.
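Sorting by Name alone would order the names alphabetically rather than restore the original sequence. One possible way around that, sketched here under the assumption that a helper column may be added before the join and dropped afterwards (`row_id`, `indexed` and `filled` are hypothetical names, not part of the original answer):

```r
library(dplyr)

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

# row_id is a hypothetical helper column that remembers each row's original position
indexed <- mutate(data, row_id = row_number())

filled <- indexed %>%
  filter(is.na(Year)) %>%                       # rows that need a value
  select(-Year) %>%
  left_join(data1, by = "Name") %>%             # candidate years from data1
  anti_join(indexed, by = c("Name", "Year")) %>% # drop years already present in data
  bind_rows(filter(indexed, !is.na(Year))) %>%
  arrange(row_id) %>%                           # restore the original order
  select(Name, Year)
```

This sketch relies on there being a single missing year per name; with several missing years in one group, the join would attach more than one candidate to each `row_id`.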
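Answer 1: (score: 0)
Assuming the row order is meaningful here (within each group in data, and as a single standalone group in data1), you can add an id column to join on.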
library(dplyr)
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002))
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002))
data <- data %>%
group_by(Name) %>%
mutate( # creating index by groups to join
Index = 1:n()
) %>%
ungroup()
data1 <- data1 %>%
mutate( # index, no groups
Index = 1:n()
)
dataFill <- data %>%
left_join(data1, by = c("Name", "Index")) %>%
mutate( # if_else will help us fill in values that are missing selectively
YearComplete = if_else(
is.na(Year.x),
Year.y,
Year.x
)
)
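The joined result still carries Year.x, Year.y and Index alongside YearComplete. A minimal sketch of a tidier final step follows; it swaps in dplyr's `coalesce` for the `if_else` call (a substitution, not what the answer used) and drops the helper columns. The names `data_indexed`, `data1_indexed` and `dataTidy` are hypothetical.

```r
library(dplyr)

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

# Position of each row within its Name group
data_indexed <- data %>%
  group_by(Name) %>%
  mutate(Index = row_number()) %>%
  ungroup()

# data1 is treated as one group, so a plain row number is enough
data1_indexed <- mutate(data1, Index = row_number())

dataTidy <- data_indexed %>%
  left_join(data1_indexed, by = c("Name", "Index")) %>%
  mutate(Year = coalesce(Year.x, Year.y)) %>%  # keep Year.x, fall back to Year.y when it is NA
  select(Name, Year)
```

As in the answer, this only works if the rows of data1 line up positionally with the matching group in data.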
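Answer 2: (score: 0)
How about this? It assumes that (1) you always have a NEX data table without NAs, and (2) its rows are always in the same order as the NEX rows of the data table that contains the NA.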
data$Year[data$Name == "Nex" ] <- data1$Year
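The one-liner silently recycles or errors if the two assumptions do not hold. A slightly more defensive base R sketch, which checks the row count and overwrites only the missing entries (`nex_rows` and `is_missing` are hypothetical helper names), might look like this:

```r
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

# Positions of the Nex rows in data
nex_rows <- which(data$Name == "Nex")

# Guard for assumption (1)/(2): both tables must have the same number of Nex rows
stopifnot(length(nex_rows) == nrow(data1))

# Overwrite only the entries that are actually missing, position by position
is_missing <- is.na(data$Year[nex_rows])
data$Year[nex_rows[is_missing]] <- data1$Year[is_missing]
```

The ordering assumption still cannot be checked from the data alone, so it remains an assumption here as well.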
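Answer 3: (score: 0)
I think there is a simple way to do this. Since we may not know whether the row order matters, first filter out the "Nex" rows by Name, then stack one data frame on top of the other with bind_rows: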
library(tidyverse)
data %>%
filter(Name != "Nex") %>%
bind_rows(data1)
Name Year
1 Dex 2000
2 Dex 2001
3 Rex 2000
4 Rex 2001
5 Rex 2002
6 Lex 2001
7 Lex 2002
8 Nex 2000
9 Nex 2001
10 Nex 2002
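If the replacement frame could cover names other than "Nex", the hard-coded filter can be swapped for a lookup against data1$Name. A sketch, assuming data1 always holds the complete set of rows for every name it contains (`combined` is a hypothetical name):

```r
library(dplyr)

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

combined <- data %>%
  filter(!Name %in% data1$Name) %>%  # drop every name that data1 replaces
  bind_rows(data1)                   # append the complete rows from data1
```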