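Replace rows in one data frame with rows from another data frame

Time: 2019-02-16 05:38:01

Tags: r

I have 2 data frames, one of which contains missing values. The first data frame is as follows: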

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002))

# Name  Year
# DEX   2000
# DEX   2001
# REX   2000
# REX   2001
# REX   2002
# LEX   2001
# LEX   2002
# NEX    NA
# NEX   2001
# NEX   2002

The second data frame:

data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002))

# Name  Year
# NEX   2000
# NEX   2001
# NEX   2002

I want to replace the missing values in data frame data with the appropriately placed values from data frame data1.

The result should be:

# Name  Year
# DEX   2000
# DEX   2001
# REX   2000
# REX   2001
# REX   2002
# LEX   2001
# LEX   2002
# NEX   2000
# NEX   2001
# NEX   2002

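So either the 3 rows called NEX in data get replaced by the 3 rows in data1, or the two data frames get merged somehow so that the missing value is filled in from the appropriate row of data1. However, I don't know how to do that.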

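4 Answers:

Answer 0: (score: 0)

You can use a combination of left_join and anti_join (from dplyr).

First, I'm loading the data as character instead of factor, since factors can cause conflicts when binding rows.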

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex", "Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002 ),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

Now, something like:

library(dplyr)
data %>%
  filter(is.na(Year)) %>%
  select(-Year) %>%
  left_join(data1, by = "Name") %>%
  anti_join(data, by = c("Name", "Year")) %>%
  bind_rows(filter(data, !is.na(Year)))
#    Name Year
# 1   Nex 2000
# 2   Dex 2000
# 3   Dex 2001
# 4   Rex 2000
# 5   Rex 2001
# 6   Rex 2002
# 7   Lex 2001
# 8   Lex 2002
# 9   Nex 2001
# 10  Nex 2002

Since I didn't reorder anything, the row order is a bit off, but you can easily fix that with arrange.
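One possible way to restore the original grouping order with arrange is sketched below. It reuses the data and data1 definitions from this answer. The name `result` and the trick of ranking each name by its first appearance via match() on unique(data$Name) are illustrative assumptions, not part of the original answer. Sorting by Year within each name happens to reproduce the original order for this example.

```r
library(dplyr)

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

result <- data %>%
  filter(is.na(Year)) %>%
  select(-Year) %>%
  left_join(data1, by = "Name") %>%
  anti_join(data, by = c("Name", "Year")) %>%
  bind_rows(filter(data, !is.na(Year))) %>%
  # rank each Name by its first appearance in data, then sort by Year within it
  arrange(match(Name, unique(data$Name)), Year)
```

A plain arrange(Name, Year) would also work if alphabetical order of names is acceptable.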

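Answer 1: (score: 0)

Assuming the row order is meaningful here (within each group in data, and as a single standalone group in data1), you can add an id column to join on.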

library(dplyr)

data <- data.frame( Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex",
                             "Nex","Nex","Nex"), Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002 ))

data1 <- data.frame( Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002))

data <- data %>%
  group_by(Name) %>% 
  mutate( # creating index by groups to join
    Index = 1:n()
  ) %>% 
  ungroup()

data1 <- data1 %>% 
  mutate( # index, no groups
    Index = 1:n()
  )

dataFill <- data %>% 
  left_join(data1, by = c("Name", "Index")) %>% 
  mutate( # if_else will help us fill in values that are missing selectively
    YearComplete = if_else(
      is.na(Year.x),
      Year.y,
      Year.x
    )
  )
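As a more compact variant of the same index-join idea, the sketch below uses coalesce to fill the gaps and then keeps only the Name and Year columns. The name `filled`, the `.fill` suffix, and the use of row_number() and stringsAsFactors = FALSE are assumptions made for illustration. It relies on the same premise as above, namely that the rows of data1 line up positionally with the rows of the matching group in data.

```r
library(dplyr)

data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

filled <- data %>%
  group_by(Name) %>%
  mutate(Index = row_number()) %>%   # position within each Name group
  ungroup() %>%
  left_join(mutate(data1, Index = row_number()),
            by = c("Name", "Index"),
            suffix = c("", ".fill")) %>%   # data1's Year becomes Year.fill
  mutate(Year = coalesce(Year, Year.fill)) %>%   # keep Year, fall back to Year.fill
  select(Name, Year)
```

Here coalesce takes the first non-missing value per row, so existing years in data are never overwritten.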

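Answer 2: (score: 0)

How about this? It assumes that (1) you always have a NEX data table without NAs, and (2) its row order is always the same as in the data table with NAs.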

data$Year[data$Name == "Nex" ] <- data1$Year
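A slightly more defensive version of the same positional replacement might look like the sketch below. The helper names `idx` and `missing` are hypothetical, and the length check is an added safeguard rather than something the answer itself proposes. It still depends on both assumptions listed above.

```r
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

# positions of the "Nex" rows in data
idx <- which(data$Name == "Nex")

# positional replacement only makes sense if both sides have the same length
stopifnot(length(idx) == nrow(data1))

# overwrite only the missing years, keeping the values data already has
missing <- is.na(data$Year[idx])
data$Year[idx[missing]] <- data1$Year[missing]
```

Restricting the assignment to the NA positions means a mismatch between the two tables cannot silently change years that were already present.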

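Answer 3: (score: 0)

I think there is a simple way to do this. Given that we may not know how much the row order matters, first filter out the "Nex" rows by Name, then stack one data frame on top of the other with bind_rows: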

library(tidyverse)

data %>% 
  filter(Name != "Nex") %>%
  bind_rows(data1)

   Name Year
1   Dex 2000
2   Dex 2001
3   Rex 2000
4   Rex 2001
5   Rex 2002
6   Lex 2001
7   Lex 2002
8   Nex 2000
9   Nex 2001
10  Nex 2002
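For readers without the tidyverse installed, roughly the same filter-and-stack idea can be sketched in base R. Matching on every name present in data1 through %in%, instead of hard-coding "Nex", is an assumption added for illustration, as is the name `result`.

```r
data <- data.frame(Name = c("Dex","Dex","Rex","Rex","Rex","Lex","Lex","Nex","Nex","Nex"),
                   Year = c(2000, 2001, 2000, 2001, 2002, 2001, 2002, NA, 2001, 2002),
                   stringsAsFactors = FALSE)
data1 <- data.frame(Name = c("Nex","Nex","Nex"), Year = c(2000, 2001, 2002),
                    stringsAsFactors = FALSE)

# drop every row of data whose Name also occurs in data1, then append data1
result <- rbind(data[!data$Name %in% data1$Name, ], data1)

# reset row names so they run 1..n after stacking
rownames(result) <- NULL
```

Like the bind_rows version, this discards all existing rows for the replaced names, so it fits only when data1 is the complete set of rows for those names.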