How to combine vectors from multiple data.frames into one without duplicates?

Question

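I have position index vectors stored in several data.frame objects, but the columns appear in a different order in each data.frame. I want to merge these data.frame objects into one common data.frame with a specific row order and with no duplicated rows. Does anyone know an easy way to do this? Can anyone suggest a workable approach?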

Data

v1 <- data.frame(
  foo=c(1,2,3),
  bar=c(1,2,2),
  bleh=c(1,3,0))

v2 <-  data.frame(
  bar=c(1,2,3),
  foo=c(1,2,0),
  bleh=c(3,3,4))

v3 <-  data.frame(
  bleh=c(1,2,3,4),
  foo=c(1,1,2,0),
  bar=c(0,1,2,3))

After merging

Initial output:

initial_output <- data.frame(
  foo=c(1,2,3,1,2,0,1,1,2,0),
  bar=c(1,2,2,1,2,3,0,1,2,3),
  bleh=c(1,3,0,3,3,4,1,2,3,4)
)

After removing duplicates:

rmDuplicate_output <- data.frame(
  foo=c(1,2,3,1,0,1,1),
  bar=c(1,2,2,1,3,0,1),
  bleh=c(1,3,0,3,4,1,2)
)

Final desired output:

final_output <- data.frame(
  foo=c(1,1,1,1,2,3,0),
  bar=c(0,1,1,1,2,2,3),
  bleh=c(1,1,2,3,3,0,4)
)

How can I get the final desired output easily? Is there an efficient way to perform this kind of operation on data.frame objects? Thanks.

Answer 1

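You can also use the mget/ls combination to fetch the data frames programmatically (without typing the individual names), and then use data.table's rbindlist and unique functions/methods for better efficiency.

library(data.table)
unique(rbindlist(mget(ls(pattern = "v\\d+")), use.names = TRUE))
#    foo bar bleh
# 1:   1   1    1
# 2:   2   2    3
# 3:   3   2    0
# 4:   1   1    3
# 5:   0   3    4
# 6:   1   0    1
# 7:   1   1    2

As a side note, it is usually better to keep multiple data.frames in a single list so that you have better control over them.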
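The side note about keeping the data frames in a list can be sketched in base R as follows. This is only an illustration: the list name `frames` and the variable names `combined` and `deduplicated` are assumptions, not part of the original answer, and it relies on the fact that `rbind` for data frames matches columns by name.

```r
# Hypothetical list holding the three data frames from the question
frames <- list(
  v1 = data.frame(foo = c(1, 2, 3), bar = c(1, 2, 2), bleh = c(1, 3, 0)),
  v2 = data.frame(bar = c(1, 2, 3), foo = c(1, 2, 0), bleh = c(3, 3, 4)),
  v3 = data.frame(bleh = c(1, 2, 3, 4), foo = c(1, 1, 2, 0), bar = c(0, 1, 2, 3))
)

# rbind on data frames matches columns by name, so the differing
# column order across the three inputs is handled automatically
combined <- do.call(rbind, frames)

# unique() on a data frame drops every row that repeats an earlier row
deduplicated <- unique(combined)
nrow(deduplicated)
```

With the data frames in a list, adding a fourth input only means adding one more list element; nothing else in the pipeline changes.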

Answer 2

Here is one solution:

# combine dataframes
df = rbind(v1, v2, v3)

# remove duplicated
df = df[! duplicated(df),]

# sort by 'bar' column
df[order(df$bar),]
#     foo bar bleh
# 7   1   0    1
# 1   1   1    1
# 4   1   1    3
# 8   1   1    2
# 2   2   2    3
# 3   3   2    0
# 6   0   3    4
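Sorting by 'bar' alone leaves rows with equal 'bar' values in their original order, which differs slightly from the row order of final_output in the question. The sketch below breaks ties by 'foo' and then 'bleh'. That tie-breaking rule is an assumption: it happens to reproduce the posted final_output, but the question never states it.

```r
v1 <- data.frame(foo = c(1, 2, 3), bar = c(1, 2, 2), bleh = c(1, 3, 0))
v2 <- data.frame(bar = c(1, 2, 3), foo = c(1, 2, 0), bleh = c(3, 3, 4))
v3 <- data.frame(bleh = c(1, 2, 3, 4), foo = c(1, 1, 2, 0), bar = c(0, 1, 2, 3))

# Stack the data frames (columns are matched by name) and drop repeated rows
combined <- unique(rbind(v1, v2, v3))

# Assumed ordering: sort by bar, then foo, then bleh to break ties
result <- combined[order(combined$bar, combined$foo, combined$bleh), ]

# Reset the row names so they run 1..n after reordering
rownames(result) <- NULL
result
```

If a different tie-breaking rule is wanted, only the arguments to order() need to change.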

Answer 3

We can use bind_rows from dplyr, remove the duplicates with distinct, and arrange by 'bar'.

library(dplyr)
bind_rows(v1, v2, v3) %>%
             distinct %>%
             arrange(bar)
#    foo bar bleh
#1   1   0    1
#2   1   1    1
#3   1   1    3
#4   1   1    2
#5   2   2    3
#6   3   2    0
#7   0   3    4
