Question

I have two databases from which I need to compile information.

Let's say the first one (Db1) looks like this:

Col1    Col2    Col3  
P1      2000    Type1    
P1      2000    Type2
P1      2001    Type2
P2      2000    Type1
P2      2001    Type1
P3      2003    Type3

And the second one (Db2) looks like this (the same values, except that Col3 only ever holds Type4):

Col1    Col2    Col3  
P1      2000    Type4    
P1      2000    Type4
P1      2001    Type4
P2      2000    Type4
P2      2001    Type4
P3      2003    Type4

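I want to create one new database for each of Type1, Type2 and Type3, and add to it the Type4 rows from Db2 that match on Col1 and Col2. The first step is simple: subset Db1 by Col3 to get the rows of Type1, Type2 or Type3.

Then I have to go to Db2 and get all the rows whose Col1 and Col2 values are the same as those of the Type1 rows in Db1. So, for Type1, I need only the Type4 rows for the combinations P1-2000, P2-2000 and P2-2001 (that is, filtered by Type1). How can I subset by that?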

Expected output (for Type1):

Col1    Col2    Col3  
P1      2000    Type1    
P2      2000    Type1
P2      2001    Type1
P1      2000    Type4    
P1      2000    Type4
P2      2000    Type4
P2      2001    Type4

Answer 1

Using base R only:

lines =
'Col1    Col2    Col3  
  P1      2000    Type1    
  P1      2000    Type2
  P1      2001    Type2
  P2      2000    Type1
  P2      2001    Type1
  P3      2003    Type3'

Db1 = read.table(textConnection(lines), header = TRUE)


lines =
'Col1    Col2    Col3  
  P1      2000    Type4    
  P1      2000    Type4
  P1      2001    Type4
  P2      2000    Type4
  P2      2001    Type4
  P3      2003    Type4'

Db2 = read.table(textConnection(lines), header = TRUE)


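# Filter Db1: keep only the rows of the type of interest
Db1_new = Db1[Db1$Col3 == 'Type1', ]

# Filter Db2: keep the rows whose (Col1, Col2) pair occurs in the filtered Db1
# Drop duplicated (Col1, Col2) pairs first so each Db2 row is picked up only once
Db1_f = Db1_new[!duplicated(Db1_new[, -3]), ]
Db2_new = data.frame(Col1 = NULL, Col2 = NULL, Col3 = NULL)

# seq_len() avoids the 1:0 pitfall when Db1_f has no rows
for (i in seq_len(nrow(Db1_f))) {
  aux = Db2[Db2$Col1 == Db1_f$Col1[i] & Db2$Col2 == Db1_f$Col2[i], ]
  Db2_new = rbind(Db2_new, aux)
}


# Stack the filtered Db1 rows on top of the matching Db2 rows
rbind(Db1_new, Db2_new)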

#   Col1 Col2  Col3
#1    P1 2000 Type1
#4    P2 2000 Type1
#5    P2 2001 Type1
#11   P1 2000 Type4
#2    P1 2000 Type4
#41   P2 2000 Type4
#51   P2 2001 Type4
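
The loop above can also be written without explicit iteration. The following is only a sketch of one possible alternative, not part of the original answer: it uses base R's merge() as an inner join on Col1 and Col2, and wraps the steps in a helper so the same logic can be reused for Type2 and Type3. The name subset_by_type is hypothetical and introduced here for illustration.

```r
# Sample data from the question
Db1 <- read.table(text = "Col1 Col2 Col3
P1 2000 Type1
P1 2000 Type2
P1 2001 Type2
P2 2000 Type1
P2 2001 Type1
P3 2003 Type3", header = TRUE)

Db2 <- read.table(text = "Col1 Col2 Col3
P1 2000 Type4
P1 2000 Type4
P1 2001 Type4
P2 2000 Type4
P2 2001 Type4
P3 2003 Type4", header = TRUE)

# Hypothetical helper (not from the original answer)
subset_by_type <- function(db1, db2, type) {
  # Rows of db1 with the requested type
  db1_sub <- db1[db1$Col3 == type, ]
  # Distinct (Col1, Col2) pairs, so db2 rows are not multiplied by the join
  keys <- unique(db1_sub[, c("Col1", "Col2")])
  # Inner join: keeps only db2 rows whose (Col1, Col2) pair is in keys
  db2_sub <- merge(db2, keys, by = c("Col1", "Col2"))
  out <- rbind(db1_sub, db2_sub)
  rownames(out) <- NULL  # reset row names instead of keeping 1, 4, 5, 11, ...
  out
}

result <- subset_by_type(Db1, Db2, "Type1")
print(result)

# One data frame per type found in Db1, collected in a named list
types <- as.character(unique(Db1$Col3))
by_type <- lapply(setNames(types, types),
                  function(t) subset_by_type(Db1, Db2, t))
```

For Type1 this should give the same seven rows as the loop version (three Type1 rows followed by four Type4 rows), and by_type$Type2 or by_type$Type3 hold the corresponding results for the other types. Note that merge() sorts its result by the key columns by default, so the order of the Type4 rows may differ from their order in Db2 on other data.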
