Rprogramming>>以特定方式组合数据帧

时间:2016-06-28 13:20:49

标签: r

我有两个数据框

数据框1:

day <- c(0,3,6)
score <- c(5,11,17)
studentName <- c("Matt","Sam","Jeff")
state <- c("MA","MD","MO")
city <- c("Worcester","Silver Spring","Creve Couer")
zipCode <- c(41441,20865,61341)

dataFrame1 <- data.frame(day,score,studentName,state,city,zipCode)

数据框2:

day <- c(0,0.5,1,1.5,2,2.5,3,3.5,4,4.5,5,5.5,6)
score <- c(5,6,7,8,9,10,11,12,13,14,15,16,17)
studentName <- c(rep("Matt", 6),rep("Sam", 6),"Jeff")

dataFrame2 <- data.frame(day,score,studentName)

数据框3:

dataFrame3 <- merge(dataFrame1,dataFrame2, all = TRUE, by = c("studentName","day","score"))

Dataframe 3有NA&s;我想以编程方式实现dataframe4的输出。以下dataframe4的示例输出:

day <- c(0,0.5,1,1.5,2,2.5,3,3.5,4,4.5,5,5.5,6)
score <- c(5,6,7,8,9,10,11,12,13,14,15,16,17)
studentName <- c(rep("Matt", 6),rep("Sam", 6),"Jeff")
state <- c(rep("MA", 6),rep("MD", 6),"MO")
city <- c(rep("Worcester", 6),rep("Silver Spring", 6),"Creve Couer")
zipCode <- c(rep(41441, 6),rep(20865, 6),61341)

dataFrame4 <- data.frame(day,score,studentName,state,city,zipCode)

2 个答案:

答案 0 :(得分:0)

这样做的一种方法是使用sqldf

library(sqldf)
sqldf(" select a.*, b.state, b.city , b.zipCode from dataframe2 a , dataframe1 b where a.studentName = b.studentName ")

答案 1 :(得分:0)

df4 <- merge(dataFrame2, dataFrame1[,-c(1:2)], by="studentName", all.x = T)
df4[with(df4, order(day)), c(2,3,1,4:6)]