R - Assign values to multiple cells based on a combination of values

Time: 2015-05-05 20:19:05

Tags: r

I have the following data.frame, in which several X columns (X1, X2, X3 ... XN) are blank:

df1 <- data.frame( name = c("A","B","C"), 
                   X1 = c("","", ""), 
                   Y1 = c("aa","bb","cc"), 
                   Z1 = c("AA","BB","CC"),
                   X2 = c("","", ""), 
                   Y2 = c("dd","",""),
                   Z2 = c("AA","",""),
                   X3 = c("","", ""), 
                   Y3 = c("","","ee"), 
                   Z3 = c("","","CC"))

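A second data.frame holds the values that should be assigned to the X columns, based on the combination of values observed in the corresponding Y and Z columns: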

df2 <- data.frame( Y = c("aa","bb","cc","dd","ee"), 
                   Z = c("AA","BB","CC","AA","CC"),
                   X = c (1,2,3,4,5))

How can I fill in the X values in df1 using the information in df2, so that I end up with df3?

df3 <- data.frame( name = c("A","B","C"), 
                   X1 = c("1","2", "3"), 
                   Y1 = c("aa","bb","cc"), 
                   Z1 = c("AA","BB","CC"),
                   X2 = c("4","", ""), 
                   Y2 = c("dd","",""),
                   Z2 = c("AA","",""),
                   X3 = c("","", "5"), 
                   Y3 = c("","","ee"), 
                   Z3 = c("","","CC"))

Note that in my real dataset each name may, but does not necessarily, have more column sets (e.g., X1, Y1, Z1 ... X10, Y10, Z10).

1 answer:

Answer 0: (score: 2)

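This strategy reshapes your data from wide to long format, merges it with df2 on the Y and Z columns, and then reshapes everything back to wide.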

# go from wide to long
x1 <- reshape(df1, 
    varying=Map(function(x) paste0(x, 1:3), c("X","Y","Z")),
    v.names=c("X","Y","Z"),
    idvar="name",    
    timevar="time",
    direction="long")

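# drop the empty X column, then look up X in df2 by the (Y, Z) pair;
# all.x=TRUE keeps rows that have no match in df2
x2 <- merge(subset(x1, select=-X), df2, by=c("Y","Z"), all.x=TRUE)
# replace NA values with blanks (this turns X into a character column)
x2[is.na(x2$X),"X"] <- ""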

# go back to wide
x3 <- reshape(x2,idvar="name",direction="wide", sep="")

Then:

x3

  name Y1 Z1 X1 Y2 Z2 X2 Y3 Z3 X3
1    A aa AA  1 dd AA  4         
2    B bb BB  2                  
3    C cc CC  3          ee CC  5

Here the columns come back in a slightly different order, but you can easily fix that afterwards if needed.
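
One possible way to restore the original layout, sketched here rather than taken from the answer, is to index the result by the column names of df1 and to line the rows up by name. The data are those from the question. `stringsAsFactors = FALSE` is an assumption added so that the columns are plain character vectors.

```r
# Example data from the question (stringsAsFactors = FALSE is an assumption)
df1 <- data.frame(name = c("A","B","C"),
                  X1 = c("","",""), Y1 = c("aa","bb","cc"), Z1 = c("AA","BB","CC"),
                  X2 = c("","",""), Y2 = c("dd","",""),     Z2 = c("AA","",""),
                  X3 = c("","",""), Y3 = c("","","ee"),     Z3 = c("","","CC"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Y = c("aa","bb","cc","dd","ee"),
                  Z = c("AA","BB","CC","AA","CC"),
                  X = c(1,2,3,4,5),
                  stringsAsFactors = FALSE)

# Same wide -> long -> merge -> wide steps as in the answer
x1 <- reshape(df1,
    varying = Map(function(x) paste0(x, 1:3), c("X","Y","Z")),
    v.names = c("X","Y","Z"),
    idvar = "name",
    timevar = "time",
    direction = "long")
x2 <- merge(subset(x1, select = -X), df2, by = c("Y","Z"), all.x = TRUE)
x2[is.na(x2$X), "X"] <- ""
x3 <- reshape(x2, idvar = "name", direction = "wide", sep = "")

# Reorder rows to follow df1$name and columns to follow names(df1)
x3 <- x3[match(df1$name, x3$name), names(df1)]
rownames(x3) <- NULL
```

Selecting by `names(df1)` relies on the wide column names (X1, Y1, Z1, ...) being identical before and after the round trip, which holds here because `sep=""` is used when reshaping back.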

Note that there is one place where I hard-coded 1:3. If your columns repeat more times, adjust that vector accordingly.
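
If you would rather not hard-code the vector, one option is to derive the suffixes from the column names. The sketch below is not part of the answer above. It assumes every set is named X<n>, Y<n>, Z<n> with a numeric suffix and that every X<n> has matching Y<n> and Z<n> columns. `idx` is a name introduced only for this illustration.

```r
# Example data from the question (stringsAsFactors = FALSE is an assumption)
df1 <- data.frame(name = c("A","B","C"),
                  X1 = c("","",""), Y1 = c("aa","bb","cc"), Z1 = c("AA","BB","CC"),
                  X2 = c("","",""), Y2 = c("dd","",""),     Z2 = c("AA","",""),
                  X3 = c("","",""), Y3 = c("","","ee"),     Z3 = c("","","CC"),
                  stringsAsFactors = FALSE)

# Collect the numeric suffixes of the X columns instead of writing 1:3
x_cols <- grep("^X[0-9]+$", names(df1), value = TRUE)
idx <- sort(as.integer(sub("^X", "", x_cols)))

# Wide -> long, using the detected suffixes; passing times = idx keeps the
# original suffixes in the "time" column even if they are not 1, 2, 3, ...
x1 <- reshape(df1,
    varying = Map(function(x) paste0(x, idx), c("X","Y","Z")),
    v.names = c("X","Y","Z"),
    idvar = "name",
    timevar = "time",
    times = idx,
    direction = "long")
```

The merge and the reshape back to wide can then proceed as before.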