Replace missing values in one table with values from another table by joining column names to rows

Time: 2017-03-16 08:56:06

Tags: r missing-data

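I am trying to replace the missing values (shown as *) in one table with values from another table, by matching the column names of the first table to the rows of the second. Here is an example: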

DF1

A  B  C  D
1  3  4  *
4  *  5  9
0  *  2  *
1  2  *  4

DF2

Variable  Value
A  2
B  1
C  9
D  0

Resulting dataset:

A  B  C  D
1  3  4  0
4  1  5  9
0  1  2  0
1  2  9  4

4 answers:

Answer 0 (score: 4)

Another option, using stack and unstack (here df is the main table and df1 is the lookup table, as in the data below):

d1 <- stack(df)
d1$values[d1$values == '*'] <- df1$Value[match(d1$ind, df1$Variable)][d1$values == '*']
unstack(d1, values ~ ind)
#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4
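
The intermediate long format may be easier to follow on a smaller case. The following is only a sketch on made-up two-column data; the names `wide`, `lookup`, `long`, `fill` and `is_missing` are illustrative and not part of the answer.

```r
# Hypothetical miniature of the question's tables.
wide <- data.frame(A = c("1", "4"), B = c("3", "*"), stringsAsFactors = FALSE)
lookup <- data.frame(Variable = c("A", "B"), Value = c(2L, 1L),
                     stringsAsFactors = FALSE)

# stack() gives one row per cell: `values` holds the cell, `ind` its column name.
long <- stack(wide)

# For every cell, look up the replacement that belongs to its column.
fill <- lookup$Value[match(long$ind, lookup$Variable)]

# Overwrite only the cells that hold the "*" placeholder.
is_missing <- long$values == "*"
long$values[is_missing] <- fill[is_missing]

# Back to the wide layout, one column per level of `ind`.
result <- unstack(long, values ~ ind)
print(result)
```

The values stay character throughout, because the "*" placeholder forces the stacked column to be character.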

Data

dput(df)
structure(list(A = c(1, 4, 0, 1), B = c("3", "*", "*", "2"), 
    C = c("4", "5", "2", "*"), D = c("*", "9", "*", "4")), .Names = c("A", 
"B", "C", "D"), row.names = c(NA, -4L), class = "data.frame")

dput(df1)
structure(list(Variable = c("A", "B", "C", "D"), Value = c(2L, 
1L, 9L, 0L)), .Names = c("Variable", "Value"), row.names = c(NA, 
-4L), class = "data.frame")

Answer 1 (score: 2)

We can use Map:

df1[as.character(df2$Variable)] <- Map(function(x, y)
    replace(x, is.na(x), y), df1[as.character(df2$Variable)], df2$Value)
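
The is.na() variant applies when the placeholder has already been turned into NA. As a sketch, assuming the table comes from text where "*" marks a missing cell, read.table() can do that conversion through its na.strings argument; `raw_text`, `tbl` and `fills` are illustrative names.

```r
# Hypothetical raw input reproducing the question's table.
raw_text <- "A B C D
1 3 4 *
4 * 5 9
0 * 2 *
1 2 * 4"

# Treat "*" as NA while reading, so the columns are parsed as integers.
tbl <- read.table(text = raw_text, header = TRUE, na.strings = "*")

fills <- data.frame(Variable = c("A", "B", "C", "D"),
                    Value = c(2L, 1L, 9L, 0L),
                    stringsAsFactors = FALSE)

# Same Map idea as above: pair each column with its replacement value.
tbl[fills$Variable] <- Map(function(x, y) replace(x, is.na(x), y),
                           tbl[fills$Variable], fills$Value)
print(tbl)
```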

If the missing values are not NA but the literal string "*", then:

df1[as.character(df2$Variable)] <- Map(function(x, y)
    replace(x, x=="*", y), df1[as.character(df2$Variable)], df2$Value)
df1
#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4

If the columns of 'df1' are not character, convert them first:

df1[] <- as.matrix(df1)

Data

df1 <- structure(list(A = c(1L, 4L, 0L, 1L), B = c("3", "*", "*", "2"
 ), C = c("4", "5", "2", "*"), D = c("*", "9", "*", "4")), .Names = c("A", 
 "B", "C", "D"), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(Variable = c("A", "B", "C", "D"), Value = c(2L, 
 1L, 9L, 0L)), .Names = c("Variable", "Value"), class = "data.frame",
  row.names = c(NA, -4L))
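
With this data the columns that contained "*" are character, and they remain character after the replacement. If numeric columns are wanted afterwards, one possible follow-up is to run type.convert() over each column. This is a sketch on a made-up already-filled table; `filled` is an illustrative name.

```r
# Hypothetical result of a replacement: digits stored as character strings.
filled <- data.frame(A = c("1", "4"), B = c("3", "1"), stringsAsFactors = FALSE)

# type.convert() picks the simplest fitting type for each column;
# as.is = TRUE keeps non-numeric columns as character instead of factor.
filled[] <- lapply(filled, function(col) type.convert(col, as.is = TRUE))
str(filled)
```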

Answer 2 (score: 2)

Find the column names of the cells that contain "*", match them against the Variable column of df2, and extract the corresponding Value:

x <- which(df1=="*", arr.ind = TRUE)
df1[x] <- df2$Value[match(names(df1)[x[, 2]], df2$Variable)]

#  A B C D
#1 1 3 4 0
#2 4 1 5 9
#3 0 1 2 0
#4 1 2 9 4

This assumes the columns of df1 are character. If they are not, convert them with:

df1[] <- lapply(df1, as.character)
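
To see what the index matrix looks like, here is a sketch on a made-up 2 x 2 table. `toy`, `fills`, `idx` and `cols` are illustrative names.

```r
# Hypothetical miniature tables.
toy <- data.frame(A = c("1", "*"), B = c("*", "2"), stringsAsFactors = FALSE)
fills <- data.frame(Variable = c("A", "B"), Value = c(2L, 1L),
                    stringsAsFactors = FALSE)

# One row per "*" cell, with columns named "row" and "col".
idx <- which(toy == "*", arr.ind = TRUE)
print(idx)

# Translate each column index into a column name, then into its replacement.
cols <- names(toy)[idx[, "col"]]
toy[idx] <- fills$Value[match(cols, fills$Variable)]
print(toy)
```

Because the lookup goes through match() on the column name, the order of the rows in the lookup table does not matter here.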

Answer 3 (score: 1)

We can create a lookup table of the same shape as df1, then update the cells that match "*":

# make a lookup table same size as df1
df2Lookup <-
  matrix(rep(df2$Value, nrow(df1)), nrow = nrow(df1), byrow = TRUE)

# then update on "*"
df1[ df1 == "*" ] <- df2Lookup[ df1 == "*" ]

#result
df1
#   A B C D
# 1 1 3 4 0
# 2 4 1 5 9
# 3 0 1 2 0
# 4 1 2 9 4
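
The lookup matrix above relies on the rows of df2 being in the same order as the columns of df1. A sketch of one way to drop that assumption is to reorder the values with match() before building the matrix; `tbl`, `fills`, `ordered_values`, `lookup` and `mask` are illustrative names.

```r
# Hypothetical miniature tables; the lookup rows are deliberately shuffled.
tbl <- data.frame(A = c("1", "*"), B = c("*", "2"), stringsAsFactors = FALSE)
fills <- data.frame(Variable = c("B", "A"), Value = c(1L, 2L),
                    stringsAsFactors = FALSE)

# Put the replacement values in the column order of `tbl`.
ordered_values <- fills$Value[match(names(tbl), fills$Variable)]

# Repeat that row of values once per row of `tbl`.
lookup <- matrix(ordered_values, nrow = nrow(tbl), ncol = ncol(tbl),
                 byrow = TRUE)

# Copy values across only where the placeholder sits.
mask <- tbl == "*"
tbl[mask] <- lookup[mask]
print(tbl)
```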