Find and replace by column header and ID in R

Time: 2017-12-18 20:28:43

Tags: r dataframe

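I'm having trouble explaining this, but here is the problem I'm trying to solve:

For each respondent, wherever the text Open1 or Open2 appears in one of the Col columns, I want to replace it with that respondent's value from the Open1 or Open2 column. I feel this should have a simple solution, but I've been staring at it for a while and can't figure it out.

Current dataset: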

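ID | Col1 | Col2 | Col3 | Col4 | Col5 | Open1 | Open2

1 | be rich | buy home | pay edn | Open1 | Not worry | feel secure | care for parents

2 | buy home | be rich | Open1 | Open2 | pay medical expenses | give to causes | leave legacy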

What I want to achieve:

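ID | Col1 | Col2 | Col3 | Col4 | Col5 | Open1 | Open2

1 | be rich | buy home | pay edn | feel secure | Not worry | feel secure | care for parents

2 | buy home | be rich | give to causes | leave legacy | pay medical expenses | give to causes | leave legacy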

Data
Here is the data in dput format: df1 is the current dataset and df2 is the desired result.

df1 <-
structure(list(ID = c("1", "2"), Col1 = c("be rich", "buy home"
), Col2 = c("buy home", "be rich"), Col3 = c("pay edn", "Open1"
), Col4 = c("Open1", "Open2"), Col5 = c("Not worry", "pay medical expenses"
), Open1 = c("feel secure", "give to causes"), Open2 = c("care for parents", 
"leave legacy")), .Names = c("ID", "Col1", "Col2", "Col3", "Col4", 
"Col5", "Open1", "Open2"), row.names = c(NA, -2L), class = "data.frame")

df2 <-
structure(list(ID = c("1", "2"), Col1 = c("be rich", "buy home"
), Col2 = c("buy home", "be rich"), Col3 = c("pay edn", "give to causes"
), Col4 = c("feel secure", "leave legacy"), Col5 = c("Not worry", 
"pay medical expenses"), Open1 = c("feel secure", "give to causes"
), Open2 = c("care for parents", "leave legacy")), .Names = c("ID", 
"Col1", "Col2", "Col3", "Col4", "Col5", "Open1", "Open2"), row.names = c(NA, 
-2L), class = "data.frame")

3 answers:

Answer 0: (score: 0)

Not particularly elegant, but it works:

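library(tidyr)
library(dplyr)
library(reshape2)

# Trim stray whitespace from every column of the data provided.
df1[] <- lapply(df1, trimws)

# Reshape to long format: one row per (ID, ColN) cell,
# with Open1 and Open2 kept alongside as ordinary columns.
df_inter <- 
  df1 %>% 
  gather(col, value, 
         contains("Col"))

# Where a cell holds the name of another column (e.g. "Open1"),
# take that column's value from the same row.
for(i in seq_along(df_inter$value)){
  if (df_inter$value[i] %in% names(df_inter)){
    df_inter$value[i] <- df_inter[[df_inter$value[i]]][i]
  }
}

# Reshape back to wide format. Note that Open1 and Open2
# are not carried into the result.
df_inter %>% 
  dcast(ID ~ col,
        value.var = "value")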
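
As a hedged alternative to the loop, the same reshape, replace, reshape idea can be sketched with tidyr's `pivot_longer()` and `pivot_wider()` (available in tidyr 1.0.0 and later) together with `dplyr::case_when()`. This is an illustration, not the answerer's code. It assumes the placeholders are exactly the strings "Open1" and "Open2", and unlike the version above it keeps the Open1 and Open2 columns in the output.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with data.frame() for brevity.
df1 <- data.frame(
  ID    = c("1", "2"),
  Col1  = c("be rich", "buy home"),
  Col2  = c("buy home", "be rich"),
  Col3  = c("pay edn", "Open1"),
  Col4  = c("Open1", "Open2"),
  Col5  = c("Not worry", "pay medical expenses"),
  Open1 = c("feel secure", "give to causes"),
  Open2 = c("care for parents", "leave legacy"),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  # Long format: one row per (ID, ColN) cell; Open1/Open2 stay as columns.
  pivot_longer(starts_with("Col"), names_to = "col", values_to = "value") %>%
  # Assumption: a placeholder is a cell that equals "Open1" or "Open2" exactly.
  mutate(value = case_when(
    value == "Open1" ~ Open1,
    value == "Open2" ~ Open2,
    TRUE             ~ value
  )) %>%
  # Back to wide format, then restore the original column order.
  pivot_wider(names_from = col, values_from = value) %>%
  select(ID, starts_with("Col"), Open1, Open2)

print(result)
```

The result is a tibble; wrap it in `as.data.frame()` if a plain data frame is needed.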

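Answer 1: (score: 0)

There may be a simpler way, but the following does what you want.
First, I'll use the data from the question, which was posted in dput format. Note that it was created with stringsAsFactors = FALSE, so every column is character rather than factor.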

df1b <- df1    # work on a copy

df1b <- t(apply(df1b[-1], 1, function(x){
              x[grep("Open1", x)] <- x["Open1"]
              x
          }))
df1b <- t(apply(df1b, 1, function(x){
              x[grep("Open2", x)] <- x["Open2"]
              x
          }))

df1b <- as.data.frame(df1b, stringsAsFactors = FALSE)
df1b <- cbind(df1[1], df1b)

identical(df1b, df2)    # check the results, it works
#[1] TRUE

In your case, start with df1b <- df1 using the name of your real data frame in place of df1, and adjust the code accordingly.
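
The apply() approach converts the data frame to a matrix and back, and grep() matches the placeholder text anywhere inside a cell. A minimal sketch that stays within the data frame and uses exact equality instead might look like the following. The function name `fill_placeholders` and its arguments are made up for illustration, and the sketch assumes that every column other than ID and the placeholder columns should be scanned.

```r
# Sample data from the question, rebuilt with data.frame() for brevity.
df1 <- data.frame(
  ID    = c("1", "2"),
  Col1  = c("be rich", "buy home"),
  Col2  = c("buy home", "be rich"),
  Col3  = c("pay edn", "Open1"),
  Col4  = c("Open1", "Open2"),
  Col5  = c("Not worry", "pay medical expenses"),
  Open1 = c("feel secure", "give to causes"),
  Open2 = c("care for parents", "leave legacy"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each placeholder column name, find the cells in
# the other columns that equal that name and copy in the same row's value.
fill_placeholders <- function(df, placeholders = c("Open1", "Open2"),
                              id_col = "ID") {
  # Assumption: every column except the ID and the placeholders is a target.
  target_cols <- setdiff(names(df), c(id_col, placeholders))
  for (p in placeholders) {
    for (col in target_cols) {
      # Exact match only; NA cells are left untouched.
      hit <- !is.na(df[[col]]) & df[[col]] == p
      df[[col]][hit] <- df[[p]][hit]
    }
  }
  df
}

result <- fill_placeholders(df1)
print(result)
```

Because the columns are modified in place, the column types and row names of the input are left as they were.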

Answer 2: (score: 0)

Here is another option:

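# df1 is the data frame from the question; apply() walks over it row by row.
res <- t(apply(df1, 1, function(x) {
    x <- gsub("Open1", x["Open1"], x)
    gsub("Open2", x["Open2"], x)
}))
# apply() returns a character matrix, so convert it back to a data frame.
res <- as.data.frame(res, stringsAsFactors = FALSE)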
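
One caveat with pattern-based replacement: `gsub("Open1", ...)` matches the text anywhere inside a cell, so a longer value that merely contains "Open1" would be rewritten too. The sketch below shows the difference on a single made-up row (the value "Open10 units" is hypothetical and does not appear in the question's data) and anchors the pattern with `^` and `$` so that only whole-cell matches are replaced.

```r
# One illustrative row as a named character vector, as apply() would pass it.
# "Open10 units" is a made-up value used only to show the pitfall.
x <- c(Col1 = "Open1", Col2 = "Open10 units", Open1 = "feel secure")

# Unanchored pattern: replaces "Open1" wherever it occurs in a cell.
loose <- gsub("Open1", x["Open1"], x)

# Anchored pattern: replaces only cells that consist of "Open1" and nothing else.
strict <- gsub("^Open1$", x["Open1"], x)

print(loose)
print(strict)
```

With the question's data both forms give the same result, since no cell contains a placeholder as part of a longer string. Note also that gsub() treats backslashes in the replacement text specially, which could matter if the open-ended answers contain them.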