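Create a variable based on multiple similar variables in R

Time: 2018-06-29 14:20:51

Tags: r loops if-statement

My data looks like this (there are variables zipid1 to zipid13, and the variable hospid ranges from 1 to 13):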

  zipid1 zipid2 zipid3 zipid4 zipid5 zipid6 zipid7 zipid8 zipid9 zipid10 zipid11 zipid12 zipid13 hospid local
1      0      0      0      0      1      0      0      0      0       0       0       0       0      5     0
2      0      0      1      0      1      0      0      0      0       0       0       0       0      5     0
3      0      0      0      0      0      0      1      0      0       0       0       0       0      5     0
4      0      0      1      0      0      0      0      0      0       0       0       0       0      5     0
5      0      0      1      0      1      0      0      0      0       0       0       0       0      5     0
6      0      0      0      0      1      0      0      0      0       0       0       0       0      5     0

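How can I create a variable local that equals 1 when zipid1 == 1 & hospid == 1, zipid2 == 1 & hospid == 2, and so on, and equals 0 otherwise (that is, local is 1 when the zipid matches the hospid)?

I tried ifelse, but it did not work well.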

for (i in 1:13) {
name = paste0("zipid", i)
local$local <- with(local, ifelse(name == 1 & hospid == i, 1, 0))
}

Thanks!

3 answers:

Answer 0 (score: 1)

Here is an idea:

df$local <- unlist(lapply(1:nrow(df), function(x) df[x, paste("zipid", df$hospid, sep = "")[x]]))

which gives

#   zipid1 zipid2 zipid3 zipid4 zipid5 zipid6 zipid7 zipid8 zipid9 zipid10 zipid11 zipid12 zipid13 hospid local
# 1      0      0      0      0      1      0      0      0      0       0       0       0       0      5     1
# 2      0      0      1      0      1      0      0      0      0       0       0       0       0      5     1
# 3      0      0      0      0      0      0      1      0      0       0       0       0       0      5     0
# 4      0      0      1      0      0      0      0      0      0       0       0       0       0      5     0
# 5      0      0      1      0      1      0      0      0      0       0       0       0       0      5     1
# 6      0      0      0      0      1      0      0      0      0       0       0       0       0      5     1

The way it works is that I take the value of hospid in each row and paste it onto zipid to build a column name such as zipid5. I then look up the value in that specific column for that specific row and check whether it is 1.

If the data frame contains NA values, you can remove them with na.omit, for example by calling na.omit on the data frame before running the code above.
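The same row-by-row lookup can also be written without lapply, using matrix indexing. This is a minimal sketch, not the answerer's code: it assumes a data frame named df whose columns zipid1 to zipid13 are numeric and whose hospid values are integers from 1 to 13; the sample data is rebuilt by hand to mirror the question.

```r
# Hypothetical sample data mirroring the question: 13 indicator columns plus hospid.
df <- as.data.frame(matrix(0, nrow = 6, ncol = 13))
names(df) <- paste0("zipid", 1:13)
df$zipid5[c(1, 2, 5, 6)] <- 1
df$zipid3[c(2, 4, 5)] <- 1
df$zipid7[3] <- 1
df$hospid <- 5

# Put the zipid columns into a matrix, ordered zipid1 ... zipid13,
# so that column number k corresponds to hospid == k.
zip_cols <- paste0("zipid", 1:13)
m <- as.matrix(df[zip_cols])

# A two-column index matrix (row, column) picks one cell per row:
# for row r it reads the column numbered df$hospid[r].
df$local <- m[cbind(seq_len(nrow(df)), df$hospid)]

df$local
# [1] 1 1 0 0 1 1
```

The index matrix pairs each row number with that row's hospid, so the whole lookup happens in one vectorized step. It relies on the zipid columns being selected in numeric order, which the zip_cols vector enforces.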

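Answer 1 (score: 1)

The problem is that the column names zipid1, zipid2, etc. carry payload data, namely the number.

My suggestion is to reshape the data from wide to long format, extract the number from the column names, match it against hospid, aggregate by id, and finally join the result with the original wide format.

Aggregation uses toString() so that a valid result is obtained even if more than one match occurs.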

library(data.table)
# reshape from wide to long format
melt(setDT(DT), id.vars = c("id", "hospid"), variable.name = "zipid")[
  # turn column names into integer
  , zipid := as.integer(stringr::str_replace(zipid, "zipid", ""))][
    # if value is 1 and zipid and hospid do match then store number
    value == 1L & zipid == hospid, local := hospid][
      # aggregate only matching entries by id
      !is.na(local), .(local = toString(local)), by = id][
        # right join with original data
        DT, on = "id"][
          # change column order to meet OP's expectation
          , setcolorder(.SD, names(DT))]
   id zipid1 zipid2 zipid3 zipid4 zipid5 zipid6 zipid7 zipid8 zipid9 zipid10 zipid11 zipid12 zipid13 hospid local
1:  1      0      0      0      0      1      0      0      0      0       0       0       0       0      5     5
2:  2      0      0      1      0      1      0      0      0      0       0       0       0       0      5     5
3:  3      0      0      0      0      0      0      1      0      0       0       0       0       0      5  <NA>
4:  4      0      0      1      0      0      0      0      0      0       0       0       0       0      5  <NA>
5:  5      0      0      1      0      1      0      0      0      0       0       0       0       0      5     5
6:  6      0      0      0      0      1      0      0      0      0       0       0       0       0      5     5

Edit

By reshaping, the relevant information in DT can be condensed to

melt(setDT(DT), id.vars = c("id", "hospid"), variable.name = "zipid")[
  , zipid := as.integer(stringr::str_replace(zipid, "zipid", ""))][
    value == 1L]
   id hospid zipid value
1:  2      5     3     1
2:  4      5     3     1
3:  5      5     3     1
4:  1      5     5     1
5:  2      5     5     1
6:  5      5     5     1
7:  6      5     5     1
8:  3      5     7     1

The result is given by

melt(setDT(DT), id.vars = c("id", "hospid"), variable.name = "zipid")[
  , zipid := as.integer(stringr::str_replace(zipid, "zipid", ""))][
    value == 1L][
      zipid == hospid]
   id hospid zipid value
1:  1      5     5     1
2:  2      5     5     1
3:  5      5     5     1
4:  6      5     5     1

So, to combine this with the original data object, we can do an update on join:

tmp <- 
  melt(setDT(DT), id.vars = c("id", "hospid"), variable.name = "zipid")[
    , zipid := as.integer(stringr::str_replace(zipid, "zipid", ""))][
      value == 1L & zipid == hospid]
DT[tmp, on = "id", local := value][]
   id zipid1 zipid2 zipid3 zipid4 zipid5 zipid6 zipid7 zipid8 zipid9 zipid10 zipid11 zipid12 zipid13 hospid local
1:  1      0      0      0      0      1      0      0      0      0       0       0       0       0      5     1
2:  2      0      0      1      0      1      0      0      0      0       0       0       0       0      5     1
3:  3      0      0      0      0      0      0      1      0      0       0       0       0       0      5    NA
4:  4      0      0      1      0      0      0      0      0      0       0       0       0       0      5    NA
5:  5      0      0      1      0      1      0      0      0      0       0       0       0       0      5     1
6:  6      0      0      0      0      1      0      0      0      0       0       0       0       0      5     1

This gives the expected output. No aggregation is needed.
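Note that rows without a match are left as NA by the update join, whereas the question asks for 0. A minimal sketch of filling those in afterwards, assuming the data.table package is installed; the small DT below is a hypothetical stand-in for the joined result, not the full data:

```r
library(data.table)

# Hypothetical stand-in for DT after the update join:
# matched rows hold 1, unmatched rows hold NA.
DT <- data.table(id = 1:6, hospid = 5L, local = c(1L, 1L, NA, NA, 1L, 1L))

# Replace the NAs left by the join with 0, modifying DT by reference.
DT[is.na(local), local := 0L]

DT$local
# [1] 1 1 0 0 1 1
```

Only the rows selected by is.na(local) are touched, so the values written by the join stay as they are.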

Data

library(data.table)
DT <- fread("id        zipid1 zipid2 zipid3 zipid4 zipid5 zipid6 zipid7 zipid8 zipid9 zipid10 zipid11 zipid12 zipid13 hospid local
1      0      0      0      0      1      0      0      0      0       0       0       0       0      5     0
2      0      0      1      0      1      0      0      0      0       0       0       0       0      5     0
3      0      0      0      0      0      0      1      0      0       0       0       0       0      5     0
4      0      0      1      0      0      0      0      0      0       0       0       0       0      5     0
5      0      0      1      0      1      0      0      0      0       0       0       0       0      5     0
6      0      0      0      0      1      0      0      0      0       0       0       0       0      5     0", drop = "local")

Answer 2 (score: -1)

name is a character vector, and in this context it is interpreted as a string rather than as a variable. Try ifelse(get(name) == 1 & ...
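Looking the column up by name is only part of the fix: the loop in the question also overwrites local on every pass, so only the comparison for i = 13 would survive. A minimal sketch of a loop that accumulates matches instead, assuming a data frame named df (a hypothetical name) with columns zipid1 to zipid13 and hospid; it uses df[[name]] in place of get(name) to fetch the column:

```r
# Hypothetical sample data mirroring the question: 13 indicator columns plus hospid.
df <- as.data.frame(matrix(0, nrow = 6, ncol = 13))
names(df) <- paste0("zipid", 1:13)
df$zipid5[c(1, 2, 5, 6)] <- 1
df$zipid3[c(2, 4, 5)] <- 1
df$zipid7[3] <- 1
df$hospid <- 5

# Start with 0 everywhere, then switch rows to 1 as matches are found.
df$local <- 0
for (i in 1:13) {
  name <- paste0("zipid", i)
  # Rows where the i-th indicator is set and the hospital id is also i;
  # which() drops any NA comparisons.
  hit <- which(df[[name]] == 1 & df$hospid == i)
  df$local[hit] <- 1
}

df$local
# [1] 1 1 0 0 1 1
```

Because local is initialized once and only ever set to 1 inside the loop, a match found at one value of i is not erased by later iterations.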