Pivot rows to columns, but with a condition on the PASSFAIL column, in R

Time: 2015-08-04 20:34:50

Tags: r dataframe pivot reshape tidyr

I have a data frame like this:

NUM <- c(1,2,3,1,2,3,1,2,3,1)
ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points1 <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)
Points2 <- c(19.2,0.8,2.9,12.7,188.7,114.1,7.2,66.1,46.7,508.4)
PASSFAIL <- c("PASS", "PASS", "FAIL", "PASS", "FAIL", "PASS", "PASS", "FAIL", "FAIL", "FAIL")

df1 <- data.frame(ID,NUM,Type,Points1,Points2,PASSFAIL)

df1:

    ID NUM Type Points1 Points2 PASSFAIL
1  DJ45   1    A     9.2    19.2     PASS
2  DJ45   2    F    60.8     0.8     PASS
3  DJ45   3    C    22.9     2.9     FAIL
4  DJ46   1    B  1012.7    12.7     PASS
5  DJ46   2    D    18.7   188.7     FAIL
6  DJ46   3    A    11.1   114.1     PASS
7  DJ47   1    E    67.2     7.2     PASS
8  DJ47   2    C    63.1    66.1     FAIL
9  DJ47   3    F    16.7    46.7     FAIL
10 DJ48   1    D    58.4   508.4     FAIL

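I'm trying to get an output that spreads the values of the "Type" column into individual columns, with a condition on the PASSFAIL column: if PASSFAIL = PASS, use the value from the Points1 column; if PASSFAIL = FAIL, use the value from the Points2 column.

My desired output is: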

    ID   NUM   A    B    C    D    E    F    PASSFAIL 
1  DJ45   1    9.2  NA   NA   NA   NA   NA     PASS
2  DJ45   2    NA   NA   NA   NA   NA   60.8   PASS
3  DJ45   3    NA   NA   2.9  NA   NA   NA     FAIL
4  DJ46   1    NA 1012.7 NA   NA   NA   NA     PASS
5  DJ46   2    NA   NA   NA  188.7 NA   NA     FAIL
6  DJ46   3    11.1 NA   NA   NA   NA   NA     PASS
7  DJ47   1    NA   NA   NA   NA   67.2 NA     PASS
8  DJ47   2    NA   NA   66.1 NA   NA   NA     FAIL
9  DJ47   3    NA   NA   NA   NA   NA   46.7   FAIL
10 DJ48   1    NA   NA   NA  508.4 NA   NA     FAIL

I tried the following, but I can't figure out how to apply my condition in this code.

library(dplyr)
library(tidyr)

df1 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = sequence(n())) %>%
  spread(Type, Points1)

Can someone help me?

1 Answer:

Answer 0: (score: 3)

One way, with reshape2:

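library(reshape2)
# Take Points1 for PASS rows and Points2 for FAIL rows
df1$pick <- ifelse(df1$PASSFAIL == "PASS", df1$Points1, df1$Points2)
# One row per ID/NUM pair, one column per Type
newdf <- dcast(df1, ID+NUM~Type, value.var="pick")
data.frame(newdf, PASSFAIL=df1[,"PASSFAIL"])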
#      ID NUM    A      B    C     D    E    F PASSFAIL
# 1  DJ45   1  9.2     NA   NA    NA   NA   NA     PASS
# 2  DJ45   2   NA     NA   NA    NA   NA 60.8     PASS
# 3  DJ45   3   NA     NA  2.9    NA   NA   NA     FAIL
# 4  DJ46   1   NA 1012.7   NA    NA   NA   NA     PASS
# 5  DJ46   2   NA     NA   NA 188.7   NA   NA     FAIL
# 6  DJ46   3 11.1     NA   NA    NA   NA   NA     PASS
# 7  DJ47   1   NA     NA   NA    NA 67.2   NA     PASS
# 8  DJ47   2   NA     NA 66.1    NA   NA   NA     FAIL
# 9  DJ47   3   NA     NA   NA    NA   NA 46.7     FAIL
# 10 DJ48   1   NA     NA   NA 508.4   NA   NA     FAIL

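It helps to create a new column holding the values that will be placed into the "wide" format. We use ifelse to select Points1 for "PASS" and Points2 for "FAIL". dcast is then called with the ID and NUM columns as identifiers, spreading the values across the "Type" column. Finally, the PASSFAIL column is appended to the reshaped output. Note that this last step matches rows by position, so it relies on df1 already being sorted by ID and NUM, as it is here.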
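
Since the question is tagged tidyr, the same idea can also be sketched with dplyr and tidyr. This is an alternative to the reshape2 answer, not a restatement of it. It assumes reasonably recent package versions (I believe pivot_wider with names_sort needs tidyr 1.1.0 or later, and relocate needs dplyr 1.0.0 or later). The names pick and wide are just illustrative. Keeping PASSFAIL as an identifier column during the pivot avoids having to append it by row position afterwards.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  ID = c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48"),
  NUM = c(1,2,3,1,2,3,1,2,3,1),
  Type = c("A","F","C","B","D","A","E","C","F","D"),
  Points1 = c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4),
  Points2 = c(19.2,0.8,2.9,12.7,188.7,114.1,7.2,66.1,46.7,508.4),
  PASSFAIL = c("PASS","PASS","FAIL","PASS","FAIL","PASS","PASS","FAIL","FAIL","FAIL"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  # Conditional value: Points1 for PASS rows, Points2 for FAIL rows
  mutate(pick = ifelse(PASSFAIL == "PASS", Points1, Points2)) %>%
  # Drop the raw point columns so they are not treated as identifiers
  select(ID, NUM, PASSFAIL, Type, pick) %>%
  # One column per Type; ID, NUM and PASSFAIL identify each output row
  pivot_wider(names_from = Type, values_from = pick, names_sort = TRUE) %>%
  # Move PASSFAIL to the end to match the desired layout
  relocate(PASSFAIL, .after = last_col())

print(wide)
```

The result is a tibble rather than a plain data frame; wrapping it in as.data.frame() should give the same kind of object as the reshape2 version.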